
Marcus Hutter: Universal Artificial Intelligence, AIXI, and AGI | Lex Fridman Podcast #75



link |
00:00:00.000
The following is a conversation with Marcus Hutter, Senior Research Scientist at Google DeepMind.
link |
00:00:06.000
Throughout his career of research, including with Jürgen Schmidhuber and Shane Legg,
link |
00:00:11.000
he has proposed a lot of interesting ideas in and around the field of artificial general intelligence,
link |
00:00:17.000
including the development of the AIXI, spelled A I X I, model,
link |
00:00:22.000
which is a mathematical approach to AGI that incorporates ideas of
link |
00:00:27.000
Kolmogorov complexity, Solomonoff induction, and reinforcement learning.
link |
00:00:33.000
In 2006, Marcus launched the 50,000 Euro Hutter Prize for Lossless Compression of Human Knowledge.
link |
00:00:41.000
The idea behind this prize is that the ability to compress well is closely related to intelligence.
link |
00:00:48.000
This, to me, is a profound idea.
link |
00:00:51.000
Specifically, if you can compress the first 100 megabytes or 1 gigabyte of Wikipedia better than your predecessors,
link |
00:00:58.000
your compressor likely has to also be smarter.
link |
00:01:02.000
The intention of this prize is to encourage the development of intelligent compressors as a path to AGI.
link |
00:01:09.000
In conjunction with his podcast release just a few days ago,
link |
00:01:13.000
Marcus announced a 10x increase in several aspects of this prize, including the money,
link |
00:01:19.000
to 500,000 Euros.
link |
00:01:22.000
The better your compressor works, relative to the previous winners, the higher fraction of that prize money is awarded to you.
link |
00:01:29.000
You can learn more about it if you Google simply Hutter Prize.
link |
00:01:34.000
I'm a big fan of benchmarks for developing AI systems,
link |
00:01:38.000
and the Hutter Prize may indeed be one that will spark some good ideas for approaches
link |
00:01:43.000
that will make progress on the path of developing AGI systems.
link |
00:01:47.000
This is the Artificial Intelligence Podcast.
link |
00:01:50.000
If you enjoy it, subscribe on YouTube, give it 5 stars on Apple Podcasts,
link |
00:01:54.000
support it on Patreon, or simply connect with me on Twitter at Lex Fridman, spelled F R I D M A N.
link |
00:02:02.000
As usual, I'll do one or two minutes of ads now and never any ads in the middle that can break the flow of the conversation.
link |
00:02:09.000
I hope that works for you and doesn't hurt the listening experience.
link |
00:02:13.000
This show is presented by Cash App, the number one finance app in the App Store.
link |
00:02:17.000
When you get it, use code LEX Podcast.
link |
00:02:21.000
Cash App lets you send money to friends, buy Bitcoin, and invest in the stock market with as little as $1.
link |
00:02:27.000
Brokerage services are provided by Cash App Investing, a subsidiary of Square, and member SIPC.
link |
00:02:34.000
Since Cash App allows you to send and receive money digitally, peer to peer, security in all digital transactions is very important.
link |
00:02:42.000
Let me mention the PCI Data Security Standard that Cash App is compliant with.
link |
00:02:48.000
I'm a big fan of standards for safety and security.
link |
00:02:52.000
PCI DSS is a good example of that, where a bunch of competitors got together and agreed that there needs to be a global standard around the security of transactions.
link |
00:03:02.000
Now, we just need to do the same for autonomous vehicles and AI systems in general.
link |
00:03:08.000
So again, if you get Cash App from the App Store or Google Play and use the code LEX Podcast, you'll get $10.
link |
00:03:16.000
And Cash App will also donate $10 to FIRST, one of my favorite organizations that is helping to advance robotics and STEM education for young people around the world.
link |
00:03:27.000
And now, here's my conversation with Marcus Hutter.
link |
00:03:32.000
Do you think of the universe as a computer or maybe an information processing system?
link |
00:03:37.000
Let's go with a big question first.
link |
00:03:39.000
Okay, with a big question first.
link |
00:03:41.000
I think it's a very interesting hypothesis or idea.
link |
00:03:45.000
And I have a background in physics.
link |
00:03:48.000
So I know a little bit about physical theories, the standard model of particle physics and general relativity theory.
link |
00:03:54.000
And they are amazing and describe virtually everything in the universe.
link |
00:03:58.000
And they're all, in a sense, computable theories.
link |
00:04:00.000
I mean, they're very hard to compute.
link |
00:04:02.000
And they're very elegant, simple theories which describe virtually everything in the universe.
link |
00:04:07.000
So there's a strong indication that somehow the universe is computable.
link |
00:04:15.000
But it's a plausible hypothesis.
link |
00:04:17.000
So what do you think, just like you said, general relativity, quantum field theory,
link |
00:04:22.000
why do you think the laws of physics are so nice and beautiful and simple and compressible?
link |
00:04:28.000
Do you think our universe was designed, or is naturally this way?
link |
00:04:34.000
Are we just focusing on the parts that are especially compressible?
link |
00:04:39.000
Do human minds just enjoy something about that simplicity?
link |
00:04:43.000
And in fact, there's other things that are not so compressible.
link |
00:04:46.000
No, I strongly believe and I'm pretty convinced that the universe is inherently beautiful, elegant and simple
link |
00:04:53.000
and described by these equations.
link |
00:04:55.000
And we're not just picking that.
link |
00:04:57.000
I mean, if there were some phenomena which cannot be neatly described, scientists would try to describe them.
link |
00:05:04.000
And there's biology, which is more messy, but we understand that it's an emergent phenomenon.
link |
00:05:09.000
And they're complex systems, but they still follow the same rules of quantum electrodynamics.
link |
00:05:14.000
All of chemistry follows that and we know that.
link |
00:05:16.000
I mean, we cannot compute everything because we have limited computational resources.
link |
00:05:20.000
No, I think it's not a bias of the humans, but it's objectively simple.
link |
00:05:24.000
I mean, of course, you never know, maybe there's some corners very far out in the universe
link |
00:05:28.000
or super, super tiny below the nucleus of atoms or parallel universes
link |
00:05:36.000
which are not nice and simple, but there's no evidence for that.
link |
00:05:40.000
And we should apply Occam's razor and choose the simplest theory consistent with it.
link |
00:05:45.000
But although it's a little bit self referential.
link |
00:05:48.000
So maybe a quick pause.
link |
00:05:49.000
What is Occam's razor?
link |
00:05:51.000
So Occam's razor says that you should not multiply entities beyond necessity,
link |
00:05:57.000
which sort of if you translate it to proper English means and, you know,
link |
00:06:02.000
in the scientific context means that if you have two theories or hypotheses or models
link |
00:06:06.000
which equally well describe the phenomenon you're studying or the data,
link |
00:06:11.000
you should choose the more simple one.
link |
00:06:13.000
So that's just the principle?
link |
00:06:15.000
Yes.
link |
00:06:16.000
Sort of that's not like a provable law perhaps.
link |
00:06:20.000
We'll kind of discuss it and think about it.
link |
00:06:23.000
But what's the intuition of why the simpler answer is the one that is likely
link |
00:06:30.000
to be more correct descriptor of whatever we're talking about?
link |
00:06:34.000
I believe that Occam's razor is probably the most important principle in science.
link |
00:06:40.000
I mean, of course, we need logical deduction and we do experimental design.
link |
00:06:44.000
But science is about understanding the world, finding models of the world.
link |
00:06:51.000
And we can come up with crazy complex models which, you know,
link |
00:06:54.000
explain everything but predict nothing.
link |
00:06:56.000
But the simple models seem to have predictive power, and it's a valid question.
link |
00:07:02.000
Why?
link |
00:07:03.000
And there are two answers to that. One is you can just accept it:
link |
00:07:07.000
That is the principle of science and we use this principle and it seems to be successful.
link |
00:07:12.000
We don't know why, but it just happens to be.
link |
00:07:15.000
Or you can try, you know, to find another principle which explains Occam's razor.
link |
00:07:21.000
And if we start with the assumption that the world is governed by simple rules,
link |
00:07:27.000
then there's a bias towards simplicity.
link |
00:07:31.000
And applying Occam's razor is the mechanism to finding these rules.
link |
00:07:37.000
And actually, in a more quantitative sense, and we come back to that later
link |
00:07:40.000
in terms of Solomonoff induction, you can rigorously prove that if you assume
link |
00:07:44.000
that the world is simple, then Occam's razor is the best you can do in a certain sense.
link |
00:07:49.000
So I apologize for the romanticized question, but why do you think outside of its effectiveness,
link |
00:07:56.000
why do we, do you think we find simplicity so appealing as human beings?
link |
00:08:00.000
Why does it just, why does E equals MC squared seem so beautiful to us humans?
link |
00:08:08.000
I guess mostly, in general, many things can be explained by an evolutionary argument.
link |
00:08:15.000
And, you know, there's some artifacts in humans which, you know, are just artifacts
link |
00:08:19.000
and not evolutionarily necessary.
link |
00:08:21.000
But with this beauty and simplicity, it's, I believe, at least the core is about,
link |
00:08:31.000
like science, finding regularities in the world, understanding the world,
link |
00:08:36.000
which is necessary for survival, right?
link |
00:08:38.000
You know, if I look at a bush, right, and I just see noise and there is a tiger, right,
link |
00:08:44.000
and eats me, then I'm dead.
link |
00:08:45.000
But if I try to find a pattern, and we know that humans are prone to find more patterns
link |
00:08:52.000
in data than there are, you know, like the Mars face and all these things,
link |
00:08:57.000
but this bias towards finding patterns, even if they are not there, but I mean,
link |
00:09:02.000
it's best, of course, if they are, yeah, helps us for survival.
link |
00:09:06.000
Yeah, that's fascinating.
link |
00:09:07.000
I haven't really thought about that. I thought I just loved science,
link |
00:09:11.000
but indeed, purely in terms of survival purposes,
link |
00:09:16.000
There is an evolutionary argument for why we find the work of Einstein so beautiful.
link |
00:09:24.000
Maybe a quick small tangent.
link |
00:09:26.000
Could you describe what Solomonoff induction is?
link |
00:09:30.000
Yeah, so that's a theory which I claim, and Ray Solomonoff sort of claimed a long time ago,
link |
00:09:37.000
that this solves the big philosophical problem of induction.
link |
00:09:42.000
And I believe the claim is essentially true.
link |
00:09:45.000
And what he does is the following.
link |
00:09:47.000
So, okay, for the picky listener, induction can be interpreted narrowly and widely. Narrowly
link |
00:09:56.000
means inferring models from data, and widely means also then using these models
link |
00:10:03.000
for doing predictions, so prediction is also part of the induction.
link |
00:10:06.000
So I'm a little sloppy sort of with the terminology, and maybe that comes from Ray Solomonoff,
link |
00:10:12.000
you know, being sloppy, maybe I shouldn't say that.
link |
00:10:15.000
He can't complain anymore.
link |
00:10:17.000
So let me explain a little bit this theory in simple terms.
link |
00:10:22.000
So assume you have a data sequence, make it very simple.
link |
00:10:25.000
The simplest one, say one, one, one, one, one, and you've seen a hundred ones.
link |
00:10:28.000
What do you think comes next?
link |
00:10:30.000
The natural answer I'm going to speed up a little bit.
link |
00:10:32.000
The natural answer is of course, you know, one.
link |
00:10:35.000
Okay.
link |
00:10:36.000
And the question is why?
link |
00:10:37.000
Okay.
link |
00:10:38.000
Well, we see a pattern there.
link |
00:10:40.000
Yeah.
link |
00:10:41.000
Okay.
link |
00:10:42.000
There's a one and we repeat it.
link |
00:10:43.000
And why should it suddenly after a hundred ones be different?
link |
00:10:45.000
So what we're looking for is simple explanations or models for the data we have.
link |
00:10:50.000
And now the question is, a model has to be presented in a certain language. Which language should be used in science?
link |
00:10:58.000
We want formal languages and we can use mathematics or we can use programs on a computer.
link |
00:11:03.000
So abstractly on a Turing machine, for instance, or it can be a general purpose computer.
link |
00:11:08.000
So there are of course lots of models. You can say maybe it's a hundred ones and then a hundred zeros and a hundred ones.
link |
00:11:14.000
That's a model, right?
link |
00:11:15.000
But there are simpler models.
link |
00:11:17.000
There's the model: print one in a loop.
link |
00:11:19.000
That also explains the data.
link |
00:11:21.000
And if you push that to the extreme, you are looking for the shortest program, which if you run this program reproduces the data you have, it will not stop.
link |
00:11:32.000
It will continue naturally.
link |
00:11:34.000
And this you take for your prediction.
link |
00:11:36.000
And on the sequence of ones, it's very plausible, right?
link |
00:11:39.000
That print one loop is the shortest program.
link |
00:11:41.000
We can give some more complex examples like one, two, three, four, five.
link |
00:11:45.000
What comes next?
link |
00:11:46.000
The shortest program is again, you know, a counter.
link |
00:11:48.000
And so that is, roughly speaking, how Solomonoff induction works.
link |
00:11:53.000
The extra twist is that it can also deal with noisy data.
link |
00:11:57.000
So if you have, for instance, a coin flip, say a biased coin, which comes up head with 60 percent probability, then it will predict.
link |
00:12:06.000
It will learn and figure this out.
link |
00:12:08.000
And after a while, it predicts, oh, the next coin flip will be heads with probability 60 percent.
link |
00:12:13.000
So it's the stochastic version of that.
link |
00:12:15.000
The goal, the dream, is always the search for the short program.
link |
00:12:18.000
Yes.
link |
00:12:19.000
Yeah.
link |
00:12:20.000
Well, in Solomonoff induction, precisely what you do is you combine.
link |
00:12:23.000
So looking for the shortest program is like applying Occam's razor, like looking for the simplest theory.
link |
00:12:29.000
There's also Epicurus' principle, which says if you have multiple hypotheses which equally well describe your data, don't discard any of them.
link |
00:12:36.000
Keep all of them around.
link |
00:12:37.000
You never know.
link |
00:12:38.000
And you can put that together and say, okay, I have a bias towards simplicity, but I don't rule out the larger models.
link |
00:12:45.000
And technically what we do is we weigh the shorter models higher and the longer models lower.
link |
00:12:52.000
And you use Bayesian techniques.
link |
00:12:54.000
You have a prior, which is precisely two to the minus the complexity of the program.
link |
00:13:02.000
And you weigh all these hypotheses, take this mixture, and then you also get this stochasticity in.
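A minimal sketch of that weighting scheme, with a tiny hand-picked hypothesis class standing in for the space of all programs (the description lengths and coin models below are illustrative assumptions, not Hutter's actual construction):

```python
# Toy Bayesian mixture with a simplicity prior: each hypothesis gets an a priori
# weight of 2^(-description length in bits), then is reweighted by its likelihood.
hypotheses = [
    {"name": "always 1",  "length_bits": 3, "p_one": 1.0},
    {"name": "fair coin", "length_bits": 5, "p_one": 0.5},
    {"name": "60% ones",  "length_bits": 9, "p_one": 0.6},
]

def posterior(bits, hyps):
    """Posterior weight of each hypothesis: prior 2^(-length) times likelihood."""
    weights = []
    for h in hyps:
        prior = 2.0 ** -h["length_bits"]
        likelihood = 1.0
        for b in bits:
            likelihood *= h["p_one"] if b == 1 else 1.0 - h["p_one"]
        weights.append(prior * likelihood)
    total = sum(weights)
    return [w / total for w in weights]

def predict_next_one(bits, hyps):
    """Mixture prediction: average the models' predictions by their posteriors."""
    return sum(p * h["p_one"] for p, h in zip(posterior(bits, hyps), hyps))

print(predict_next_one([1] * 100, hypotheses))                       # essentially 1.0
print(predict_next_one([1, 1, 0, 1, 0, 1, 1, 1, 0, 1], hypotheses))  # about 0.51: the simpler fair-coin model still dominates on ten bits
```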
link |
00:13:07.000
Yeah.
link |
00:13:08.000
Like many of your ideas, that's just a beautiful idea of weighing based on the simplicity of the program.
link |
00:13:13.000
I love that.
link |
00:13:14.000
That seems to me maybe a very human centric concept seems to be a very appealing way of discovering good programs in this world.
link |
00:13:24.000
You've used the term compression quite a bit.
link |
00:13:28.000
I think it's a beautiful idea.
link |
00:13:30.000
Sort of, we just talked about simplicity and maybe science or just all of our intellectual pursuits is basically the attempt to compress the complexity all around us into something simple.
link |
00:13:43.000
So what does this word mean to you?
link |
00:13:48.000
Compression.
link |
00:13:49.000
I essentially have already explained it.
link |
00:13:52.000
So compression means for me finding short programs for the data or the phenomenon at hand.
link |
00:14:00.000
You could interpret it more widely as finding simple theories which can be mathematical theories or maybe even informal like just in words.
link |
00:14:09.000
Compression means finding short descriptions explanations programs for the data.
link |
00:14:15.000
Do you see science as a kind of our human attempt at compression?
link |
00:14:22.000
So we're speaking more generally, because when you say programs, you're kind of zooming in on a particular, sort of almost computer science, artificial intelligence focus.
link |
00:14:30.000
But do you see all of human endeavor as a kind of compression?
link |
00:14:34.000
Well, at least all of science I see as an endeavor of compression, not all of humanity, maybe.
link |
00:14:40.000
And well, there are also some other aspects of science like experimental design, right?
link |
00:14:45.000
I mean, we create experiments specifically to get extra knowledge and this is, that isn't part of the decision making process.
link |
00:14:53.000
But once we have the data to understand the data is essentially compression.
link |
00:14:59.000
So I don't see any difference between compression, understanding and prediction.
link |
00:15:06.000
So we're jumping around topics a little bit, but returning back to simplicity, a fascinating concept of Kolmogorov complexity.
link |
00:15:14.000
So in your sense, do most objects in our mathematical universe have high Kolmogorov complexity? And maybe, first of all, what is Kolmogorov complexity?
link |
00:15:26.000
Okay, Kolmogorov complexity is a notion of simplicity or complexity.
link |
00:15:31.000
And it takes the compression view to the extreme.
link |
00:15:36.000
So I explained before that if you have some data sequence, just think about a file on a computer, essentially just a string of bits.
link |
00:15:45.000
And if you, and we have data compressors, like we compress big files into zip files with certain compressors.
link |
00:15:53.000
And you can also produce self extracting archives, that means as an executable, if you run it, it reproduces your original file without needing an extra decompressor.
link |
00:16:02.000
It's just a decompressor plus the archive together in one.
link |
00:16:06.000
And now they're better and worse compressors and you can ask what is the ultimate compressor.
link |
00:16:11.000
So what is the shortest possible self extracting archive you could produce for a certain data set, which reproduces the data set? And the length of this is called the Kolmogorov complexity.
link |
00:16:22.000
And arguably, that is the information content in the data set.
link |
00:16:27.000
I mean, if the data set is very redundant or very boring, you can compress it very well.
link |
00:16:31.000
So the information content should be low.
link |
00:16:34.000
And, you know, it is low according to this definition.
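A computable stand-in for that idea, using an off-the-shelf compressor (which only gives an upper bound on the true, uncomputable Kolmogorov complexity; exact byte counts depend on the compressor):

```python
import os
import zlib

boring = b"A" * 100_000          # highly redundant data
noise = os.urandom(100_000)      # incompressible with overwhelming probability

# zlib only approximates from above: tiny output for the redundant file,
# essentially no compression for the random one.
print(len(zlib.compress(boring)))   # on the order of a hundred bytes
print(len(zlib.compress(noise)))    # slightly over 100,000 bytes
```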
link |
00:16:36.000
So the length of the shortest program that summarizes the data.
link |
00:16:40.000
Yes.
link |
00:16:41.000
And what's your sense of our universe when we think about the different objects in our universe that we try concepts or whatever at every level.
link |
00:16:55.000
Do they have high or low Kolmogorov complexity?
link |
00:16:58.000
So what's the hope?
link |
00:17:00.000
Do we have a lot of hope in being able to summarize much of our world?
link |
00:17:05.000
That's a tricky and difficult question.
link |
00:17:08.000
So as I said before, I believe that the whole universe based on the evidence we have is very simple.
link |
00:17:16.000
So it has a very short description.
link |
00:17:18.000
So to linger on that, the whole universe, what does that mean?
link |
00:17:24.000
Do you mean at the very basic fundamental level in order to create the universe?
link |
00:17:28.000
Yes.
link |
00:17:29.000
Yeah.
link |
00:17:30.000
So you need a very short program, and you run it to get the thing going, and then it will reproduce our universe.
link |
00:17:37.000
There's a problem with noise.
link |
00:17:39.000
We can come back to that later, possibly.
link |
00:17:42.000
Is noise a problem or is it a bug or a feature?
link |
00:17:46.000
I would say it makes our life as a scientist really, really much harder.
link |
00:17:52.000
I mean, think about it without noise, we wouldn't need all of the statistics.
link |
00:17:56.000
But then maybe we wouldn't feel like there's free will.
link |
00:17:59.000
Maybe we need that for the...
link |
00:18:01.000
This is an illusion that noise can give you free will.
link |
00:18:05.000
At least in that way, it's a feature.
link |
00:18:06.000
But also, if you don't have noise, you have chaotic phenomena which are effectively like noise.
link |
00:18:12.000
So we can't get away with statistics even then.
link |
00:18:15.000
I mean, think about rolling a dice and forget about quantum mechanics and you know exactly how you throw it.
link |
00:18:21.000
But I mean, it's still so hard to compute the trajectory that effectively it is best to model it as coming out with a number, with probability 1 over 6.
link |
00:18:31.000
But from this sort of philosophical...
link |
00:18:36.000
complexity perspective, if we didn't have noise, then arguably you could describe the whole universe as the standard model plus general relativity.
link |
00:18:47.000
I mean, we don't have a theory of everything yet, but sort of assuming we are close to it or have it.
link |
00:18:52.000
Plus the initial conditions which may hopefully be simple.
link |
00:18:55.000
And then you just run it and then you would reproduce the universe.
link |
00:18:58.000
But that's spoiled by noise or by chaotic systems or by initial conditions which may be complex.
link |
00:19:06.000
So now, if we don't take the whole universe but just a subset, you know, just take planet Earth.
link |
00:19:13.000
Planet Earth cannot be compressed into a couple of equations.
link |
00:19:17.000
This is a hugely complex system.
link |
00:19:19.000
So interesting.
link |
00:19:20.000
So when you look at the window, like the whole thing might be simple, but when you just take a small window, then...
link |
00:19:26.000
It may become complex and that may be counterintuitive, but there's a very nice analogy.
link |
00:19:31.000
The book, the library of all books.
link |
00:19:34.000
So imagine you have a normal library with interesting books and you go there, great.
link |
00:19:38.000
Lots of information and huge, quite complex.
link |
00:19:41.000
So now I create a library which contains all possible books, say, of 500 pages.
link |
00:19:47.000
So the first book just has AAAA over all the pages.
link |
00:19:50.000
The next book AAAA and ends with B.
link |
00:19:52.000
And so on.
link |
00:19:53.000
I create this library of all books.
link |
00:19:55.000
It's a short program which creates this library.
link |
00:19:57.000
So this library which has all books has zero information content.
link |
00:20:01.000
And you take a subset of this library and suddenly you have a lot of information in there.
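A toy version of that library-of-all-books argument, shrunk to a two-letter alphabet and four-character "books" so it actually runs (the sizes are arbitrary illustrative choices):

```python
from itertools import product

ALPHABET = "AB"
BOOK_LENGTH = 4

def all_books():
    """The whole 'library of all books' is generated by this tiny program."""
    for chars in product(ALPHABET, repeat=BOOK_LENGTH):
        yield "".join(chars)

library = list(all_books())        # 2**4 = 16 books; the generator is a few lines,
                                   # so the full library carries almost no information
favourites = ["ABBA", "BAAB"]      # naming a particular subset is where information
                                   # enters: you must specify which books you picked
print(len(library), favourites)
```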
link |
00:20:05.000
So that's fascinating.
link |
00:20:06.000
I think one of the most beautiful mathematical objects that at least today seems to be understudied or under talked about is cellular automata.
link |
00:20:15.000
What lessons do you draw from sort of the Game of Life or cellular automata,
link |
00:20:19.000
Where you start with the simple rules just like you're describing with the universe and somehow complexity emerges.
link |
00:20:26.000
Do you feel like you have an intuitive grasp on the behavior, the fascinating behavior of such systems where some, like you said, some chaotic behavior could happen.
link |
00:20:37.000
Some complexity could emerge.
link |
00:20:39.000
Some, it could die out in some very rigid structures.
link |
00:20:43.000
Do you have a sense about cellular automata that somehow transfers maybe to the bigger questions of our universe?
link |
00:20:51.000
The cellular automata, and especially Conway's Game of Life, are really great because the rule is so simple.
link |
00:20:56.000
You can explain it to every child and even by hand you can simulate a little bit.
link |
00:21:00.000
And you see these beautiful patterns emerge, and people have proven that it's even Turing complete.
link |
00:21:07.000
You cannot just use a computer to simulate the Game of Life, but you can also use the Game of Life to simulate any computer.
link |
00:21:13.000
That is truly amazing.
link |
00:21:16.000
And it's the prime example probably to demonstrate that very simple rules can lead to very rich phenomena.
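The rule he is referring to really is only a few lines long; a minimal sketch of one update step (the wrap-around edges and the glider demo are my own simplifying choices):

```python
def life_step(grid):
    """One step of Conway's Game of Life on a 2D list of 0/1 cells, edges wrapping."""
    rows, cols = len(grid), len(grid[0])
    nxt = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbours = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            # A dead cell with exactly 3 neighbours is born; a live cell with 2 or 3 survives.
            nxt[r][c] = 1 if neighbours == 3 or (grid[r][c] == 1 and neighbours == 2) else 0
    return nxt

# A glider on a 6x6 grid; iterating life_step makes it crawl diagonally forever.
glider = [[0] * 6 for _ in range(6)]
for r, c in [(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)]:
    glider[r][c] = 1
print(life_step(glider))
```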
link |
00:21:25.000
And people, you know, sometimes ask, you know, how can, how is chemistry and biology so rich?
link |
00:21:30.000
I mean, this can't be based on simple rules.
link |
00:21:32.000
But no, we know quantum electrodynamics describes all of chemistry and we come later back to that.
link |
00:21:39.000
I claim intelligence can be explained or described in one single equation, this very rich phenomenon.
link |
00:21:45.000
You asked also about whether, you know, I understand this phenomenon, and probably not.
link |
00:21:53.000
And there's this saying that you never really understand things, you just get used to them.
link |
00:21:58.000
And I think I'm pretty used to cellular automata.
link |
00:22:03.000
So you believe that you understand now why this phenomenon happens.
link |
00:22:07.000
But I give you a different example.
link |
00:22:09.000
I didn't play too much with Conway's Game of Life, but a little bit more with fractals and with the Mandelbrot set.
link |
00:22:16.000
And it's beautiful, you know, the patterns. Just look at the Mandelbrot set.
link |
00:22:20.000
And, well, when the computers were really slow and I just had a black and white monitor and programmed my own programs in Assembler too.
link |
00:22:29.000
Assembler, wow.
link |
00:22:31.000
Wow, you're legit.
link |
00:22:33.000
To get these fractals on the screen, and I was mesmerized, and much later.
link |
00:22:37.000
So I returned to this, you know, every couple of years and then I tried to understand what is going on and you can understand a little bit.
link |
00:22:45.000
I tried to derive the locations, you know, there are these circles and the apple shape.
link |
00:22:53.000
And then you have smaller Mandelbrot sets recursively in this set.
link |
00:22:59.000
And there's a way to mathematically by solving high order polynomials to figure out where these centers are and what size they are approximately.
link |
00:23:08.000
And by sort of mathematically approaching this problem, you slowly get a feeling of why things are like they are.
link |
00:23:18.000
And that sort of is, you know, a first step to understanding why this rich phenomenon appears.
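The program behind those pictures is short; a minimal escape-time sketch (resolution, window, and iteration cap are arbitrary choices):

```python
def mandelbrot_ascii(width=60, height=24, max_iter=40):
    """Coarse ASCII rendering of the Mandelbrot set via the escape-time test."""
    for row in range(height):
        line = ""
        for col in range(width):
            # Map the character grid onto a window of the complex plane.
            c = complex(-2.2 + 3.0 * col / width, -1.2 + 2.4 * row / height)
            z = 0j
            for _ in range(max_iter):
                z = z * z + c
                if abs(z) > 2:      # escaped: c is outside the set
                    line += " "
                    break
            else:
                line += "#"         # never escaped within max_iter: treat as inside
        print(line)

mandelbrot_ascii()
```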
link |
00:23:25.000
Do you think it's possible? What's your intuition?
link |
00:23:27.000
Do you think it's possible to reverse engineer and find the short program that generated these fractals by looking at the fractals?
link |
00:23:36.000
Well, in principle, yes.
link |
00:23:38.000
So, I mean, in principle, what you can do is you take, you know, any data set, you know, you take these fractals or you take whatever your data set, whatever you have.
link |
00:23:47.000
It's a picture of Converse Game of Life.
link |
00:23:50.000
And you run through all programs: you take programs of size one, two, three, four, and you run them all in parallel in so-called dovetailing fashion.
link |
00:23:58.000
Give them computational resources, the first one 50 percent, the second one half of the remaining, and so on, and let them run.
link |
00:24:05.000
Wait until they halt, give an output, compare it to your data.
link |
00:24:09.000
And if some of these programs produce the correct data, then you stop and then you have already some program.
link |
00:24:14.000
It may be a long program because it's faster.
link |
00:24:16.000
And then you continue and you get shorter and shorter programs until you eventually find the shortest program.
link |
00:24:22.000
The interesting thing you can never know whether it's the shortest program because there could be an even shorter program which is just even slower.
link |
00:24:30.000
And you just have to wait. But asymptotically, and actually after a finite time, you have the shortest program.
link |
00:24:36.000
So this is a theoretical but completely impractical way of finding the underlying structure in every data set, and that is what Solomonoff induction does and what Kolmogorov complexity is about.
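A cartoon of that search, with three hand-written generators standing in for the enumeration of all programs (real dovetailing interleaves every program with geometrically shrinking time shares, which is what makes it computable but hopelessly impractical; the bit lengths here are made up):

```python
from itertools import islice

def ones():                 # "print 1 in a loop"
    while True:
        yield 1

def counter():              # 1, 2, 3, 4, ...
    n = 1
    while True:
        yield n
        n += 1

def alternating():          # 1, 0, 1, 0, ...
    while True:
        yield 1
        yield 0

# (assumed description length in bits, name, generator)
programs = [(8, "print 1 in a loop", ones),
            (12, "counter", counter),
            (10, "alternate 1,0", alternating)]

def shortest_explanation(data, programs):
    """Keep the shortest candidate whose output prefix reproduces the data."""
    best = None
    for length, name, prog in programs:
        if list(islice(prog(), len(data))) == list(data):
            if best is None or length < best[0]:
                best = (length, name)
    return best

print(shortest_explanation([1, 1, 1, 1, 1], programs))   # (8, 'print 1 in a loop')
print(shortest_explanation([1, 2, 3, 4, 5], programs))   # (12, 'counter')
```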
link |
00:24:50.000
In practice, of course, we have to approach the problem more intelligently.
link |
00:24:53.000
And then if you take resource limitations into account, there's, for instance, the field of pseudo random numbers.
link |
00:25:03.000
And these are random numbers.
link |
00:25:05.000
So these are deterministic sequences, but no algorithm which is fast, where fast means runs in polynomial time, can detect that they're actually deterministic.
link |
00:25:15.000
So we can produce interesting, I mean, random numbers, maybe not that interesting, but just an example.
link |
00:25:20.000
We can produce complex looking data and we can then prove that no fast algorithm can detect the underlying pattern.
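A small illustration of such a deterministic sequence (a plain linear congruential generator; the hardness claim he makes is about cryptographic pseudorandom generators, which an LCG is not, so this only shows the "deterministic but random-looking" part):

```python
def lcg_bits(seed, n, a=1664525, c=1013904223, m=2**32):
    """Fully deterministic sequence that looks like coin flips to the naked eye."""
    x, out = seed, []
    for _ in range(n):
        x = (a * x + c) % m
        out.append((x >> 16) & 1)   # take one bit from the middle of the state
    return out

print(lcg_bits(seed=42, n=32))
```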
link |
00:25:31.000
Which is unfortunate. That's a big challenge for our search for simple programs in the space of artificial intelligence, perhaps.
link |
00:25:42.000
Yes, it definitely is, also in artificial intelligence, and it's quite surprising that, I can't say it's easy.
link |
00:25:48.000
I mean, physicists worked really hard to find these theories, but apparently it was possible for human minds to find these simple rules in the universe.
link |
00:25:57.000
It could have been different, right?
link |
00:25:59.000
It could have been different.
link |
00:26:00.000
It's awe inspiring.
link |
00:26:04.000
So let me ask another absurdly big question.
link |
00:26:08.000
What is intelligence in your view?
link |
00:26:13.000
So I have, of course, a definition.
link |
00:26:17.000
I wasn't sure what you're going to say, because you could have just as easy said, I have no clue.
link |
00:26:21.000
Which many people would say, but I'm not modest in this question.
link |
00:26:27.000
So the informal version, which I worked out together with Shane Legg, who co-founded DeepMind, is that intelligence measures an agent's ability to perform well in a wide range of environments.
link |
00:26:43.000
So that doesn't sound very impressive.
link |
00:26:47.000
But these words have been very carefully chosen and there is a mathematical theory behind that.
link |
00:26:53.000
And we come back to that later.
link |
00:26:55.000
And if you look at this definition by itself, it seems like, yeah, okay, but it seems a lot of things are missing.
link |
00:27:03.000
But if you think it through, then you realize that most and I claim all of the other traits, at least of rational intelligence, which we usually associate with intelligence, are emergent phenomena from this definition.
link |
00:27:18.000
Like, you know, creativity, memorization, planning, knowledge.
link |
00:27:22.000
You need all of that in order to perform well in a wide range of environments.
link |
00:27:27.000
So you don't have to explicitly mention that in a definition.
link |
00:27:30.000
Interesting.
link |
00:27:31.000
So yeah, so consciousness, abstract reasoning, all these kinds of things are just emergent phenomena that help you towards... can you say the definition again?
link |
00:27:42.000
So multiple environments.
link |
00:27:44.000
Did you mention the word goals?
link |
00:27:46.000
No, but we have an alternative definition instead of performing well, you can just replace it by goals.
link |
00:27:51.000
So intelligence measures an agent's ability to achieve goals in a wide range of environments.
link |
00:27:56.000
That's more or less equal.
link |
00:27:57.000
Interesting, because in there there's an injection of the word goals, so we want to specify there should be a goal.
link |
00:28:03.000
Yeah, but perform well is sort of what does it mean?
link |
00:28:06.000
It's the same problem.
link |
00:28:07.000
Yeah.
link |
00:28:08.000
There's a little bit gray area, but it's much closer to something that could be formalized.
link |
00:28:13.000
In your view, are humans, where do humans fit into that definition?
link |
00:28:18.000
Are they general intelligence systems that are able to perform in like, how good are they at fulfilling that definition at performing well in multiple environments?
link |
00:28:31.000
Yeah, that's a big question.
link |
00:28:33.000
I mean, the humans are performing best among all species.
link |
00:28:37.000
Species we know, we know of.
link |
00:28:40.000
Depends.
link |
00:28:41.000
You could say that trees and plants are doing a better job.
link |
00:28:44.000
They'll probably outlast us.
link |
00:28:46.000
Yeah, but they're in a much more narrow environment, right?
link |
00:28:49.000
I mean, you just have a little bit of air pollution and these trees die, and we can adapt, right?
link |
00:28:54.000
We build houses, we build filters, we do geoengineering.
link |
00:28:59.000
So the multiple environment part.
link |
00:29:01.000
Yeah, that is very important.
link |
00:29:02.000
Yeah.
link |
00:29:03.000
So they distinguish narrow intelligence from wide intelligence, also in the AI research.
link |
00:29:08.000
So let me ask the Alan Turing question: can machines think, can machines be intelligent?
link |
00:29:16.000
So in your view, I have to kind of ask, the answer is probably yes, but I want to kind of hear with your thoughts on it.
link |
00:29:24.000
Can machines be made to fulfill this definition of intelligence, to achieve intelligence?
link |
00:29:30.000
Well, we are sort of getting there and, you know, on a small scale, we are already there.
link |
00:29:36.000
The wide range of environments are missing.
link |
00:29:39.000
But we have self driving cars, we have programs to play go and chess, we have speech recognition.
link |
00:29:44.000
So it's pretty amazing, but you can, you know, these are narrow environments.
link |
00:29:49.000
But if you look at AlphaZero, that was also developed by DeepMind.
link |
00:29:54.000
I mean, got famous with AlphaGo and then came AlphaZero a year later.
link |
00:29:57.000
That was truly amazing.
link |
00:29:59.000
So it's a reinforcement learning algorithm, which is able, just by self play, to play chess and then also Go.
link |
00:30:08.000
And I mean, yes, they're both games, but they're quite different games.
link |
00:30:11.000
And, you know, you don't feed them the rules of the game.
link |
00:30:15.000
And the most remarkable thing, which is still a mystery to me that usually for any decent chess program, I don't know much about Go,
link |
00:30:22.000
you need opening books and endgame tables and so on. And nothing was put in there.
link |
00:30:29.000
Especially with AlphaZero, the self play mechanism starting from scratch, being able to learn actually new strategies.
link |
00:30:38.000
Yeah, it really discovered, you know, all these famous openings within four hours by itself.
link |
00:30:46.000
What I was really happy about, I'm a terrible chess player, but I like the Queen's Gambit.
link |
00:30:50.000
And AlphaZero figured out that this is the best opening.
link |
00:30:54.000
Finally, somebody proved you correct.
link |
00:30:59.000
So yes, to answer your question, yes, I believe that general intelligence is possible.
link |
00:31:04.000
And it also depends how you define it.
link |
00:31:08.000
Do you say AGI, artificial general intelligence, only refers to achieving human level? Or a sub human level,
link |
00:31:17.000
but quite broad, is that also general intelligence, or do we have to distinguish, or is it only super human intelligence that is general artificial intelligence?
link |
00:31:25.000
Is there a test in your mind like the Turing test for natural language or some other test that would impress the heck out of you
link |
00:31:32.000
that would kind of cross the line of your sense of intelligence within the framework that you said?
link |
00:31:40.000
Well, the Turing test has been criticized a lot, but I think it's not as bad as some people think.
link |
00:31:46.000
Some people think it's too strong, so it tests not just for a system to be intelligent,
link |
00:31:52.000
but it also has to fake human deception.
link |
00:31:56.000
Deception, which is much harder.
link |
00:31:59.000
And on the other hand, they say it's too weak because it just maybe fakes emotions or intelligent behavior.
link |
00:32:07.000
It's not real, but I don't think that's the problem or big problem.
link |
00:32:12.000
So if you would pass the Turing test, so a conversation over a terminal with a bot for an hour,
link |
00:32:20.000
or maybe a day or so, and you can fool a human into not knowing whether this is a human or not,
link |
00:32:26.000
so that's the Turing test, I would be truly impressed.
link |
00:32:30.000
And we have these annual competitions, the Loebner Prize.
link |
00:32:34.000
And I mean, it started with Eliza, that was the first conversational program.
link |
00:32:38.000
And what is it called in Japanese, Mitsuku or so, that's the winner of the last couple of years.
link |
00:32:44.000
It's quite impressive.
link |
00:32:46.000
Yeah, it's quite impressive.
link |
00:32:47.000
And then Google has developed Meena, right?
link |
00:32:50.000
Just recently, that's an open domain conversational bot, just a couple of weeks ago, I think.
link |
00:32:57.000
Yeah, I kind of like the metric that sort of the Alexa Prize has proposed.
link |
00:33:01.000
I mean, maybe it's obvious to you, it wasn't to me of setting sort of a length of a conversation.
link |
00:33:07.000
You want the bot to be sufficiently interesting that you'd want to keep talking to it for like 20 minutes.
link |
00:33:13.000
And that's a surprisingly effective aggregate metric.
link |
00:33:19.000
Because nobody has the patience to be able to talk to a bot that's not interesting
link |
00:33:27.000
and intelligent and witty and is able to go into different tangents, jump domains,
link |
00:33:32.000
be able to say something interesting to maintain your attention.
link |
00:33:36.000
Maybe many humans will also fail this test.
link |
00:33:39.000
Unfortunately, we set, just like with autonomous vehicles with chatbots,
link |
00:33:45.000
we also set a bar that's way too hard to reach.
link |
00:33:48.000
I said the Turing test is not as bad as some people believe.
link |
00:33:51.000
But what is really not useful about the Turing test, it gives us no guidance
link |
00:33:57.000
how to develop these systems in the first place.
link |
00:34:00.000
Of course, we can develop them by trial and error and do whatever and then run the test
link |
00:34:05.000
and see whether it works or not.
link |
00:34:07.000
But a mathematical definition of intelligence gives us an objective
link |
00:34:16.000
which we can then analyze by theoretical tools or computational
link |
00:34:21.000
and maybe even prove how close we are.
link |
00:34:25.000
And we will come back to that later with the AIXI model.
link |
00:34:29.000
I mentioned the compression, right?
link |
00:34:31.000
So in language processing, they have achieved amazing results.
link |
00:34:36.000
And one way to test this, of course, you take the system, you train it
link |
00:34:40.000
and then you see how well it performs on the task.
link |
00:34:43.000
But a lot of performance measurement is done by so called perplexity,
link |
00:34:49.000
which is essentially the same as complexity or compression length.
link |
00:34:53.000
So the NLP community develops new systems and then they measure the compression length
link |
00:34:57.000
and then they have rankings and leaderboards, because there's a strong correlation
link |
00:35:02.000
between compressing well and then the system performing well at the task at hand.
link |
00:35:07.000
It's not perfect, but it's good enough for them as an intermediate aim.
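The link he mentions can be made concrete: the log loss a language model assigns to text equals the code length (in bits) an ideal compressor built from that model would need, and perplexity is two to the power of the per-symbol code length. A small sketch, with a made-up uniform character model standing in for a real one:

```python
import math

def bits_and_perplexity(text, prob_next):
    """prob_next(prefix, ch) -> model probability of the next character.
    Returns (total code length in bits, perplexity = 2**(bits per character))."""
    total_bits = sum(-math.log2(prob_next(text[:i], ch)) for i, ch in enumerate(text))
    return total_bits, 2 ** (total_bits / len(text))

uniform_27 = lambda prefix, ch: 1.0 / 27      # hypothetical toy model over 27 symbols
print(bits_and_perplexity("hello world", uniform_27))  # ~4.75 bits/char, perplexity 27.0
```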
link |
00:35:14.000
So you mean measure... so this is kind of almost returning to the Kolmogorov complexity.
link |
00:35:19.000
So you're saying good compression usually means good intelligence.
link |
00:35:24.000
Yes.
link |
00:35:26.000
So you mentioned you're one of the only people who dared boldly
link |
00:35:33.000
to try to formalize the idea of artificial general intelligence,
link |
00:35:38.000
to have a mathematical framework for intelligence,
link |
00:35:42.000
just like as we mentioned, termed AIXI, A I X I.
link |
00:35:49.000
So let me ask the basic question, what is AIXI?
link |
00:35:54.000
Okay, so let me first say what it stands for.
link |
00:35:58.000
What it stands for, actually, that's probably the more basic question.
link |
00:36:01.000
The first question is usually how it's pronounced,
link |
00:36:04.000
but finally I put it on the website, how it's pronounced.
link |
00:36:07.000
You figured it out.
link |
00:36:10.000
The name comes from AI, artificial intelligence,
link |
00:36:13.000
and the X I is the Greek letter XI,
link |
00:36:16.000
which I used for Solomonoff's distribution, for quite stupid reasons,
link |
00:36:22.000
which I'm not willing to repeat here in front of camera.
link |
00:36:27.000
So it just happened to be more or less arbitrary, I chose the XI.
link |
00:36:31.000
But it also has nice other interpretations.
link |
00:36:35.000
So there are actions and perceptions in this model,
link |
00:36:38.000
where an agent has actions and perceptions, and over time.
link |
00:36:42.000
So this is A index I, X index I.
link |
00:36:45.000
So there's an action at time I, and then followed by a perception at time I.
link |
00:36:49.000
We'll go with that. I'll edit out the first part.
link |
00:36:52.000
I'm just kidding.
link |
00:36:53.000
I have some more interpretations.
link |
00:36:55.000
So at some point, maybe five years ago or 10 years ago,
link |
00:36:59.000
I discovered in Barcelona, it was on a big church.
link |
00:37:04.000
There was a stone engraved, some text,
link |
00:37:08.000
and the word AIXI appeared there a couple of times.
link |
00:37:12.000
I was very surprised and happy about that.
link |
00:37:17.000
And I looked it up, so it is Catalan language,
link |
00:37:20.000
and it means with some interpretation,
link |
00:37:22.000
that's it, that's the right thing to do.
link |
00:37:25.000
So it's almost like destined, somehow came to you in a dream.
link |
00:37:32.000
And similarly, there's a Chinese word, aixi, also written like AIXI,
link |
00:37:35.000
if you transcribe it to Pinyin.
link |
00:37:37.000
And the final one is that is AI, crossed with induction,
link |
00:37:41.000
because that is, and it's going more to the content now.
link |
00:37:44.000
So good old fashioned AI is more about planning
link |
00:37:47.000
and known deterministic world,
link |
00:37:49.000
and induction is more about, often, you know,
link |
00:37:51.000
IID data and inferring models,
link |
00:37:53.000
and essentially what this AIXI model does is combine these two.
link |
00:37:57.000
And I actually also recently, I think,
link |
00:37:59.000
heard that in Japanese, AI means love.
link |
00:38:02.000
So if you can combine XI somehow with that,
link |
00:38:06.000
I think we can, there might be some interesting ideas there.
link |
00:38:10.000
So AIXI, let's then take the next step.
link |
00:38:13.000
So maybe talk at the big level of what is this mathematical framework.
link |
00:38:20.000
Yeah, so it consists essentially of two parts.
link |
00:38:23.000
One is the learning and induction and prediction part,
link |
00:38:27.000
and the other one is the planning part.
link |
00:38:29.000
So let's come first to the learning induction prediction part,
link |
00:38:33.000
which essentially I explained already before.
link |
00:38:36.000
So what we need for any agent to act well
link |
00:38:41.000
is that it can somehow predict what happens.
link |
00:38:44.000
I mean, if you have no idea what your actions do,
link |
00:38:47.000
how can you decide which actions are good or not?
link |
00:38:49.000
So you need to have some model of what effect your actions have.
link |
00:38:53.000
So what you do is you have some experience.
link |
00:38:56.000
You build models like scientists, you know, of your experience.
link |
00:38:59.000
Then you hope these models are roughly correct,
link |
00:39:01.000
and then you use these models for prediction.
link |
00:39:04.000
And a model is, sorry, to interrupt,
link |
00:39:06.000
and a model is based on your perception of the world,
link |
00:39:08.000
how your actions will affect that world.
link |
00:39:10.000
That's not the important part.
link |
00:39:14.000
It is technically important,
link |
00:39:16.000
but at this stage we can just think about predicting,
link |
00:39:18.000
say, stock market data,
link |
00:39:20.000
weather data or IQ sequences,
link |
00:39:22.000
one, two, three, four, five, what comes next, yeah?
link |
00:39:24.000
So of course our actions affect what we're doing,
link |
00:39:28.000
but I'll come back to that in a second.
link |
00:39:30.000
And I'll keep just interrupting.
link |
00:39:32.000
So just to draw a line between prediction and planning,
link |
00:39:36.000
what do you mean by prediction in this way?
link |
00:39:40.000
It's trying to predict the environment
link |
00:39:43.000
without your long term action in the environment.
link |
00:39:46.000
What is prediction?
link |
00:39:48.000
Okay, if you want to put the actions in now,
link |
00:39:50.000
okay, then let's put it in now, yeah?
link |
00:39:53.000
We don't have to put them now.
link |
00:39:55.000
Scratch that, dumb question.
link |
00:39:57.000
Okay, so the simplest form of prediction is
link |
00:40:00.000
that you just have data which you passively observe,
link |
00:40:04.000
and you want to predict what happens without interfering.
link |
00:40:08.000
As I said, weather forecasting, stock market, IQ sequences,
link |
00:40:12.000
or just anything, okay?
link |
00:40:16.000
And Solomonoff's theory of induction is based on compression,
link |
00:40:19.000
so you look for the shortest program
link |
00:40:21.000
which describes your data sequence,
link |
00:40:23.000
and then you take this program, run it,
link |
00:40:25.000
which reproduces your data sequence by definition,
link |
00:40:27.000
and then you let it continue running,
link |
00:40:29.000
and then it will produce some predictions,
link |
00:40:31.000
and you can rigorously prove that for any prediction task,
link |
00:40:37.000
this is essentially the best possible predictor.
link |
00:40:40.000
Of course, if there's a prediction task,
link |
00:40:43.000
or a task which is unpredictable, like, you know,
link |
00:40:46.000
you have fair coin flips, yeah?
link |
00:40:48.000
I cannot predict the next fair coin flip.
link |
00:40:50.000
What Solomonoff does is say, okay, the next head has probability 50 percent.
link |
00:40:52.000
It's the best you can do.
link |
00:40:54.000
So if something is unpredictable, Solomonoff will also not
link |
00:40:56.000
magically predict it, but if there is some pattern
link |
00:40:59.000
or probability, then Solomonoff induction
link |
00:41:01.000
will figure that out eventually,
link |
00:41:04.000
and not just eventually, but rather quickly,
link |
00:41:06.000
and you can have proven convergence rates,
link |
00:41:10.000
whatever your data is.
link |
00:41:12.000
So there's pure magic in a sense.
link |
00:41:15.000
What's the catch?
link |
00:41:16.000
Well, the catch is that it's not computable,
link |
00:41:17.000
and we come back to that later.
link |
00:41:19.000
You cannot just implement it,
link |
00:41:20.000
even with Google resources here,
link |
00:41:22.000
and run it and, you know, predict the stock market
link |
00:41:24.000
and become rich.
link |
00:41:25.000
I mean, if...
link |
00:41:26.000
You know, try it at the time.
link |
00:41:29.000
So the basic task is you're in the environment,
link |
00:41:31.000
and you're interacting with the environment
link |
00:41:33.000
to try to learn a model of that environment,
link |
00:41:35.000
and the model is in the space of all these programs,
link |
00:41:38.000
and your goal is to get a bunch of programs that are simple.
link |
00:41:41.000
And so let's go to the actions now.
link |
00:41:44.000
But actually, good that you asked.
link |
00:41:45.000
Usually, I skipped this part,
link |
00:41:46.000
although there is also a minor contribution,
link |
00:41:48.000
which I did, so the action part,
link |
00:41:49.000
but I usually sort of just jump to the decision part.
link |
00:41:51.000
So let me explain the action part now.
link |
00:41:53.000
Thanks for asking.
link |
00:41:55.000
So you have to modify it a little bit
link |
00:41:58.000
by now not just predicting a sequence
link |
00:42:01.000
which just comes to you,
link |
00:42:03.000
but you have an observation, then you act somehow,
link |
00:42:06.000
and then you want to predict the next observation
link |
00:42:09.000
based on the past observation and your action.
link |
00:42:12.000
Then you take the next action.
link |
00:42:14.000
You don't care about predicting it because you're doing it.
link |
00:42:17.000
And then you get the next observation,
link |
00:42:19.000
and you want...
link |
00:42:20.000
Well, before you get it, you want to predict it again
link |
00:42:22.000
based on your past action and observation sequence.
link |
00:42:24.000
You just condition extra on your actions.
link |
00:42:28.000
There's an interesting alternative
link |
00:42:30.000
that you also try to predict your own actions.
link |
00:42:35.000
If you want...
link |
00:42:36.000
In the past or the future?
link |
00:42:38.000
Your future actions.
link |
00:42:39.000
That's interesting.
link |
00:42:41.000
Wait, let me wrap.
link |
00:42:43.000
I think my brain just broke.
link |
00:42:45.000
We should maybe discuss that later
link |
00:42:47.000
after I've explained the AIXI model.
link |
00:42:48.000
That's an interesting variation.
link |
00:42:50.000
But that is a really interesting variation.
link |
00:42:52.000
And a quick comment.
link |
00:42:54.000
I don't know if you want to insert that in here,
link |
00:42:56.000
but you're looking at the...
link |
00:42:58.000
In terms of observations,
link |
00:43:00.000
you're looking at the entire big history,
link |
00:43:02.000
the long history of the observations.
link |
00:43:04.000
That's very important, the whole history
link |
00:43:06.000
from birth sort of of the agent.
link |
00:43:08.000
And we can come back to that also while this is important here.
link |
00:43:11.000
Often, you know, in RL, you have MDPs,
link |
00:43:14.000
Markov decision processes, which are much more limiting.
link |
00:43:16.000
Okay, so now we can predict conditioned on actions.
link |
00:43:20.000
So even if they influence the environment.
link |
00:43:22.000
But prediction is not all we want to do, right?
link |
00:43:24.000
We also want to act really in the world.
link |
00:43:26.000
And the question is how to choose the actions.
link |
00:43:29.000
And we don't want to greedily choose the actions.
link |
00:43:32.000
You know, just, you know, what is best in the next time step.
link |
00:43:36.000
And we first, I should say, you know,
link |
00:43:38.000
what is, you know, how do we measure performance?
link |
00:43:40.000
So we measure performance by giving the agent reward.
link |
00:43:43.000
That's the so called reinforcement learning framework.
link |
00:43:45.000
So every time step, you can give it a positive reward
link |
00:43:48.000
or negative reward or maybe no reward.
link |
00:43:50.000
It could be a very scarce, right?
link |
00:43:52.000
Like if you play chess just at the end of the game,
link |
00:43:54.000
you give plus one for winning or minus one for losing.
link |
00:43:57.000
So in the AIXI framework, that's completely sufficient.
link |
00:43:59.000
So occasionally you give a reward signal
link |
00:44:01.000
and you ask the agent to maximize reward,
link |
00:44:04.000
but not greedily sort of, you know, the next one,
link |
00:44:06.000
next one because that's very bad in the long run
link |
00:44:08.000
if you're greedy.
link |
00:44:10.000
So, but over the lifetime of the agent.
link |
00:44:12.000
So let's assume the agent lives for M time steps.
link |
00:44:14.000
Let's say it dies in sort of 100 years sharp.
link |
00:44:17.000
That's just, you know, the simplest model to explain.
link |
00:44:19.000
So it looks at the future reward sum and ask,
link |
00:44:23.000
what is my action sequence?
link |
00:44:25.000
Well, actually more precisely my policy,
link |
00:44:27.000
which leads in expectation because I don't know the world
link |
00:44:31.000
to the maximum reward sum.
link |
00:44:34.000
Let me give you an analogy.
link |
00:44:36.000
In chess, for instance, we know how to play optimally in theory.
link |
00:44:40.000
It's just a mini max strategy.
link |
00:44:42.000
I play the move which seems best to me under the assumption
link |
00:44:45.000
that the opponent plays the move which is best for him.
link |
00:44:48.000
So best, so worst for me under the assumption that he,
link |
00:44:51.000
I play again the best move.
link |
00:44:54.000
And then you have this expectimax tree to the end of the game.
link |
00:44:57.000
And then you back propagate and then you get the best possible move.
link |
00:45:00.000
So that is the optimal strategy,
link |
00:45:02.000
which von Neumann already figured out a long time ago
link |
00:45:05.000
for playing adversarial games.
link |
00:45:08.000
Luckily, or maybe unluckily for the theory,
link |
00:45:11.000
it becomes harder: the world is not always adversarial.
link |
00:45:14.000
So it can be, if the others are humans, even cooperative,
link |
00:45:18.000
or nature is usually, I mean the dead nature is stochastic.
link |
00:45:22.000
Things just happen randomly or don't care about you.
link |
00:45:26.000
So what you have to take into account is the noise
link |
00:45:29.000
and not necessarily adversariality.
link |
00:45:31.000
So you replace the minimum on the opponent's side
link |
00:45:34.000
by an expectation, which is general enough to include
link |
00:45:37.000
also adversarial cases.
link |
00:45:40.000
So now instead of a mini max strategy,
link |
00:45:42.000
you have an expectimax strategy.
link |
00:45:44.000
So far, so good.
link |
00:45:45.000
So that is well known.
link |
00:45:46.000
It's called sequential decision theory.
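A tiny sketch of the expectimax recursion he is describing, on a hand-built tree where chance nodes (the environment) replace the adversary's min (the tree and numbers are made up for illustration):

```python
def expectimax(node):
    """node is a terminal reward (number), ('max', [children]) for the agent's
    choice, or ('chance', [(prob, child), ...]) for the stochastic environment."""
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectimax(child) for child in children)
    if kind == "chance":
        return sum(p * expectimax(child) for p, child in children)
    raise ValueError(kind)

# The agent picks an action; the environment then responds stochastically.
tree = ("max", [
    ("chance", [(0.5, 10), (0.5, -2)]),   # action A: expected reward 4.0
    ("chance", [(0.9, 3), (0.1, 8)]),     # action B: expected reward 3.5
])
print(expectimax(tree))                    # 4.0
```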
link |
00:45:48.000
But the question is on which probability distribution
link |
00:45:51.000
do you base that?
link |
00:45:53.000
If I have the true probability distribution,
link |
00:45:55.000
like say I play backgammon, right?
link |
00:45:57.000
There's dice and there's certain randomness involved.
link |
00:46:00.000
I can calculate probabilities and feed it in the expected max
link |
00:46:03.000
or the sequential decision tree come up with the optimal decision
link |
00:46:06.000
if I have enough compute.
link |
00:46:08.000
But for the real world, we don't know that.
link |
00:46:10.000
What is the probability the driver in front of me brakes?
link |
00:46:14.000
I don't know.
link |
00:46:15.000
So it depends on all kinds of things
link |
00:46:17.000
and especially new situations.
link |
00:46:19.000
I don't know.
link |
00:46:20.000
So this is this unknown thing about prediction
link |
00:46:23.000
and that's where Solomonoff comes in.
link |
00:46:25.000
So what you do is in sequential decision tree,
link |
00:46:27.000
you just replace the true distribution,
link |
00:46:29.000
which we don't know by this universal distribution.
link |
00:46:33.000
I didn't explicitly talk about it,
link |
00:46:35.000
but this is used for universal prediction
link |
00:46:37.000
and you plug it into the sequential decision tree mechanism.
link |
00:46:40.000
And then you get the best of both worlds.
link |
00:46:42.000
You have a long term planning agent,
link |
00:46:45.000
but it doesn't need to know anything about the world
link |
00:46:48.000
because the Solomonoff induction part learns.
link |
00:46:51.000
Can you explicitly try to describe the universal distribution
link |
00:46:56.000
and how Solomonoff induction plays a role here?
link |
00:47:00.000
I'm trying to understand.
link |
00:47:01.000
So what he does is, in the simplest case,
link |
00:47:04.000
he said take the shortest program describing your data, run it,
link |
00:47:07.000
have a prediction which would be deterministic.
link |
00:47:09.000
Yes.
link |
00:47:10.000
Okay.
link |
00:47:11.000
But you should not just take the shortest program,
link |
00:47:13.000
but also consider the longer ones,
link |
00:47:15.000
but give them lower a priori probability.
link |
00:47:18.000
So in the Bayesian framework,
link |
00:47:20.000
you say a priori, any distribution,
link |
00:47:25.000
which is a model or a stochastic program,
link |
00:47:29.000
has a certain a priori probability,
link |
00:47:31.000
which is two to the minus the length of this program,
link |
00:47:34.000
and why two to the minus the length, I could explain.
link |
00:47:36.000
So longer programs are punished, a priori.
link |
00:47:40.000
And then you multiply it with the so called likelihood function,
link |
00:47:44.000
which is, as the name suggests,
link |
00:47:47.000
how likely this model is given the data at hand.
link |
00:47:51.000
So if you have a very wrong model,
link |
00:47:53.000
it's very unlikely that this model is true.
link |
00:47:55.000
And so it is a very small number.
link |
00:47:57.000
So even if the model is simple, it gets penalized by that.
link |
00:48:00.000
And what you do is then you take just the sum,
link |
00:48:02.000
or this weighted average, over it.
link |
00:48:04.000
And this gives you a probability distribution.
link |
00:48:07.000
So it's a universal distribution,
link |
00:48:09.000
also called the Solomonoff distribution.
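As a rough numerical sketch of this mixture idea (not actual Solomonoff induction, which sums over all programs and is incomputable), here are three hand-made candidate "models" of a bit string, each weighted by a two-to-the-minus-length prior times its likelihood; the description lengths and probabilities are invented for illustration.

```python
# Minimal sketch of the Bayesian mixture behind the universal distribution.
data = [1, 1, 1, 1, 1, 0, 1, 1]

# Each model: (made-up description length in bits, probability it assigns to a '1').
models = {
    "always_one": (5, 1.0),     # short program, but cannot explain the 0
    "biased_coin": (12, 0.85),  # longer program, fits the data well
    "fair_coin": (8, 0.5),      # medium length, mediocre fit
}

def likelihood(p_one, bits):
    """Probability the model assigns to the observed sequence."""
    prob = 1.0
    for b in bits:
        prob *= p_one if b == 1 else (1.0 - p_one)
    return prob

# Posterior weight of each model: prior 2^(-length) times likelihood.
weights = {name: 2.0 ** (-length) * likelihood(p, data)
           for name, (length, p) in models.items()}
total = sum(weights.values())

# Mixture prediction for the next bit: weighted average of each model's prediction.
next_bit_prob = sum(w * models[name][1] for name, w in weights.items()) / total
print(next_bit_prob)
```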
link |
00:48:10.000
So it's weighted by the simplicity of the program
link |
00:48:13.000
and the likelihood.
link |
00:48:14.000
Yes.
link |
00:48:15.000
It's kind of a nice idea.
link |
00:48:17.000
Yeah.
link |
00:48:18.000
So okay.
link |
00:48:19.000
And then you said there's,
link |
00:48:21.000
you're playing N or M, I forgot the letter,
link |
00:48:24.000
steps into the future.
link |
00:48:26.000
So how difficult is that problem?
link |
00:48:28.000
What's involved there?
link |
00:48:29.000
Okay.
link |
00:48:30.000
It's a basic optimization problem.
link |
00:48:31.000
What are we talking about?
link |
00:48:32.000
Yeah.
link |
00:48:33.000
So you have a planning problem up to horizon M
link |
00:48:35.000
and that's exponential time in the horizon M,
link |
00:48:38.000
which is, I mean, it's computable, but intractable.
link |
00:48:41.000
I mean, even for chess, it's already intractable
link |
00:48:43.000
to do that exactly.
link |
00:48:44.000
And, you know, for Go...
link |
00:48:45.000
But it could be also a discounted kind of framework.
link |
00:48:48.000
Yeah.
link |
00:48:49.000
So having a hard horizon, you know,
link |
00:48:52.000
at 100 years, is just for simplicity
link |
00:48:54.000
of discussing the model, and also sometimes the math is simpler.
link |
00:48:58.000
But there are lots of variations.
link |
00:49:00.000
Actually, it's quite an interesting parameter.
link |
00:49:02.000
There's nothing really problematic about it,
link |
00:49:07.000
but it's very interesting.
link |
00:49:08.000
So for instance, you think, no,
link |
00:49:10.000
let's let the parameter M tend to infinity.
link |
00:49:12.000
Right.
link |
00:49:13.000
You want an agent which lives forever.
link |
00:49:15.000
Right.
link |
00:49:16.000
If you do it now, you have two problems.
link |
00:49:17.000
First, the mathematics breaks down because you have an infinite
link |
00:49:20.000
reward sum, which may give infinity, and getting reward 0.1
link |
00:49:24.000
every time step gives infinity, and getting reward 1 every time step
link |
00:49:27.000
also gives infinity.
link |
00:49:28.000
So they're equally good.
link |
00:49:29.000
Not really what we want.
link |
00:49:31.000
The other problem is that if you have an infinite life,
link |
00:49:35.000
you can be lazy for as long as you want for 10 years
link |
00:49:38.000
and then catch up with the same expected reward.
link |
00:49:41.000
And, you know, think about yourself or, you know,
link |
00:49:44.000
or maybe, you know, some friends or so.
link |
00:49:46.000
Um, if they knew they lived forever, you know,
link |
00:49:50.000
why work hard now?
link |
00:49:51.000
You know, just enjoy your life, you know,
link |
00:49:53.000
and then catch up later.
link |
00:49:54.000
So that's another problem with the infinite horizon.
link |
00:49:56.000
And you mentioned, yes, we can go to discounting.
link |
00:49:59.000
But then the standard discounting is so called geometric discounting.
link |
00:50:02.000
So a dollar today is worth about as much as, you know,
link |
00:50:06.000
$1.05 tomorrow.
link |
00:50:08.000
So if you do the so called geometric discounting,
link |
00:50:10.000
you have introduced an effective horizon.
link |
00:50:12.000
So, um, the agent is now motivated to look ahead a certain amount
link |
00:50:16.000
of time effectively.
link |
00:50:18.000
It's like a moving horizon.
link |
00:50:20.000
And for any fixed effective horizon, there is a problem
link |
00:50:25.000
to solve which requires a larger horizon.
link |
00:50:28.000
So if I look ahead, you know, five time steps,
link |
00:50:30.000
I'm a terrible chess player, right?
link |
00:50:32.000
I need to look ahead longer.
link |
00:50:34.000
If I play go, I probably have to look ahead even longer.
link |
00:50:36.000
So for every horizon,
link |
00:50:40.000
there is a problem which this horizon cannot solve.
link |
00:50:43.000
But I introduced the so called near harmonic horizon,
link |
00:50:46.000
which goes down with one over t rather than exponentially in t,
link |
00:50:49.000
which produces an agent which effectively looks into the future,
link |
00:50:53.000
proportionally to its age.
link |
00:50:55.000
So if it's five years old, it plans for five years.
link |
00:50:57.000
If it's a hundred years old, it then plans for a hundred years.
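A small numerical sketch of the difference between the two discount schemes, under an illustrative definition of "effective horizon" (the number of future steps needed to cover half the remaining discounted weight); the exponent just above 1 and the truncation are assumptions made for this example, not the exact construction in the AIXI literature.

```python
# Sketch: compare effective horizons under geometric vs ~1/t discounting.
# "Effective horizon" here: steps needed to cover 50% of the remaining
# discounted weight -- an illustrative choice, not a precise definition.

def effective_horizon(weights, fraction=0.5):
    total = sum(weights)
    acc = 0.0
    for k, w in enumerate(weights, start=1):
        acc += w
        if acc >= fraction * total:
            return k
    return len(weights)

T = 100_000  # truncate the infinite sums at a large finite time

for age in (5, 100, 1000):
    # Geometric: weight gamma^k for looking k steps ahead -- independent of age.
    gamma = 0.95
    geo = [gamma ** k for k in range(1, T)]
    # Near-harmonic: weight ~ 1/t at absolute time t = age + k, exponent just
    # above 1 so the sum converges; older agents spread weight over more steps.
    har = [1.0 / (age + k) ** 1.1 for k in range(1, T)]
    print(age, effective_horizon(geo), effective_horizon(har))
```

The geometric column stays fixed regardless of age, while the near-harmonic column grows as the agent gets older, which is the behavior described above.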
link |
00:50:59.000
Interesting.
link |
00:51:00.000
And it's a little bit similar to humans too, right?
link |
00:51:02.000
I mean, children don't plan ahead very long,
link |
00:51:04.000
but when we become adults, we plan ahead longer.
link |
00:51:07.000
Maybe when we get very old, I mean, we know that we don't live forever.
link |
00:51:10.000
You know, maybe then our horizon shrinks again.
link |
00:51:13.000
So that's really interesting.
link |
00:51:16.000
So adjusting the horizon, is there some mathematical benefit
link |
00:51:19.000
to that, or is it just nice? I mean, intuitively, empirically,
link |
00:51:25.000
it will probably be a good idea to sort of push the horizon back,
link |
00:51:28.000
to extend the horizon as you experience more of the world.
link |
00:51:33.000
But is there some mathematical conclusions here that are beneficial?
link |
00:51:37.000
With Solomonoff, with the actual sort of prediction part,
link |
00:51:39.000
we have extremely strong finite time, or rather finite data, results.
link |
00:51:44.000
So if you have so and so much data, then you lose so and so much.
link |
00:51:47.000
So that part is really great.
link |
00:51:49.000
With the AIXI model, with the planning part,
link |
00:51:51.000
many results are only asymptotic, which, well...
link |
00:51:56.000
What is asymptotic?
link |
00:51:58.000
Asymptotic means you can prove, for instance, that in the long run,
link |
00:52:01.000
if the agent, you know, acts long enough, then, you know,
link |
00:52:04.000
it performs optimally, or some nice thing happens.
link |
00:52:06.000
So, but you don't know how fast it converges, yeah?
link |
00:52:09.000
So it may converge fast, but we're just not able to prove it
link |
00:52:12.000
because it's a difficult problem.
link |
00:52:14.000
Maybe there's a bug in the model so that it's really that slow.
link |
00:52:19.000
Yeah.
link |
00:52:20.000
So that is what asymptotic means, sort of, eventually,
link |
00:52:23.000
but we don't know how fast.
link |
00:52:25.000
And if I give the agent a fixed horizon M, yeah,
link |
00:52:29.000
then I cannot prove asymptotic results, right?
link |
00:52:32.000
So, I mean, sort of, if it dies in 100 years,
link |
00:52:35.000
then in 100 years it's over.
link |
00:52:37.000
I cannot say eventually.
link |
00:52:38.000
So this is the advantage of the discounting
link |
00:52:40.000
that I can prove asymptotic results.
link |
00:52:43.000
So, just to clarify: okay,
link |
00:52:47.000
I've built up a model.
link |
00:52:49.000
Well, now in the moment, I have this way of looking several steps ahead.
link |
00:52:55.000
How do I pick what action I will take?
link |
00:52:58.000
It's like with playing chess, right?
link |
00:53:01.000
You do this minimax.
link |
00:53:02.000
In this case here, you do expectimax based on the Solomonoff distribution.
link |
00:53:06.000
You propagate back, and then an action falls out:
link |
00:53:12.000
the action which maximizes the future expected reward
link |
00:53:15.000
under the Solomonoff distribution, and then you just take this action.
link |
00:53:18.000
And then repeat.
link |
00:53:19.000
And then you get a new observation, and you feed in this action and
link |
00:53:22.000
observation, then you repeat.
link |
00:53:23.000
And the reward, so on.
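Putting the loop just described into a minimal sketch: plan by expectimax under a predictive distribution, act, observe, append to the history, and repeat. The `predict`, `plan`, and `environment_step` helpers below are hypothetical placeholders; a real system would use (an approximation of) the Solomonoff mixture and a much deeper search.

```python
from typing import List, Tuple

History = List[Tuple[int, int, float]]   # (action, observation, reward) triples

def predict(history: History, action: int) -> List[Tuple[int, float, float]]:
    """Hypothetical stand-in for the universal mixture: returns a list of
    (observation, reward, probability) outcomes for taking `action`."""
    return [(0, 0.0, 0.5), (1, 1.0, 0.5)]

def plan(history: History, actions: List[int], horizon: int) -> int:
    """Expectimax to a small horizon: pick the action maximizing expected
    future reward under the predictive distribution."""
    def value(h: History, depth: int) -> float:
        if depth == 0:
            return 0.0
        return max(
            sum(p * (r + value(h + [(a, o, r)], depth - 1))
                for o, r, p in predict(h, a))
            for a in actions
        )
    return max(actions,
               key=lambda a: sum(p * (r + value(history + [(a, o, r)], horizon - 1))
                                 for o, r, p in predict(history, a)))

def environment_step(action: int) -> Tuple[int, float]:
    """Hypothetical environment: returns (observation, reward)."""
    return (action, float(action))

history: History = []
for cycle in range(5):
    a = plan(history, actions=[0, 1], horizon=3)   # expectimax under the mixture
    o, r = environment_step(a)                      # act, then observe and get reward
    history.append((a, o, r))                       # feed everything back into the history
print(history)
```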
link |
00:53:24.000
Yeah.
link |
00:53:25.000
So you're enrolled too, yeah.
link |
00:53:26.000
And then maybe you can even predict your own action.
link |
00:53:29.000
I love the idea.
link |
00:53:30.000
But, okay, this big framework, what is it?
link |
00:53:34.000
I mean, it's kind of a beautiful mathematical framework
link |
00:53:38.000
to think about artificial general intelligence.
link |
00:53:41.000
What does it help you intuit about how to build such systems?
link |
00:53:49.000
Or maybe from another perspective, what does it help us in understanding AGI?
link |
00:53:56.000
So when I started in the field, I was always interested in two things.
link |
00:54:02.000
One was, you know, AGI.
link |
00:54:04.000
The name didn't exist then.
link |
00:54:06.000
It was called general AI or strong AI. And the other was the physics theory of everything.
link |
00:54:11.000
So I switched back and forth between computer science and physics quite often.
link |
00:54:14.000
You said the theory of everything.
link |
00:54:16.000
The theory of everything.
link |
00:54:17.000
Those are basically the biggest problems before all of humanity.
link |
00:54:23.000
Yeah, I can explain if you wanted some later time,
link |
00:54:28.000
why I'm interested in these two questions.
link |
00:54:30.000
Can I ask you, on a small tangent?
link |
00:54:33.000
If only one were to be solved, which one would you pick?
link |
00:54:38.000
If an apple fell on your head and there was a brilliant insight,
link |
00:54:43.000
and you could arrive at the solution to one, would it be AGI or the theory of everything?
link |
00:54:49.000
Definitely AGI, because once the AGI problem is solved,
link |
00:54:52.000
I can ask the AGI to solve the other problem for me.
link |
00:54:56.000
Yeah, brilliantly put.
link |
00:54:58.000
Okay, so as you were saying about it.
link |
00:55:01.000
Okay, so, the reason why I didn't settle:
link |
00:55:05.000
I mean, this thought that, you know, once you have solved AGI,
link |
00:55:08.000
it solves all kinds of other problems, not just the theory of everything problem,
link |
00:55:11.000
but all kinds of problems more useful to humanity, is very appealing to many people.
link |
00:55:16.000
And, you know, I had this thought also,
link |
00:55:18.000
but I was quite disappointed with the state of the art of the field of AI.
link |
00:55:25.000
There was some theory, you know, about logical reasoning,
link |
00:55:28.000
but I was never convinced that this will fly.
link |
00:55:30.000
And then there were these more heuristic approaches with neural networks,
link |
00:55:34.000
and I didn't like these heuristics.
link |
00:55:37.000
So, and also I didn't have any good idea myself.
link |
00:55:41.000
So that's the reason why I toggled back and forth quite some while
link |
00:55:45.000
and even worked for four and a half years in a company developing software
link |
00:55:48.000
for something completely unrelated.
link |
00:55:50.000
But then I had this idea about the AIXI model.
link |
00:55:53.000
And so what it gives you, it gives you a gold standard.
link |
00:55:58.000
So I have proven that this is the most intelligent agent
link |
00:56:02.000
which anybody could build, in quotation marks,
link |
00:56:07.000
because it's just mathematical and you need infinite compute.
link |
00:56:11.000
But this is the limit.
link |
00:56:13.000
And this is completely specified.
link |
00:56:15.000
It's not just a framework.
link |
00:56:17.000
You know, every year, tens of frameworks are developed with just skeletons
link |
00:56:22.000
and then pieces are missing.
link |
00:56:24.000
And usually these missing pieces, you know, turn out to be really, really difficult.
link |
00:56:27.000
And so this is completely and uniquely defined.
link |
00:56:31.000
And we can analyze that mathematically.
link |
00:56:33.000
And we've also developed some approximations.
link |
00:56:37.000
I can talk about that a little bit later.
link |
00:56:40.000
That would be sort of the top down approach, like say von Neumann's minimax theory,
link |
00:56:44.000
that's the theoretical optimal play of games.
link |
00:56:47.000
And now we need to approximate it, put heuristics in, prune the tree, blah, blah, blah, and so on.
link |
00:56:51.000
So we can do that also with the AIXI model, but for general AI.
link |
00:56:55.000
It can also inspire those, and most researchers go bottom up, right?
link |
00:57:01.000
They have their systems and try to make them more general, more intelligent.
link |
00:57:04.000
It can inspire in which direction to go.
link |
00:57:07.000
What do you mean by that?
link |
00:57:09.000
So if you have some choice to make, right?
link |
00:57:11.000
Like, how should I evaluate my system if I can't do cross validation?
link |
00:57:15.000
How should I do my learning if my standard regularization doesn't work well?
link |
00:57:21.000
So the answer is always this: we have a system which does everything, and that's AIXI.
link |
00:57:25.000
It's just completely in the ivory tower, completely useless from a practical point of view.
link |
00:57:30.000
But you can look at it and see, ah, yeah, maybe I can take some aspects.
link |
00:57:35.000
And instead of Kolmogorov complexity, you just take some compressor which has been developed so far.
link |
00:57:40.000
And for the planning, well, we have UCT, which has also been used in Go.
link |
00:57:45.000
And at least it's inspired me a lot to have this formal definition.
link |
00:57:54.000
And if you look at other fields, you know, like I always come back to physics because I have a physics background.
link |
00:57:59.000
Think about the phenomenon of energy, which was for a long time a mysterious concept.
link |
00:58:03.000
And at some point it was completely formalized and that really helped a lot.
link |
00:58:08.000
And you can point out a lot of these things which were first mysterious and vague.
link |
00:58:13.000
And then they have been rigorously formalized.
link |
00:58:15.000
Speed and acceleration had been confused, right, until they were formally defined.
link |
00:58:20.000
There was a time like this.
link |
00:58:21.000
And people who don't have any background, you know, often still confuse it.
link |
00:58:27.000
So this AIXI model, or the intelligence definition, which is sort of the dual to it,
link |
00:58:33.000
we come back to that later, formalizes the notion of intelligence uniquely and rigorously.
link |
00:58:39.000
So, in a sense, it serves as kind of the light at the end of the tunnel.
link |
00:58:43.000
Yes, yeah.
link |
00:58:45.000
So, I mean, there's a million questions I could ask here.
link |
00:58:48.000
So, maybe kind of, okay, let's feel around in the dark a little bit.
link |
00:58:52.000
So, there have been, here at DeepMind but in general, a lot of breakthrough ideas,
link |
00:58:57.000
just like we've been saying around reinforcement learning.
link |
00:58:59.000
So, how do you see the progress in reinforcement learning as different?
link |
00:59:04.000
Like, which subset of IXI does it occupy the current?
link |
00:59:09.000
Like you said, maybe the Markov assumption is made quite often in reinforcement learning.
link |
00:59:16.000
There's other assumptions made in order to make the system work.
link |
00:59:21.000
What do you see as the difference and connection between reinforcement learning and AIXI?
link |
00:59:26.000
So, the major difference is that essentially all other approaches,
link |
00:59:33.000
they make stronger assumptions.
link |
00:59:35.000
So, in reinforcement learning, the Markov assumption is that the next state or next observation
link |
00:59:41.000
only depends on the previous observation and not the whole history,
link |
00:59:45.000
which makes, of course, the mathematics much easier rather than dealing with histories.
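A tiny illustration of what the Markov assumption buys and loses: in the made-up sequence below the next symbol is ambiguous given only the previous observation, but deterministic given two symbols of history.

```python
from collections import defaultdict, Counter

# Made-up sequence with the pattern 0,0,1 repeating. After a single 0 the next
# symbol is genuinely ambiguous, but given the last *two* symbols it is
# deterministic -- an order-1 Markov model cannot capture that.

seq = [0, 0, 1] * 20

def empirical_predictor(order):
    """Count, for each length-`order` context, what symbol follows it."""
    counts = defaultdict(Counter)
    for i in range(order, len(seq)):
        counts[tuple(seq[i - order:i])][seq[i]] += 1
    return counts

for order in (1, 2):
    counts = empirical_predictor(order)
    print(f"order {order}:")
    for ctx, c in sorted(counts.items()):
        total = sum(c.values())
        print("  context", ctx, {sym: round(n / total, 2) for sym, n in c.items()})
# order 1: after context (0,) the model can only say "0 or 1, roughly 50/50"
# order 2: every context predicts the next symbol with probability 1
```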
link |
00:59:49.000
Of course, they profit from it also because then you have algorithms that run on current computers
link |
00:59:54.000
and do something practically useful.
link |
00:59:56.000
But for general AI, all the assumptions which are made by other approaches,
link |
01:00:01.000
we know already now they are limiting.
link |
01:00:04.000
So, for instance, usually you need an ergodicity assumption in the MDP framework in order to learn.
link |
01:00:11.000
Ergodicity essentially means that you can recover from your mistakes
link |
01:00:15.000
and that there are no traps in the environment.
link |
01:00:17.000
And if you make this assumption, then essentially you can go back to a previous state,
link |
01:00:22.000
go there a couple of times, and then learn the statistics and what the state is like,
link |
01:00:29.000
And then in the long run perform well in this state.
link |
01:00:33.000
But there are no fundamental problems.
link |
01:00:35.000
But in real life, we know it can be one single action:
link |
01:00:38.000
One second of being inattentive while driving a car fast can ruin the rest of my life.
link |
01:00:45.000
I can become quadriplegic or whatever.
link |
01:00:48.000
So, there's no recovery anymore.
link |
01:00:50.000
So, the real world is not ergodic, I always say.
link |
01:00:52.000
There are traps, and there are situations you cannot recover from.
link |
01:00:56.000
And very little theory has been developed for this case.
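A minimal made-up example of a non-ergodic environment: a two-state world with a trap from which no action recovers, so a single exploratory mistake is permanent.

```python
# Tiny made-up MDP illustrating a non-ergodic environment: from the "trap"
# state no action can ever return the agent to the useful part of the world.

transitions = {
    # state: {action: next_state}
    "start": {"safe": "start", "risky": "trap"},
    "trap":  {"safe": "trap",  "risky": "trap"},   # no way back: not ergodic
}
rewards = {"start": 1.0, "trap": 0.0}

def rollout(policy, steps=10):
    state, total = "start", 0.0
    for _ in range(steps):
        state = transitions[state][policy(state)]
        total += rewards[state]
    return total

print(rollout(lambda s: "safe"))    # 10.0
print(rollout(lambda s: "risky"))   # 0.0 -- one bad action ruins everything after it
```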
link |
01:01:02.000
What about...
link |
01:01:05.000
What do you see, in the context of AIXI, as the role of exploration?
link |
01:01:10.000
Sort of...
link |
01:01:13.000
You mentioned in the real world, you can get into trouble when we make the wrong decisions and really pay for it.
link |
01:01:19.000
But exploration seems to be fundamentally important for learning about this world, for gaining new knowledge.
link |
01:01:25.000
So, is exploration baked in?
link |
01:01:29.000
Another way to ask it: what are the parameters of AIXI that can be controlled?
link |
01:01:36.000
Yeah, I'd say the good thing is that there are no parameters to control.
link |
01:01:40.000
Some other people try knobs to control and you can do that.
link |
01:01:44.000
I mean, you can modify AIXI so that you have some knobs to play with if you want to.
link |
01:01:48.000
But the exploration is directly baked in.
link |
01:01:53.000
And that comes from the Bayesian learning and the long term planning.
link |
01:01:58.000
So, these together already imply exploration.
link |
01:02:04.000
You can nicely and explicitly prove that for simple problems like so called bandit problems,
link |
01:02:13.000
where you say to give a real world example, say you have two medical treatments, A and B,
link |
01:02:20.000
you don't know the effectiveness, you try A a little bit, B a little bit,
link |
01:02:23.000
but you don't want to harm too many patients.
link |
01:02:26.000
So, you have to sort of trade off exploring and exploiting, and at some point you want to exploit.
link |
01:02:32.000
And you can do the mathematics and figure out the optimal strategy.
link |
01:02:38.000
There are so-called Bayesian agents; there are also non-Bayesian agents.
link |
01:02:41.000
But it shows that this Bayesian framework, by taking a prior over possible worlds,
link |
01:02:47.000
doing the Bayesian mixture, then the Bayes-optimal decision with long term planning,
link |
01:02:51.000
that is important, automatically implies exploration also to the proper extent.
link |
01:02:58.000
Not too much exploration and not too little.
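As a small illustration of Bayesian exploration in the two-treatment bandit just described, here is a sketch using Thompson sampling over Beta posteriors; note this is a standard Bayesian heuristic standing in for the full Bayes-optimal long-term planner Hutter refers to, and the success probabilities are invented.

```python
import random

# Two "treatments" with unknown success probabilities (hidden from the agent).
true_success = {"A": 0.6, "B": 0.4}

# Beta(1,1) priors over each treatment's success rate: [alpha, beta] counts.
posterior = {"A": [1, 1], "B": [1, 1]}

random.seed(0)
for patient in range(200):
    # Thompson sampling: draw a plausible success rate from each posterior and
    # treat the patient with the arm that looks best under that draw. Early on
    # the posteriors are wide, so both arms get tried (exploration); as data
    # accumulates, the better arm dominates (exploitation).
    draws = {arm: random.betavariate(a, b) for arm, (a, b) in posterior.items()}
    arm = max(draws, key=draws.get)
    success = random.random() < true_success[arm]
    posterior[arm][0 if success else 1] += 1

print(posterior)   # most trials end up on the better treatment A
```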
link |
01:03:00.000
In these very simple settings.
link |
01:03:02.000
For the AIXI model, I was also able to prove a self-optimizing theorem,
link |
01:03:06.000
or asymptotic optimality theorem, although only asymptotically, not with finite time bounds.
link |
01:03:10.000
So, it seems like the long term planning is really important,
link |
01:03:13.000
but the long term part of the planning is really important.
link |
01:03:16.000
So, maybe a quick tangent.
link |
01:03:19.000
How important do you think is removing the Markov assumption and looking at the full history?
link |
01:03:25.000
Intuitively, of course, it's important, but is it fundamentally transformative
link |
01:03:31.000
to the entirety of the problem?
link |
01:03:33.000
What's your sense of it?
link |
01:03:35.000
Because we make that assumption quite often, just throwing away the past.
link |
01:03:40.000
I think it's absolutely crucial.
link |
01:03:43.000
The question is whether there's a way to deal with it
link |
01:03:47.000
in a more heuristic and still sufficiently well way.
link |
01:03:52.000
So, I have to come up with an example on the fly,
link |
01:03:56.000
but you have some key event in your life a long time ago,
link |
01:04:01.000
in some city or something, you realize it's a really dangerous street or whatever, right?
link |
01:04:05.000
And you want to remember that forever, right, in case you come back there.
link |
01:04:10.000
Kind of a selective kind of memory.
link |
01:04:12.000
You remember all the important events in the past,
link |
01:04:15.000
but somehow selecting the importance is...
link |
01:04:17.000
That's very hard, yeah.
link |
01:04:19.000
And I'm not concerned about just storing the whole history.
link |
01:04:22.000
You can calculate, for a human life of, say, 30 or 100 years, it doesn't matter, right,
link |
01:04:28.000
how much data comes in through the vision system and the auditory system.
link |
01:04:33.000
You compress it a little bit, in this case, lossily, and store it.
link |
01:04:37.000
We will soon have the means of just storing it.
link |
01:04:40.000
But you still need the selection for the planning part
link |
01:04:45.000
and the compression for the understanding part.
link |
01:04:47.000
The raw storage I'm really not concerned about.
link |
01:04:50.000
And I think we should just store, if you develop an agent,
link |
01:04:54.000
preferably just store all the interaction history.
link |
01:04:59.000
And then you build, of course, models on top of it and you compress it
link |
01:05:03.000
and you are selective, but occasionally you go back to the old data
link |
01:05:08.000
and reanalyze it based on your new experience you have.
link |
01:05:12.000
Sometimes you are in school, you learn all these things
link |
01:05:15.000
you think are totally useless, and much later you realize,
link |
01:05:18.000
oh, they were not as useless as you thought.
link |
01:05:22.000
I'm looking at you linear algebra.
link |
01:05:24.000
Right.
link |
01:05:25.000
So maybe let me ask about objective functions, because the rewards...
link |
01:05:30.000
It seems to be an important part.
link |
01:05:33.000
The rewards are kind of given to the system.
link |
01:05:37.000
For a lot of people, the specification of the objective function
link |
01:05:45.000
is a key part of intelligence.
link |
01:05:48.000
The agent itself figuring out what is important.
link |
01:05:52.000
What do you think about that?
link |
01:05:54.000
Is it possible within the AIXI framework to discover for yourself
link |
01:06:00.000
the reward based on which you should operate?
link |
01:06:05.000
Okay, that will be a long answer.
link |
01:06:08.000
And that is a very interesting question and I'm asked a lot about this question.
link |
01:06:14.000
Where do the rewards come from?
link |
01:06:16.000
And that depends.
link |
01:06:19.000
And I'll give you now a couple of answers.
link |
01:06:22.000
So if we want to build agents, now let's start simple.
link |
01:06:27.000
So let's assume we want to build an agent based on the AIXI model
link |
01:06:31.000
which performs a particular task.
link |
01:06:34.000
Let's start with something super simple, like playing chess or go or something.
link |
01:06:39.000
Then the reward is winning the game is plus one, losing the game is minus one.
link |
01:06:44.000
Done.
link |
01:06:45.000
You apply this agent, and if you have enough compute, you let it self-play,
link |
01:06:49.000
and it will learn the rules of the game, will play perfect chess.
link |
01:06:53.000
After some while, problem solved.
link |
01:06:55.000
So if you have more complicated problems, then you may believe
link |
01:07:03.000
that you have the right reward, but you don't.
link |
01:07:05.000
So a nice cute example is elevator control that is also in Rich Sutton's book,
link |
01:07:11.000
which is a great book, by the way.
link |
01:07:13.000
So you control the elevator and you think, well, maybe the reward should be
link |
01:07:18.000
coupled to how long people wait in front of the elevator.
link |
01:07:20.000
You know, long wait is bad.
link |
01:07:22.000
You program it and you do it.
link |
01:07:24.000
And what happens is the elevator eagerly picks up all the people but never drops them off.
link |
01:07:29.000
So then you realize that maybe the time in the elevator also counts.
link |
01:07:34.000
So you minimize the sum.
link |
01:07:36.000
And the elevator does that, but never picks up the people in the 10th floor
link |
01:07:40.000
and the top floor because in expectation, it's not worth it.
link |
01:07:43.000
Just let them stay.
link |
01:07:45.000
So even in apparently simple problems, you can make mistakes.
link |
01:07:51.000
And that's what in more serious context, say, AGI safety researchers consider.
link |
01:07:58.000
So now let's go back to general agents.
link |
01:08:01.000
So assume we want to build an agent which is generally useful to humans.
link |
01:08:05.000
Yes, we have a household robot here and it should do all kinds of tasks.
link |
01:08:10.000
So in this case, the human should give the reward on the fly.
link |
01:08:15.000
I mean, maybe it's pre trained in the factory and that there's some sort of internal reward
link |
01:08:18.000
for, you know, the battery level or whatever.
link |
01:08:20.000
Yeah, but so, you know, it does the dishes
link |
01:08:23.000
badly, you punish the robot; it does it well, you reward the robot; and then you train it on a new task.
link |
01:08:28.000
Like a child, right?
link |
01:08:29.000
So you need the human in the loop if you want a system which is useful to the human.
link |
01:08:35.000
And as long as this agent stays sub human level, that should work reasonably well.
link |
01:08:41.000
Apart from, you know, these examples, it becomes critical if they get to, you know, human level.
link |
01:08:46.000
It's like with children: small children you have reasonably well under control.
link |
01:08:49.000
When they become older,
link |
01:08:51.000
the reward technique doesn't work so well anymore.
link |
01:08:54.000
So then finally, this would be agents which are just, you could say, slaves to the humans.
link |
01:09:01.000
Yeah.
link |
01:09:02.000
So if you are more ambitious and say we want to build a new species of intelligent beings,
link |
01:09:08.000
we put them on a new planet and we want them to develop this planet or whatever.
link |
01:09:12.000
So we don't give them any reward.
link |
01:09:15.000
So what could we do?
link |
01:09:17.000
And you could try to, you know, come up with some reward functions like, you know,
link |
01:09:22.000
it should maintain itself, the robot, it should maybe multiply, build more robots, right?
link |
01:09:28.000
And, you know, maybe all kinds of things that you find useful, but that's pretty hard, right?
link |
01:09:34.000
You know, what does self maintenance mean?
link |
01:09:36.000
You know, what does it mean to build a copy?
link |
01:09:38.000
Should it be exact copy or an approximate copy?
link |
01:09:40.000
And so that's really hard.
link |
01:09:42.000
But Laurent Orseau, also at DeepMind, developed a beautiful model.
link |
01:09:48.000
So he just took the AIXI model and coupled the rewards to information gain.
link |
01:09:54.000
So he said the reward is proportional to how much the agent had learned about the world.
link |
01:10:00.000
And you can rigorously, formally, uniquely define that, in terms of the Kullback-Leibler divergence.
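A minimal sketch of this "reward = information gain" idea: the reward is how far the Bayesian posterior over world models moves away from the prior, measured by a KL divergence. The two-model world and the likelihood numbers are invented for illustration, and this is a simplification of the actual knowledge-seeking-agent construction.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def posterior(prior, likelihoods):
    unnorm = [pr * li for pr, li in zip(prior, likelihoods)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

prior = [0.5, 0.5]                    # two competing world models

# Informative observation: the models disagree about it, so seeing it teaches a lot.
post_informative = posterior(prior, likelihoods=[0.9, 0.1])
# Noise-like observation: both models assign it the same probability, so the
# posterior does not move and the information-gain reward is zero, once the
# model class can actually represent the noise.
post_noise = posterior(prior, likelihoods=[0.5, 0.5])

print("reward (informative):", kl(post_informative, prior))
print("reward (noise):      ", kl(post_noise, prior))
```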
link |
01:10:05.000
Okay.
link |
01:10:06.000
So if you put that in, you get a completely autonomous agent.
link |
01:10:09.000
And actually, interestingly, for this agent we can prove much stronger results than for the general agent, which is also nice.
link |
01:10:15.000
And if you let this agent loose, it will be, in a sense, the optimal scientist.
link |
01:10:20.000
It is absolutely curious to learn as much as possible about the world.
link |
01:10:24.000
And of course, it will also have a lot of instrumental goals, right?
link |
01:10:27.000
In order to learn, it needs to at least survive, right?
link |
01:10:30.000
A dead agent is not good for anything.
link |
01:10:32.000
So it needs to have self preservation.
link |
01:10:34.000
And if it builds small helpers acquiring more information, it will do that.
link |
01:10:38.000
Yeah.
link |
01:10:39.000
If exploration, space exploration or whatever, is necessary, right,
link |
01:10:44.000
to gather information, it will do that.
link |
01:10:46.000
So it has a lot of instrumental goals following from this information gain.
link |
01:10:51.000
And this agent is completely autonomous of us.
link |
01:10:53.000
No rewards necessary anymore.
link |
01:10:55.000
Yeah.
link |
01:10:56.000
Of course, it could find a way to game the concept of information and get stuck in that library that you mentioned beforehand,
link |
01:11:05.000
with a very large number of books.
link |
01:11:08.000
The first agent had this problem.
link |
01:11:10.000
It would get stuck in front of an old TV screen, which just had white noise.
link |
01:11:15.000
Yeah, white noise.
link |
01:11:16.000
But the second version can deal with at least stochasticity.
link |
01:11:20.000
Well, yeah.
link |
01:11:22.000
What about curiosity?
link |
01:11:23.000
This kind of word, curiosity, creativity.
link |
01:11:27.000
Is that kind of the reward function being about getting new information?
link |
01:11:32.000
Is that similar to idea of kind of injecting exploration for its own sake inside the reward function?
link |
01:11:42.000
Do you find this at all appealing?
link |
01:11:43.000
Interesting.
link |
01:11:44.000
I think that's a nice definition.
link |
01:11:46.000
Curiosity is a reward.
link |
01:11:48.000
Sorry.
link |
01:11:49.000
Curiosity is exploration for its own sake.
link |
01:11:54.000
Yeah.
link |
01:11:55.000
I would accept that.
link |
01:11:57.000
But most curiosity, well, in humans and especially in children, yeah, is not just for its own sake, but for actually learning about the environment and for behaving better.
link |
01:12:08.000
So I think most curiosity is tied, in the end, to performing better.
link |
01:12:14.000
Well, okay.
link |
01:12:15.000
So if intelligent systems need to have this reward function, let me ask: you're an intelligent system, currently passing the Turing test quite effectively.
link |
01:12:26.000
What's the reward function of our human intelligence existence?
link |
01:12:33.000
What's the reward function that Marcus Hutter is operating under?
link |
01:12:37.000
Okay.
link |
01:12:38.000
To the first question, the biological reward function is to survive and to spread.
link |
01:12:44.000
And very few humans sort of are able to overcome this biological reward function.
link |
01:12:50.000
But we live in a very nice world where we have lots of spare time and can still survive and spread.
link |
01:12:58.000
So we can develop arbitrary other interests, which is quite interesting.
link |
01:13:03.000
On top of that.
link |
01:13:04.000
On top of that.
link |
01:13:05.000
Yeah.
link |
01:13:06.000
But survival and spreading is, I would say, the goal or the reward function of humans, the core one.
link |
01:13:15.000
I like how you avoided answering the second question, which a good intelligence system would.
link |
01:13:19.000
Your own meaning of life and a reward function.
link |
01:13:24.000
My own meaning of life and reward function is to find AGI and to build it.
link |
01:13:31.000
Beautifully put.
link |
01:13:32.000
Okay.
link |
01:13:33.000
Let's dissect AIXI even further.
link |
01:13:34.000
So one of the assumptions, kind of, is that infinity keeps creeping up everywhere.
link |
01:13:42.000
What are your thoughts on bounded rationality, and on the fact that, sort of, the nature of our existence and of intelligent systems is that we're always operating under constraints, under, you know, limited time, limited resources?
link |
01:13:57.000
How do you think about that within the AIXI framework, in trying to create an AGI system that operates under these constraints?
link |
01:14:06.000
Yeah, that is one of the criticisms of AIXI, that it ignores computation completely, and some people believe that intelligence is inherently tied to bounded resources.
link |
01:14:19.000
What do you think on this one point?
link |
01:14:21.000
Do you think bounded resources are fundamental to intelligence?
link |
01:14:27.000
I would say that an intelligence notion which ignores computational limits is extremely useful.
link |
01:14:35.000
A good intelligence notion which includes these resources would be even more useful, but we don't have that yet.
link |
01:14:43.000
And so look at other fields outside of computer science.
link |
01:14:48.000
Computational aspects never play a fundamental role.
link |
01:14:52.000
You develop biological models for cells, something in physics, these theories, I mean, become more and more crazy and harder and harder to compute.
link |
01:15:00.000
Well, in the end, of course, we need to do something with these models, but that's more a nuisance than a feature.
link |
01:15:05.000
And I'm sometimes wondering if artificial intelligence would not sit in a computer science department, but in a philosophy department, then this computational focus would be probably significantly less.
link |
01:15:18.000
I mean, think about it: the induction problem is more in the philosophy department.
link |
01:15:22.000
There's virtually no paper that cares about, you know, how long it takes to compute the answer.
link |
01:15:26.000
That is completely secondary.
link |
01:15:28.000
Of course, once we have figured out the first problem, so intelligence without computational resources, then the next and very good question is,
link |
01:15:39.000
could we improve it by including computational resources? But nobody was able to do that so far in an even halfway satisfactory manner.
link |
01:15:49.000
I like that.
link |
01:15:50.000
So in the long run, the right department to belong to is philosophy.
link |
01:15:55.000
That's actually quite a deep idea, or at least to think about big picture philosophical questions, even in the computer science department.
link |
01:16:07.000
But you've mentioned approximation, sort of, there's a lot of infinity, a lot of huge resources needed.
link |
01:16:14.000
Are there approximations to AIXI, within the AIXI framework, that are useful?
link |
01:16:19.000
Yeah, we have developed a couple of approximations.
link |
01:16:22.000
And what we do there is that the Solomonoff induction part, which was, you know, find the shortest program describing your data, we just replace by standard data compressors.
link |
01:16:36.000
And the better compressors get, the better this part will become.
link |
01:16:41.000
We focus on a particular compressor called Context Tree Weighting, which is pretty amazing, not so well known.
link |
01:16:48.000
It has beautiful theoretical properties, also works reasonably well in practice.
link |
01:16:52.000
So we use that for the approximation of the induction and the learning and the prediction part.
link |
01:16:57.000
And for the planning part, we essentially just took the ideas from computer Go from 2006.
link |
01:17:07.000
It was Csaba Szepesvári, also now at DeepMind, who developed the so called UCT algorithm, the upper confidence bound for trees algorithm, on top of Monte Carlo tree search.
link |
01:17:19.000
So we approximate this planning part by sampling.
link |
01:17:23.000
And it's successful on some small toy problems.
link |
01:17:29.000
We don't want to lose the generality, right?
link |
01:17:33.000
And that's sort of the handicap, right?
link |
01:17:35.000
If you want to be general, you have to give up something.
link |
01:17:39.000
But this single agent was able to play, you know, small games like Kuhn poker and tic-tac-toe and even Pac-Man.
link |
01:17:49.000
And the same architecture, no change; the agent doesn't know the rules of the game, really nothing at all, and learns by itself, by playing with these environments.
link |
01:17:59.000
So Jürgen Schmidhuber proposed something called the Gödel machine, which is a self improving program that rewrites its own code.
link |
01:18:09.000
Sort of mathematically or philosophically, what's the relationship in your eyes, if you're familiar with it, between AIXI and the Gödel machine?
link |
01:18:18.000
Yeah, familiar with it. He developed it while I was in his lab.
link |
01:18:22.000
Yeah, so the Gödel machine, to explain it briefly: you give it a task.
link |
01:18:28.000
It could be as simple a task as, you know, finding prime factors of numbers, right?
link |
01:18:32.000
You can formally write it down. There's a very slow algorithm to do that.
link |
01:18:35.000
Just try all the factors. Yeah.
link |
01:18:37.000
Or play chess, right?
link |
01:18:39.000
Optimally: you write the minimax algorithm to the end of the game.
link |
01:18:42.000
So you write down what the Gödel machine should do.
link |
01:18:46.000
Then it will take part of its resources to run this program and another part of its resources to improve this program.
link |
01:18:54.000
And when it finds an improved version, which provably computes the same answer...
link |
01:19:01.000
So that's the key part. Yeah, it needs to prove by itself that this changed program still satisfies the original specification.
link |
01:19:09.000
And if it does so, then it replaces the original program by the improved program. And by definition, it does the same job, but just faster.
link |
01:19:16.000
Okay. And then, you know, it improves it over and over.
link |
01:19:19.000
And it's developed in a way that all parts of this Gödel machine can self improve, but it stays provably consistent with the original specification.
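A conceptual sketch of that loop: run the current solver, search for a candidate rewrite, and swap it in only if it is shown equivalent. Here the "proof" is just a toy spot-check of a known identity, whereas the real Gödel machine searches formal proofs about its own code.

```python
# Conceptual sketch of the Gödel machine idea, heavily simplified.

def slow_sum(n):
    """Initial, provably correct but slow solver: sum 1..n by looping."""
    return sum(range(1, n + 1))

def fast_sum(n):
    """Candidate rewrite: closed-form Gauss formula."""
    return n * (n + 1) // 2

def provably_equivalent(f, g):
    """Toy 'proof': in reality the machine must derive, within its own proof
    system, that g computes the same function as f for all inputs. Here we
    only spot-check a few inputs, which is NOT a proof, just an illustration."""
    return all(f(n) == g(n) for n in range(0, 1000))

current_solver = slow_sum
candidate = fast_sum
if provably_equivalent(current_solver, candidate):
    current_solver = candidate     # self-modification: replace the program
print(current_solver(10**6))       # same answer as before, computed faster
```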
link |
01:19:31.000
So from this perspective, it has nothing to do with AIXI.
link |
01:19:36.000
But if you would now put AIXI in as the starting axioms, it would run AIXI.
link |
01:19:42.000
But, you know, that takes forever.
link |
01:19:45.000
But then if it finds a provable speed up of AIXI, it would replace it by that, and again and again, and maybe eventually it comes up with a model which is still the AIXI model.
link |
01:19:55.000
I mean, just for the knowledgeable reader, AIXI is incomputable, and I can prove that; therefore there cannot be a computable exact algorithm.
link |
01:20:08.000
There then need to be some approximations,
link |
01:20:10.000
and this is not dealt with by the Gödel machine.
link |
01:20:12.000
So you have to do something about it.
link |
01:20:13.000
But there's the AIXItl model, which is finitely computable, which we could put in. Which part of AIXI is non-computable?
link |
01:20:19.000
The Solomonoff induction part.
link |
01:20:21.000
But there are ways of getting computable approximations of the AIXI model.
link |
01:20:27.000
So then it's at least computable.
link |
01:20:29.000
It is still way beyond any resources anybody will ever have.
link |
01:20:33.000
But then the Gödel machine could sort of improve it further and further in an exact way.
link |
01:20:37.000
So it is theoretically possible that the Gödel machine process could improve it. But isn't AIXI already optimal?
link |
01:20:51.000
It is optimal in terms of the reward collected over its interaction cycles, but it takes infinite time to produce one action.
link |
01:21:03.000
And the world continues whether you want it or not.
link |
01:21:07.000
So assuming the model had an oracle which solved this problem and then, in the next 100 milliseconds or whatever reaction time you need, gives the answer, then AIXI is optimal.
link |
01:21:17.000
It's optimal in the sense of data, also in learning efficiency and data efficiency, but not in terms of computation time.
link |
01:21:26.000
And then the Gödel machine, in theory, but probably not provably, could make it go faster.
link |
01:21:31.000
Those two components are super interesting.
link |
01:21:37.000
The perfect intelligence combined with self improvement.
link |
01:21:44.000
Sort of provable self improvement, in the sense that you're always getting the correct answer and you're improving.
link |
01:21:50.000
Beautiful ideas.
link |
01:21:52.000
You also mentioned that, in the chase of solving this reward, sort of optimizing for the goal,
link |
01:22:03.000
interesting human things could emerge.
link |
01:22:05.000
So is there a place for consciousness within AIXI?
link |
01:22:10.000
Maybe you can comment, because I suppose we humans are just another instantiation of AIXI agents, and we seem to have consciousness.
link |
01:22:21.000
You say humans are an instantiation of an AIXI agent.
link |
01:22:23.000
Yes.
link |
01:22:24.000
That would be amazing, but I think that's not true even for the smartest and most rational humans.
link |
01:22:29.000
I think maybe we are very crude approximations.
link |
01:22:33.000
Interesting.
link |
01:22:34.000
I mean, I tend to believe, again, I'm Russian, so I tend to believe our flaws are part of the optimal.
link |
01:22:41.000
So we tend to laugh off and criticize our flaws and I tend to think that that's actually close to an optimal behavior.
link |
01:22:50.000
Well, some flaws, if you think more carefully about it, are actually not flaws, but I think there are still enough flaws.
link |
01:22:58.000
I don't know.
link |
01:22:59.000
It's unclear.
link |
01:23:00.000
As a student of history, I think all the suffering that we've endured as a civilization, it's possible that that's the optimal amount of suffering we need to endure to minimize long term suffering.
link |
01:23:14.000
That's your Russian background.
link |
01:23:16.000
That's the Russian. Whether or not humans are instantiations of an AIXI agent,
link |
01:23:21.000
do you think consciousness is something that could emerge in a computational form or framework like AIXI?
link |
01:23:29.000
Let me also ask you a question.
link |
01:23:31.000
Do you think I'm conscious?
link |
01:23:33.000
That's a good question.
link |
01:23:38.000
That tie is confusing me, but I think so.
link |
01:23:44.000
You think that makes me unconscious because it strangles me?
link |
01:23:47.000
If an agent were to solve the imitation game posed by Turing, I think it would be dressed similarly to you.
link |
01:23:53.000
Because there's a kind of flamboyant, interesting, complex behavior pattern that sells that you're human and you're conscious.
link |
01:24:04.000
But why do you ask?
link |
01:24:06.000
Was it a yes or was it a no?
link |
01:24:08.000
Yes, I think you're conscious, yes.
link |
01:24:12.000
And you explain somehow why, but you infer that from my behavior.
link |
01:24:18.000
You can never be sure about that.
link |
01:24:20.000
And I think the same thing will happen with any intelligent agent we develop if it behaves in a way sufficiently close to humans.
link |
01:24:31.000
Or maybe if not humans, maybe a dog is also sometimes a little bit self conscious.
link |
01:24:36.000
So if it behaves in a way where we typically attribute consciousness, we would attribute consciousness to these intelligent systems.
link |
01:24:44.000
And AIXI probably in particular. That, of course, doesn't answer the question whether it's really conscious.
link |
01:24:50.000
And that's the big hard problem of consciousness.
link |
01:24:53.000
Maybe I'm a zombie.
link |
01:24:55.000
I mean, not the movie zombie, but the philosophical zombie.
link |
01:24:59.000
Is, to you, the display of consciousness close enough to consciousness, from the perspective of AGI, that the distinction of the hard problem of consciousness is not an interesting one?
link |
01:25:11.000
I think we don't have to worry about the consciousness problem, especially the hard problem, for developing AGI.
link |
01:25:17.000
I think, you know, we progress, and at some point we will have solved all the technical problems, and this system will behave intelligently and then super intelligently.
link |
01:25:26.000
And this consciousness will emerge.
link |
01:25:30.000
I mean, definitely it will display behavior, which we will interpret as conscious.
link |
01:25:35.000
And then it's a philosophical question.
link |
01:25:38.000
Did this consciousness really emerge?
link |
01:25:40.000
Or is it a zombie which just, you know, fakes everything?
link |
01:25:43.000
We still don't have to figure that out.
link |
01:25:45.000
Although it may be interesting, at least from a philosophical point of view, it's very interesting, but it may also be sort of practically interesting.
link |
01:25:53.000
You know, there's some people saying, you know, if it's just faking consciousness and feelings, you know, then we don't need to be concerned about, you know, rights.
link |
01:25:59.000
But if it's real conscious and has feelings, then we need to be concerned.
link |
01:26:06.000
I can't wait till the day where AI systems exhibit consciousness because it'll truly be some of the hardest ethical questions of what we do with that.
link |
01:26:16.000
It is rather easy to build systems to which people ascribe consciousness.
link |
01:26:21.000
And I give you an analogy.
link |
01:26:23.000
I mean, remember, maybe it was before you were born, the Tamagotchi.
link |
01:26:27.000
How dare you, sir.
link |
01:26:31.000
You're young, right?
link |
01:26:33.000
Yes, that's good. Thank you. Thank you very much.
link |
01:26:36.000
But I was also in the Soviet Union. We didn't have any of those fun things.
link |
01:26:41.000
But you have heard about this Tamagotchi, which was, you know, really, really primitive.
link |
01:26:45.000
Actually, for the time it was, you know, you could raise, you know, this.
link |
01:26:49.000
And kids got so attached to it and, you know, didn't want to let it die.
link |
01:26:53.000
And probably if we would have asked, you know, the children,
link |
01:26:57.000
do you think this Tamagotchi is conscious?
link |
01:26:59.000
They would have said yes.
link |
01:27:01.000
I think that's kind of a beautiful thing, actually, because that consciousness, ascribing consciousness seems to create a deeper connection,
link |
01:27:10.000
which is a powerful thing. But we have to be careful on the ethics side of that.
link |
01:27:15.000
Well, let me ask about the AGI community broadly. You kind of represent some of the most serious work on AGI,
link |
01:27:23.000
at least earlier, and DeepMind represents serious work on AGI these days.
link |
01:27:29.000
But why in your sense is the AGI community so small or has been so small until maybe DeepMind came along?
link |
01:27:38.000
Like why aren't more people seriously working on human level and superhuman level intelligence from a formal perspective?
link |
01:27:48.000
Okay, from a formal perspective, that's sort of, you know, an extra point.
link |
01:27:53.000
So I think there are a couple of reasons. I mean, AI came in waves, right?
link |
01:27:56.000
You know, AI winters and AI summers, and then there were big promises which were not fulfilled.
link |
01:28:01.000
And people got disappointed. And that narrow AI, solving particular problems,
link |
01:28:11.000
which seemed to require intelligence, was always to some extent successful and there were improvements, small steps.
link |
01:28:19.000
And if you build something which is, you know, useful for society or industrially useful, then there's a lot of funding.
link |
01:28:26.000
So I guess it was in parts the money, which drives people to develop specific systems, solving specific tasks.
link |
01:28:36.000
But you would think that, you know, at least in university, you should be able to do ivory tower research.
link |
01:28:43.000
And that was probably better a long time ago, but even nowadays, there's quite some pressure of doing applied research or translational research.
link |
01:28:52.000
And, you know, it's harder to get grants as a theorist.
link |
01:28:56.000
So that also drives people away. It's maybe also harder, attacking the general intelligence problem.
link |
01:29:03.000
So I think enough people, I mean, maybe a small number, were still interested in formalizing intelligence and thinking about general intelligence.
link |
01:29:13.000
But, you know, not much came up, right? Or not much great stuff came up.
link |
01:29:19.000
So what do you think? We talked about the formal big light at the end of the tunnel.
link |
01:29:25.000
But from the engineering perspective, what do you think it takes to build an AGI system?
link |
01:29:30.000
I don't know if that's a stupid question, or a distinct question from everything we've been talking about with AIXI.
link |
01:29:37.000
But what do you see as the steps that are necessary to take to start to try to build something?
link |
01:29:43.000
So you want a blueprint now and then you go off and do it?
link |
01:29:46.000
The whole point of this conversation, trying to squeeze that in there.
link |
01:29:49.000
Now, is there, I mean, what's your intuition? Is it in the robotics space or something that has a body and tries to explore the world?
link |
01:29:56.000
Is it in the reinforcement learning space, like the efforts with AlphaZero and AlphaStar, where they're kind of exploring how you can solve it through simulation in the gaming world?
link |
01:30:06.000
Is it in sort of all the transformer work in natural language processing, maybe attacking open domain dialogue?
link |
01:30:16.000
Where do you see the promising pathways?
link |
01:30:19.000
Let me pick the embodiment, maybe.
link |
01:30:24.000
So embodiment is important, yes and no.
link |
01:30:32.000
I don't believe that we need a physical robot walking or rolling around, interacting with the real world in order to achieve AGI.
link |
01:30:44.000
And I think it's more of a distraction, probably, than helpful.
link |
01:30:51.000
It's sort of confusing the body with the mind.
link |
01:30:54.000
For industrial applications or near term applications, of course, we need robots for all kinds of things.
link |
01:31:01.000
But for solving the big problem, at least at this stage, I think it's not necessary.
link |
01:31:08.000
But the answer is also yes, in that I think the most promising approach is that you have an agent,
link |
01:31:15.000
and that can be a virtual agent in a computer interacting with an environment, possibly a 3D simulated environment like in many computer games.
link |
01:31:25.000
And you train and learn the agent.
link |
01:31:29.000
Even if you don't intend to later put this algorithm, sort of, you know, in a robot brain, and you leave it forever in virtual reality,
link |
01:31:38.000
getting experience in a, although just simulated, 3D world is possibly, and I say possibly, important to understand things on a similar level as humans do.
link |
01:31:54.000
Especially if the agent or primarily if the agent wants, needs to interact with the humans, right?
link |
01:32:00.000
You know, if you talk about objects on top of each other in space and flying in cars and so on, and the agent has no experience with even virtual 3D worlds, it's probably hard to grasp.
link |
01:32:12.000
So if we develop an abstract agent, say we take the mathematical path and we just want to build an agent which can prove theorems and becomes a better and better mathematician,
link |
01:32:21.000
then this agent needs to be able to reason in very abstract spaces, and then maybe sort of putting it into a 3D environment, a simulated world, is even harmful.
link |
01:32:30.000
You should sort of put it in, I don't know, an environment which it creates itself, or so.
link |
01:32:36.000
It seems like you have an interesting, rich complex trajectory through life in terms of your journey of ideas.
link |
01:32:42.000
So it's interesting to ask what books, technical fiction, philosophical books, ideas, people had a transformative effect.
link |
01:32:52.000
Books are most interesting because maybe people could also read those books and see if they could be inspired as well.
link |
01:32:59.000
Yeah, luckily you asked about books and not a singular book.
link |
01:33:03.000
It's very hard; I tried to pin down one book, and I can't do that, in the end.
link |
01:33:10.000
So, the books which were most transformative for me, or which I can most highly recommend to people interested in AI...
link |
01:33:22.000
Both perhaps.
link |
01:33:23.000
Yeah, yeah, both.
link |
01:33:25.000
I would always start with Russell and Norvig, Artificial Intelligence: A Modern Approach.
link |
01:33:31.000
That's the AI Bible.
link |
01:33:33.000
It's an amazing book.
link |
01:33:35.000
It's very broad and covers all approaches to AI and even if you focus on one approach, I think that is the minimum you should know about the other approaches out there.
link |
01:33:44.000
So that should be your first book.
link |
01:33:46.000
Fourth edition should be coming out soon.
link |
01:33:48.000
Oh, okay, interesting.
link |
01:33:50.000
There's a deep learning chapter now so there must be.
link |
01:33:53.000
Written by Ian Goodfellow.
link |
01:33:55.000
Okay.
link |
01:33:56.000
And then the next book I would recommend is the reinforcement learning book by Sutton and Barto.
link |
01:34:02.000
That's a beautiful book.
link |
01:34:05.000
If there's any problem with the book, it makes RL feel and look much easier than it actually is.
link |
01:34:13.000
It's a very gentle book.
link |
01:34:15.000
It's very nice to read, with the exercises.
link |
01:34:17.000
You can very quickly, you know, get some RL systems to run, you know, on very toy problems, and it's a lot of fun.
link |
01:34:23.000
And in a couple of days, you feel, you know, you know what RL is about, but it's much harder than the book makes it seem.
link |
01:34:33.000
Come on now.
link |
01:34:34.000
It's an awesome book.
link |
01:34:35.000
Yeah, no, it is.
link |
01:34:36.000
Yeah.
link |
01:34:37.000
And maybe, I mean, there's so many books out there.
link |
01:34:41.000
If you like the information theoretic approach, then there's the Kolmogorov complexity book by Li and Vitányi, but probably, you know, some short article is enough.
link |
01:34:50.000
You don't need to read the whole book, but it's a great book.
link |
01:34:54.000
And if I have to mention one all time favorite book, it's of a different flavor.
link |
01:35:01.000
That's a book which is used in the international baccalaureate for high school students in several countries.
link |
01:35:09.000
That's by Nicholas Alchin, Theory of Knowledge, second edition, or first, but not the third, please.
link |
01:35:16.000
In the third one, they took out all the fun.
link |
01:35:19.000
Okay.
link |
01:35:20.000
So it asks all the interesting, or to me interesting, philosophical questions about how we acquire knowledge, from all perspectives, you know, from math, from art, from physics, and asks: how can we know anything?
link |
01:35:36.000
And the book is called Theory of Knowledge.
link |
01:35:38.000
Which is almost like a philosophical exploration of how we get knowledge about anything.
link |
01:35:43.000
Yes, yeah.
link |
01:35:44.000
I mean, can religion tell us, you know, something about the world?
link |
01:35:46.000
Can science tell us something about the world?
link |
01:35:48.000
Can mathematics, or is it just playing with symbols?
link |
01:35:52.000
And, you know, these are open-ended questions.
link |
01:35:54.000
And I mean, it's for high school students, so they have references from The Hitchhiker's Guide to the Galaxy, and from Star Wars, and the chicken crossing the road, yeah.
link |
01:36:02.000
And it's fun to read, but it's also quite deep.
link |
01:36:07.000
If you could live one day of your life over again, because it made you truly happy, or maybe, like we said with the books, it was truly transformative, what day, what moment would you choose? Does something pop into your mind?
link |
01:36:21.000
Does it need to be a day in the past, or can it be a day in the future?
link |
01:36:25.000
Well, spacetime is an emergent phenomenon, so it's all the same anyway.
link |
01:36:30.000
Okay.
link |
01:36:31.000
Okay, from the past.
link |
01:36:33.000
You're really going to say from the future? I love it.
link |
01:36:36.000
No, I will tell you from the future.
link |
01:36:38.000
Okay, from the past.
link |
01:36:39.000
So from the past, I would say, when I discovered my AIXI model. I mean, it was not in one day, but there was one moment where I realized the idea of Kolmogorov complexity, and I didn't even know that it existed.
link |
01:36:53.000
I discovered sort of this compression idea myself, and immediately I knew I can't be the first one, but I had this idea.
link |
01:37:00.000
And then I knew about sequential decision theory, and I knew if I put it together, this is the right thing.
link |
01:37:06.000
And yeah, still when I think back about this moment, I'm super excited about it.
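For readers curious what "putting the compression idea together with sequential decision theory" looks like once formalized, the resulting AIXI action rule can be written roughly as below. The notation is a simplified paraphrase of Hutter's published definition, not something stated in this conversation:

```latex
% Rough rendering of the AIXI action rule (planning horizon m, current cycle k).
% U is a universal Turing machine, q ranges over programs interpreted as
% candidate environments, \ell(q) is the length of program q, and a, o, r
% denote actions, observations, and rewards.
a_k \;=\; \arg\max_{a_k} \sum_{o_k r_k} \;\cdots\; \max_{a_m} \sum_{o_m r_m}
\bigl[\, r_k + \cdots + r_m \,\bigr]
\sum_{q \,:\, U(q,\, a_{1:m}) \,=\, o_{1:m} r_{1:m}} 2^{-\ell(q)}
```

Shorter, more compressible programs q that explain the interaction history get exponentially more weight, which is exactly where the compression idea meets the expectimax planning of sequential decision theory.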
link |
01:37:12.000
Was there any more details in context at that moment?
link |
01:37:16.000
Did an apple fall on your head?
link |
01:37:18.000
If you look at Ian Goodfellow talking about GANs, there was beer involved.
link |
01:37:25.000
Is there some more context of what sparked your thought or was it just?
link |
01:37:31.000
No, it was much more mundane.
link |
01:37:33.000
So I worked in this company.
link |
01:37:34.000
So in this sense, the four and a half years were not completely wasted.
link |
01:37:38.000
So and I worked on an image interpolation problem.
link |
01:37:43.000
And I developed some quite neat new interpolation techniques, and they got patented.
link |
01:37:49.000
And then, you know, as happens quite often, I went sort of overboard and thought, you know, yeah, that's pretty good.
link |
01:37:55.000
But it's not the best.
link |
01:37:56.000
So what is the best possible way of doing interpolation?
link |
01:37:59.000
And then I thought, yeah, you want the simplest picture which, if you coarsen it, recovers your original picture.
link |
01:38:06.000
And then I thought about the simplicity concept more in quantitative terms.
link |
01:38:11.000
And yeah, then everything developed.
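To make the "simplest picture that coarsens back to the original" idea concrete, here is a minimal sketch, not Hutter's actual patented interpolation method: among candidate upscalings that reproduce the original when coarsened again, it prefers the "simplest" one, using compressed size as a crude, computable stand-in for Kolmogorov complexity. All function names are illustrative.

```python
# Minimal sketch: pick the "simplest" 2x upscaling that is still consistent
# with the original image under coarsening. Compressed size is used as a
# crude, computable proxy for Kolmogorov complexity.
import zlib
import numpy as np

def coarsen(img):
    """Downsample by 2x via block averaging over 2x2 blocks."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def complexity(img):
    """Crude complexity proxy: length of the zlib-compressed pixel bytes."""
    return len(zlib.compress(np.clip(img, 0, 255).astype(np.uint8).tobytes(), 9))

def upscale_nearest(img):
    """2x upscaling by pixel repetition; coarsening it recovers img exactly."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1).astype(float)

def upscale_smooth(img):
    """2x upscaling with a small blur, then corrected per 2x2 block so that
    coarsening still recovers the original exactly."""
    up = upscale_nearest(img)
    blurred = (up + np.roll(up, 1, axis=0) + np.roll(up, 1, axis=1)
               + np.roll(np.roll(up, 1, axis=0), 1, axis=1)) / 4.0
    correction = np.repeat(np.repeat(img - coarsen(blurred), 2, axis=0), 2, axis=1)
    return blurred + correction

original = np.random.randint(0, 256, size=(8, 8)).astype(float)
candidates = [upscale_nearest(original), upscale_smooth(original)]

# Keep only candidates consistent with the original under coarsening,
# then pick the one with the smallest compressed description.
consistent = [c for c in candidates if np.allclose(coarsen(c), original)]
best = min(consistent, key=complexity)
print("chose candidate with compressed size", complexity(best))
```

The consistency check plays the role of "recovering the original picture," while compressed size stands in for simplicity; a real interpolation system would of course search a much richer space of candidates.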
link |
01:38:14.000
And somehow the full beautiful mix of also being a physicist and thinking about the big picture of it then led you, probably, to AIXI.
link |
01:38:23.000
Yeah.
link |
01:38:24.000
So as a physicist, I was probably trained not to always think in computational terms.
link |
01:38:28.000
You know, just ignore that and think about the fundamental properties which you want to have.
link |
01:38:33.000
So what about if you could relive one day in the future?
link |
01:38:36.000
What would that be?
link |
01:38:39.000
When I solve the AGI problem.
link |
01:38:43.000
In practice.
link |
01:38:44.000
In practice.
link |
01:38:45.000
So in theory, I have solved it with the AIXI model, but in practice.
link |
01:38:48.000
And then I asked the first question.
link |
01:38:50.000
What would be the first question?
link |
01:38:53.000
What's the meaning of life?
link |
01:38:55.000
I don't think there's a better way to end it.
link |
01:38:58.000
Thank you so much for talking today.
link |
01:38:59.000
It's a huge honor to finally meet you.
link |
01:39:01.000
Yeah.
link |
01:39:02.000
Thank you too.
link |
01:39:03.000
It was a pleasure of mine, too.
link |
01:39:04.000
Thanks for listening to this conversation with Marcus Hutter.
link |
01:39:07.000
And thank you to our presenting sponsor, Cash App.
link |
01:39:10.000
Download it.
link |
01:39:11.000
Use code LEX Podcast.
link |
01:39:12.000
You'll get $10 and $10 will go to FIRST, an organization that inspires and educates young minds to become science and technology innovators of tomorrow.
link |
01:39:22.000
If you enjoy this podcast, subscribe on YouTube.
link |
01:39:25.000
Give it five stars on Apple Podcast.
link |
01:39:27.000
Support it on Patreon or simply connect with me on Twitter at Lex Fridman.
link |
01:39:33.000
And now let me leave you with some words of wisdom from Albert Einstein.
link |
01:39:38.000
The measure of intelligence is the ability to change.
link |
01:39:43.000
Thank you for listening and hope to see you next time.