
Bjarne Stroustrup: C++ | Lex Fridman Podcast #48



link |
00:00:00.000
The following is a conversation with Bjarne Stroustrup.
link |
00:00:03.120
He is the creator of C++, a programming language that, after 40 years, is still one of the most
link |
00:00:09.200
popular and powerful languages in the world. Its focus on fast, stable, robust code underlies many
link |
00:00:16.240
of the biggest systems in the world that we have come to rely on as a society. If you're watching
link |
00:00:21.360
this on YouTube, for example, many of the critical back end components of YouTube are written in C++.
link |
00:00:27.280
The same goes for Google, Facebook, Amazon, Twitter, most Microsoft applications, Adobe applications,
link |
00:00:34.320
most database systems, and most physical systems that operate in the real world, like cars, robots,
link |
00:00:41.440
rockets that launch us into space and one day will land us on Mars.
link |
00:00:46.480
C++ also happens to be the language that I use more than any other in my life. I've written
link |
00:00:53.200
several hundred thousand lines of C++ source code. Of course, lines of source code don't mean much,
link |
00:00:59.520
but they do give hints of my personal journey through the world of software.
link |
00:01:04.320
I've enjoyed watching the development of C++ as a programming language,
link |
00:01:08.320
leading up to the big update in the standard in 2011 and those that followed in 14, 17,
link |
00:01:15.360
and toward the new C++20 standard hopefully coming out next year.
link |
00:01:20.320
This is the Artificial Intelligence podcast.
link |
00:01:23.920
If you enjoy it, subscribe on YouTube, give it five stars on iTunes, support it on Patreon,
link |
00:01:29.280
or simply connect with me on Twitter at Lex Fridman, spelled F R I D M A N.
link |
00:01:34.960
And now, here's my conversation with Bjarne Stroustrup.
link |
00:01:40.240
What was the first program you've ever written? Do you remember?
link |
00:01:44.000
BJARNE It was my second year in university, first year of computer science, and it was in Algol 60.
link |
00:01:53.520
I calculated the shape of a super ellipse and then connected points on the perimeter,
link |
00:02:02.400
creating star patterns. It was with wet ink on a paper printer.
link |
00:02:10.400
LEX And that was in college, university?
link |
00:02:13.200
BJARNE Yeah, yeah. I learned to program the second year in university.
link |
00:02:17.120
LEX And what was the first programming language,
link |
00:02:21.360
if I may ask it this way, that you fell in love with?
link |
00:02:24.960
BJARNE I think Algol 60. And after that, I remember
link |
00:02:34.960
SNOBOL. I remember Fortran, didn't fall in love with that. I remember Pascal,
link |
00:02:41.200
didn't fall in love with that. It all got in the way of me. And then I discovered Assembler,
link |
00:02:48.000
and that was much more fun. And from there, I went to microcode.
link |
00:02:54.000
LEX So you were drawn to the, you found the low level stuff beautiful.
link |
00:03:00.640
BJARNE I went through a lot of languages, and then I spent significant time in Assembler and
link |
00:03:08.080
microcode. That was sort of the first really profitable thing, which paid for my master's,
link |
00:03:14.160
actually. And then I discovered Simula, which was absolutely great.
link |
00:03:18.480
LEX Simula?
link |
00:03:20.080
BJARNE Simula was an extension of Algol 60,
link |
00:03:25.840
done primarily for simulation. But basically, they invented object oriented programming and
link |
00:03:31.280
inheritance and runtime polymorphism while they were doing it. And that was the language that
link |
00:03:40.960
taught me that you could have the sort of the problems of a program grow with the size of the
link |
00:03:48.320
program rather than with the square of the size of the program. That is, you can actually modularize
link |
00:03:55.440
very nicely. And that was a surprise to me. It was also a surprise to me that a stricter type
link |
00:04:04.880
system than Pascal's was helpful, whereas Pascal's type system got in my way all the time. So you
link |
00:04:13.440
need a strong type system to organize your code well, but it has to be extensible and flexible.
link |
00:04:20.800
LEX Let's get into the details a little bit. If you remember, what kind of type system did
link |
00:04:25.680
Pascal have? What type system, typing system did Algol 60 have?
link |
00:04:30.320
BJARNE Basically, Pascal was sort of the simplest language that Niklaus Wirth could
link |
00:04:37.760
define that served the needs of Niklaus Wirth at the time. And it has a sort of a highly moral
link |
00:04:47.200
tone to it. That is, if you can say it in Pascal, it's good. And if you can't, it's not so good.
link |
00:04:54.560
Whereas Simula allowed you basically to build your own type system. So instead of trying to fit
link |
00:05:05.280
yourself into Niklaus Wirth's world, Kristen Nygaard's and Ole-Johan Dahl's language
link |
00:05:13.920
allowed you to build your own. So it's sort of close to the original idea of you build a domain
link |
00:05:22.640
specific language. As a matter of fact, what you build is a set of types and relations among types
link |
00:05:31.200
that allows you to express something that's suitable for an application.
link |
00:05:35.440
LEX So when you say types,
link |
00:05:38.160
stuff you're saying has echoes of object oriented programming.
link |
00:05:41.600
BJARNE Yes, they invented it. Every language
link |
00:05:44.800
that uses the word class for type is a descendant of Simula, directly or indirectly. Kristen Nygaard
link |
00:05:55.840
and Ole-Johan Dahl were mathematicians and they didn't think in terms of types, but they understood
link |
00:06:04.720
sets and classes of elements. And so they called their types classes. And basically in C++,
link |
00:06:13.840
as in Simula, a class is a user-defined type.
link |
00:06:17.040
LEX So can you try the impossible task and give
link |
00:06:21.120
a brief history of programming languages from your perspective? So we started with Algol 60,
link |
00:06:28.240
Simula, Pascal, but that's just the 60s and 70s. BJARNE I can try. The most sort of interesting and
link |
00:06:39.440
major improvement of programming languages was Fortran, the first Fortran. Because before that,
link |
00:06:47.040
all code was written for a specific machine and each specific machine had a language,
link |
00:06:52.400
an assembler or a cross assembler or some extension of that idea. But you're writing for
link |
00:07:02.000
a specific machine in the language of that machine. And Backus and his team at IBM built a language
link |
00:07:12.880
that would allow you to write what you really wanted. That is, you could write it in a language
link |
00:07:21.440
that was natural for people. Now, these people happen to be engineers and physicists. So the
link |
00:07:26.880
language that came out was somewhat unusual for the rest of the world. But basically they said
link |
00:07:31.680
formula translation because they wanted to have the mathematical formulas translated into the
link |
00:07:37.200
machine. And as a side effect, they got portability because now they're writing in the terms that the
link |
00:07:47.120
humans used and the way humans thought. And then they had a program that translated it into the
link |
00:07:54.000
machine's needs. And that was new and that was great. And it's something to remember. We want to
link |
00:08:02.240
raise the language to the human level, but we don't want to lose the efficiency.
link |
00:08:09.200
And that was the first step towards the human. That was the first step. And of course,
link |
00:08:15.200
they were a very particular kind of humans. Business people were different, so they got
link |
00:08:20.320
COBOL instead, et cetera, et cetera. And Simula came out. No, let's not go to Simula yet. Let's
link |
00:08:28.400
go to Algol. Fortran didn't have, at the time, a precise notion of type,
link |
00:08:38.640
not a precise notion of scope, not a set of translation phases that was what we have today,
link |
00:08:49.440
lexical, syntax, semantics. It was sort of a bit of a muddle in the early days, but
link |
00:08:56.560
hey, they've just done the biggest breakthrough in the history of programming, right?
link |
00:09:01.360
So you can't criticize them for not having gotten all the technical details right.
link |
00:09:06.320
So we got Algol. That was very pretty. And most people in commerce and science considered it
link |
00:09:15.040
useless because it was not flexible enough, and it wasn't efficient enough, and et cetera,
link |
00:09:20.880
et cetera. But that was a breakthrough from a technical point of view.
link |
00:09:26.800
Then Simula came along to make that idea more flexible, and you could define your own types.
link |
00:09:34.400
And that's where I got very interested. Kristen Nygaard was the main idea man behind Simula.
link |
00:09:43.920
That was late 60s.
link |
00:09:45.200
This was late 60s. Well, he was a visiting professor in Aarhus, and so I learned object
link |
00:09:52.000
oriented programming by sitting around and, well, in theory, discussing with Kristen Nygaard. But
link |
00:10:04.400
Kristen, once he gets started and is in full flow, it's very hard to get a word in edgeways.
link |
00:10:09.680
Where you just listen.
link |
00:10:10.560
So it was great. I learned it from there.
link |
00:10:14.160
Not to romanticize the notion, but it seems like a big leap to think about
link |
00:10:18.720
object oriented programming. It's really a leap of abstraction.
link |
00:10:24.960
Yes.
link |
00:10:25.760
And was that as big and beautiful of a leap as it seems from now in retrospect,
link |
00:10:34.720
or was it an obvious one at the time?
link |
00:10:38.080
It was not obvious, and many people have tried to do something like that,
link |
00:10:44.480
and most people didn't come up with something as wonderful as Simula.
link |
00:10:49.920
Lots of people got their PhDs and made their careers out of forgetting about Simula or never
link |
00:10:56.720
knowing it. For me, the key idea was basically I could get my own types. And that's the idea that
link |
00:11:05.600
goes further into C++, where I can get better types and more flexible types and more efficient
link |
00:11:12.640
types. But it's still the fundamental idea. When I want to write a program, I want to write it with
link |
00:11:17.760
my types that are appropriate to my problem and under the constraints that I'm under with hardware,
link |
00:11:27.200
software, environment, et cetera. And that's the key idea.
link |
00:11:32.720
People picked up on the class hierarchies and the virtual functions and the inheritance,
link |
00:11:39.120
and that was only part of it. It was an interesting and major part and still a major
link |
00:11:47.360
part in a lot of graphics stuff, but it was not the most fundamental. It was when you wanted to
link |
00:11:55.440
relate one type to another. You don't want them all to be independent. The classical example is
link |
00:12:01.680
that you don't actually want to write a city simulation with vehicles where you say, well,
link |
00:12:11.120
if it's a bicycle, write the code for turning a bicycle to the left. If it's a normal car,
link |
00:12:17.440
turn right the normal car way. If it's a fire engine, turn right the fire engine way. You get
link |
00:12:23.440
these big case statements and bunches of if statements and such. Instead, you tell the base
link |
00:12:32.080
class, that's the vehicle, saying: turn left the way you want to. And this is actually a real
link |
00:12:41.520
example. They used it to simulate and optimize the emergency services for somewhere in Norway back in
link |
00:12:53.920
the 60s. So this was one of the early examples for why you needed inheritance and you needed
link |
00:13:01.680
a runtime polymorphism because you wanted to handle this set of vehicles in a manageable way.
link |
00:13:13.920
You can't just rewrite your code each time a new kind of vehicle comes along.
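
The vehicle example above can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the class names (`Vehicle`, `Bicycle`, `Car`, `FireEngine`), the function `turn_all_left`, and the returned strings are hypothetical and are not taken from the Simula simulation mentioned in the conversation. The point it tries to show is that the simulation loop calls `turn_left()` through the base class and never switches on the concrete kind of vehicle.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical base class: every vehicle knows how to turn left in its own way.
class Vehicle {
public:
    virtual ~Vehicle() = default;
    virtual std::string turn_left() const = 0;
};

class Bicycle : public Vehicle {
public:
    std::string turn_left() const override { return "bicycle leans and turns left"; }
};

class Car : public Vehicle {
public:
    std::string turn_left() const override { return "car signals and turns left"; }
};

class FireEngine : public Vehicle {
public:
    std::string turn_left() const override { return "fire engine sounds siren and turns left"; }
};

// The simulation code never asks which kind of vehicle it holds; the
// virtual call selects the right behavior at run time.
std::vector<std::string> turn_all_left(const std::vector<std::unique_ptr<Vehicle>>& fleet) {
    std::vector<std::string> log;
    for (const auto& v : fleet) {
        assert(v != nullptr);  // assumed precondition: no empty slots in the fleet
        log.push_back(v->turn_left());
    }
    return log;
}
```

Adding a new kind of vehicle in this sketch means adding one derived class; `turn_all_left` stays unchanged, which is the property the conversation attributes to inheritance and runtime polymorphism.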
link |
00:13:19.840
Yeah, that's a beautiful, powerful idea. And of course it stretches through your work
link |
00:13:24.720
with C++ as we'll talk about. But I think you've structured it nicely. What other breakthroughs
link |
00:13:33.120
came along in the history of programming languages if we were to tell the history in that way?
link |
00:13:39.440
Obviously, I'm better at telling the part of the history that is the path I'm on as opposed to all
link |
00:13:46.000
the paths. Yeah, you skipped the hippie John McCarthy and Lisp, one of my favorite languages.
link |
00:13:51.440
Functional.
link |
00:13:53.440
But Lisp is not one of my favorite languages. It's obviously important. It's obviously interesting.
link |
00:13:59.760
Lots of people write code in it and then they rewrite it into C or C++ when they want to go
link |
00:14:05.680
to production. In the world I'm in, which is constrained by performance, reliability
link |
00:14:15.600
issues, deployability, cost of hardware. I don't like things to be too dynamic.
link |
00:14:24.560
It is really hard to write a piece of code that's perfectly flexible that you can also deploy on a
link |
00:14:32.560
small computer and that you can also put in, say, a telephone switch in Bogota. What's the chance,
link |
00:14:40.000
if you get an error and you find yourself in the debugger, that the telephone switch in Bogota on
link |
00:14:45.520
late Sunday night has a programmer around? The chance is zero. A lot of the things I think most
link |
00:14:53.600
about can't afford that flexibility. I'm quite aware that maybe 70%, 80% of all code is not
link |
00:15:07.120
under the kind of constraints I'm interested in. But somebody has to do the job I'm doing
link |
00:15:14.640
because you have to get from these high level flexible languages to the hardware.
link |
00:15:20.560
The stuff that lasts for 10, 20, 30 years is robust, operates under very constrained conditions.
link |
00:15:27.120
Yes, absolutely. That's right. And it's fascinating and beautiful in its own way.
link |
00:15:31.920
C++ is one of my favorite languages, and so is Lisp. So I can embody two for different reasons
link |
00:15:40.880
as a programmer. I understand why Lisp is popular, and I can see the beauty of the ideas
link |
00:15:50.480
and similarly with Smalltalk. It's just not as relevant in my world. And by the way, I distinguish
link |
00:16:04.400
between those and the functional languages where I go to things like ML and Haskell. Different kind
link |
00:16:12.560
of languages, they have a different kind of beauty and they're very interesting. And I think
link |
00:16:19.040
that's interesting. And I actually try to learn from all the languages I encounter to see what is
link |
00:16:27.680
there that would make working on the kind of problems I'm interested in with the kind of
link |
00:16:34.880
constraints that I'm interested in, what can actually be done better? Because we can surely
link |
00:16:42.880
do better than we do today. You've said that it's good for any professional programmer to know at
link |
00:16:50.400
least five languages as speaking about a variety of languages that you've taken inspiration from,
link |
00:16:57.920
and you've listed yours as being, at least at the time, C++, obviously, Java, Python, Ruby,
link |
00:17:06.320
JavaScript. Can you, first of all, update that list, modify it? You don't have to be constrained
link |
00:17:15.200
to just five, but can you describe what you picked up also from each of these languages?
link |
00:17:21.680
How do you see them as inspirations for you when you're working with C++?
link |
00:17:25.840
This is a very hard question to answer. So about languages, you should know
link |
00:17:32.880
languages. I reckon I knew about 25 or thereabouts when I did C++. It was easier in those days
link |
00:17:42.000
because the languages were smaller, and you didn't have to learn a whole programming environment and
link |
00:17:48.640
such to do it. You could learn the language quite easily. And it's good to learn so many languages.
link |
00:17:55.280
I imagine, just like with natural language for communication, there's different
link |
00:18:03.120
paradigms that emerge in all of them, that there's commonalities and so on.
link |
00:18:08.800
So I picked five out of a hat. You picked five out of a hat.
link |
00:18:12.640
Obviously. The important thing is that the number is not one.
link |
00:18:17.680
That's right.
link |
00:18:19.120
It's like, I don't like, I mean, if you're a monoglot, you are likely to think that your
link |
00:18:24.240
own culture is the only one, superior to everybody else's. A good learning of a foreign language and
link |
00:18:30.080
a foreign culture is important. It helps you think and be a better person. With programming
link |
00:18:36.160
languages, you become a better programmer, better designer with the second language.
link |
00:18:41.680
Now, once you've got two, the way to five is not that long. It's the second one that's
link |
00:18:48.640
most important. And then when I had to pick five, I was sort of thinking what kinds of languages are
link |
00:18:58.880
there? Well, there's the really low level stuff. It's good. It's actually good to know machine code.
link |
00:19:04.960
Even still?
link |
00:19:05.920
Even today. The C++ optimizers write better machine code than I do.
link |
00:19:13.200
Yes. But I don't think I could appreciate them if I actually didn't understand machine code and
link |
00:19:20.640
machine architecture. At least in my position, I have to understand a bit of it because you mess
link |
00:19:28.160
up the cache and you're off in performance by a factor of 100. It shouldn't be that if you are
link |
00:19:35.280
interested in either performance or the size of the computer you have to deploy. So I would
link |
00:19:42.640
go with assembler. I used to mention C, but these days going low level is not actually what gives
link |
00:19:51.280
you the performance. It is to express your ideas so cleanly that you can think about it and the
link |
00:19:57.840
optimizer can understand what you're up to. My favorite way of optimizing these days is to throw
link |
00:20:04.240
out the clever bits and see if it still runs fast. And sometimes it runs faster. So I need the
link |
00:20:13.360
abstraction mechanisms of something like C++ to write compact high performance code. There was a
link |
00:20:20.960
beautiful keynote by Jason Turner at the CppCon a couple of years ago where he decided he was going
link |
00:20:27.680
to program Pong on Motorola 6800, I think it was. And he says, well, this is relevant because it
link |
00:20:40.320
looks like a microcontroller. It has specialized hardware. It has not very much memory and it's
link |
00:20:46.400
relatively slow. And so he shows in real time how he writes Pong starting with fairly straightforward
link |
00:20:56.000
low level stuff, improving his abstractions and what he's doing. He's writing C++ and it translates
link |
00:21:06.560
into x86 assembler, which you can do with Clang and you can see it in real time. It's
link |
00:21:14.640
the compiler explorer, which you can use on the web. And then he wrote a little program
link |
00:21:19.360
that translated x86 assembler into Motorola assembler. And so he types and you can see this
link |
00:21:27.840
thing in real time. Wow. You can see it in real time. And even if you can't read the assembly code,
link |
00:21:33.840
you can just see it. His code gets better. The code, the assembler gets smaller.
link |
00:21:39.600
He increases the abstraction level, uses C++11, as it were, better.
link |
00:21:45.440
The code gets cleaner. It gets easier to maintain. The code shrinks and it keeps shrinking. And
link |
00:21:55.120
I could not in any reasonable amount of time write that assembler as good as the compiler
link |
00:22:02.240
generated from really quite nice modern C++. And I'll go as far as to say the thing that looked
link |
00:22:09.360
like C was significantly uglier and larger when it became machine code.
link |
00:22:22.160
So the abstractions that can be optimized are important.
link |
00:22:27.280
I would love to see that kind of visualization in larger code bases.
link |
00:22:30.880
Yeah. That might be beautiful.
link |
00:22:31.920
But you can't show a larger code base in a one hour talk and have it fit on screen.
link |
00:22:38.000
Right. So that's C and C++.
link |
00:22:40.240
So my two languages would be machine code and C++. And then I think you can learn a
link |
00:22:47.120
lot from the functional languages. So pick Haskell or ML. I don't care which. I think actually
link |
00:22:54.800
you learn the same lessons of expressing especially mathematical notions really clearly
link |
00:23:03.200
and having a type system that's really strict. And then you should probably have a language for sort
link |
00:23:11.520
of quickly churning out something. You could pick JavaScript. You could pick Python. You could pick
link |
00:23:19.280
Ruby. What do you make of JavaScript in general? So you're talking in the platonic sense about
link |
00:23:26.800
languages, about what they're good at, what their philosophy of design is. But there's also a large
link |
00:23:32.880
user base behind each of these languages and they use it in the way sometimes maybe it wasn't
link |
00:23:38.400
really designed for. That's right. JavaScript is used way beyond probably what it was designed for.
link |
00:23:44.240
Let me say it this way. When you build a tool, you do not know how it's going to be used.
link |
00:23:49.440
You try to improve the tool by looking at how it's being used and when people cut their fingers
link |
00:23:55.200
off and try and stop that from happening. But really you have no control over how something
link |
00:24:01.360
is used. So I'm very happy and proud of some of the things C++ is being used for and some of the
link |
00:24:07.840
things I wish people wouldn't do. Bitcoin mining being my favorite example uses as much energy as
link |
00:24:15.120
Switzerland and mostly serves criminals. But back to the languages, I actually think that having
link |
00:24:25.440
JavaScript run in the browser was an enabling thing for a lot of things. Yes, you could have
link |
00:24:33.440
done it better, but people were trying to do it better and they were using more principled
link |
00:24:41.520
language designs, but they just couldn't do it right. And the nonprofessional programmers that
link |
00:24:49.280
write lots of that code just couldn't understand them. So it did an amazing job for what it was.
link |
00:24:58.640
It's not the prettiest language and I don't think it ever will be the prettiest language, but
link |
00:25:05.200
let's not be bigots here. So what was the origin story of C++?
link |
00:25:10.400
Yeah, you basically gave a few perspectives of your inspiration of object oriented programming.
link |
00:25:19.280
That is, you had a connection with C, and performance, efficiency was an important
link |
00:25:24.400
thing you were drawn to. Efficiency and reliability. Reliability. You have to get both.
link |
00:25:30.720
What's reliability? I really want my telephone calls to get through and I want the quality
link |
00:25:38.640
of what I am talking, coming out at the other end. The other end might be in London or wherever.
link |
00:25:48.400
And you don't want the system to be crashing. If you're doing a bank, you mustn't crash. It might
link |
00:25:55.760
be your bank account that is in trouble. There's different constraints like in games, it doesn't
link |
00:26:02.640
matter too much if there's a crash, nobody dies and nobody gets ruined.
link |
00:26:06.560
But I am interested in the combination of performance, partly because of sort of speed
link |
00:26:14.560
of things being done, partly being able to do things that are necessary to have reliability
link |
00:26:24.320
of larger systems. If you spend all your time interpreting a simple function call,
link |
00:26:32.560
you are not going to have enough time to do proper signal processing to get
link |
00:26:39.280
the telephone calls to sound right. Either that or you have to have ten times as many computers
link |
00:26:46.000
and you can't afford your phone anymore. It's a ridiculous idea in the modern world because
link |
00:26:51.760
we have solved all of those problems. I mean, they keep popping up in different ways because
link |
00:26:58.160
we tackle bigger and bigger problems. So efficiency remains always an important aspect.
link |
00:27:03.120
But you have to think about efficiency, not just as speed, but as an enabler to
link |
00:27:09.680
important things. And one of the things it enables is reliability, is dependability.
link |
00:27:18.720
When I press the pedal, the brake pedal of a car, it is not actually connected directly
link |
00:27:24.800
to anything but a computer. That computer better work.
link |
00:27:31.680
Let's talk about reliability just a little bit. So modern cars have ECUs, have millions of lines
link |
00:27:39.520
of code today. So this is certainly especially true of autonomous vehicles where some of the
link |
00:27:46.560
aspects of the control or driver assistance systems that steer the car, that keep it in the
link |
00:27:50.720
lane and so on. So how do you think, you know, I talked to regulators, people in government
link |
00:27:56.800
who are very nervous about testing the safety of these systems of software. Ultimately software
link |
00:28:03.360
that makes decisions that could lead to fatalities. So how do we test software systems like these?
link |
00:28:11.920
First of all, safety, like performance and like security is the system's property.
link |
00:28:21.840
People tend to look at one part of a system at a time and saying something like, this is secure.
link |
00:28:28.800
That's all right. I don't need to do that. Yeah, that piece of code is secure. I'll buy
link |
00:28:34.480
your operator. If you want to have reliability, if you want to have performance, if you want to
link |
00:28:41.600
have security, you have to look at the whole system. I did not expect you to say that,
link |
00:28:46.640
but that's very true. Yes, I'm dealing with one part of the system and I want my part to be really
link |
00:28:52.480
good, but I know it's not the whole system. Furthermore, making an individual part perfect
link |
00:29:00.320
may actually not be the best way of getting the highest degree of reliability and performance and
link |
00:29:05.680
such. There's people that say C++ is not type safe. You can break it. Sure. I can break anything
link |
00:29:14.080
that runs on a computer. I may not go through your type system. If I wanted to break into your
link |
00:29:20.800
computer, I'll probably try SQL injection. And it's very true. If you think about
link |
00:29:26.400
safety or even reliability at the system level, especially when a human being is involved,
link |
00:29:34.080
it starts becoming hopeless pretty quickly in terms of proving that something is
link |
00:29:43.840
safe to a certain level. Yeah. Because there's so many variables. It's so complex. Well, let's get
link |
00:29:48.800
back to something we can talk about and actually talk about it. Yeah.
link |
00:29:54.000
Talk about and actually make some progress on. Yes. We can look at C++ programs and we can
link |
00:30:01.680
try and make sure they crash less often. The way you do that is largely by simplification.
link |
00:30:14.640
The first step is to simplify the code, have less code, have code that is less likely to go wrong.
link |
00:30:21.440
It's not by runtime testing everything. It is not by big test frameworks that you are using.
link |
00:30:28.960
Yes, we do that also. But the first step is actually to make sure that when you want to
link |
00:30:35.600
express something, you can express it directly in code rather than going through endless loops
link |
00:30:43.120
and convolutions in your head before it gets down to code. The way you are thinking about
link |
00:30:51.360
a problem is not in the code. There is a missing piece that's just in your head. And the code,
link |
00:30:59.360
you can see what it does, but you cannot see what you thought about it unless you have expressed
link |
00:31:05.920
things directly. When you express things directly, you can maintain it. It's easier to find errors.
link |
00:31:13.120
It's easier to make modifications. It's actually easier to test it. And lo and behold, it runs
link |
00:31:19.280
faster. And therefore, you can use a smaller number of computers, which means there's less
link |
00:31:26.720
hardware that could possibly break. So I think the key here is simplification.
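
As a small illustration of "expressing things directly," consider finding the largest sensor reading. This sketch is an assumption added for clarity, not an example from the conversation; the function names `largest_loop` and `largest` and the idea of "readings" are hypothetical. The first version spells out the mechanics by hand, so the reader has to reconstruct the intent. The second names the intent with a standard algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Indirect: the intent ("find the largest reading") must be inferred
// from the index arithmetic and the comparison.
double largest_loop(const std::vector<double>& readings) {
    assert(!readings.empty());  // assumed precondition: at least one reading
    double best = readings[0];
    for (std::size_t i = 1; i < readings.size(); ++i) {
        if (readings[i] > best) {
            best = readings[i];
        }
    }
    return best;
}

// Direct: the standard algorithm states the intent, leaving fewer places
// for an off-by-one or a wrong comparison to hide.
double largest(const std::vector<double>& readings) {
    assert(!readings.empty());  // assumed precondition: at least one reading
    return *std::max_element(readings.begin(), readings.end());
}
```

Both functions compute the same result. The second is shorter and easier to review, which is the sense of simplification described above; whether it also runs faster depends on the compiler and the workload, so no performance claim is made here.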
link |
00:31:34.000
But it has to be, to use the Einstein quote, as simple as possible and no simpler.
link |
00:31:40.080
Not simpler.
link |
00:31:41.600
There are other areas with other constraints where you can be simpler than you can be in C++.
link |
00:31:46.880
But in the domain I'm dealing with, that's the simplification I'm after.
link |
00:31:53.360
So how do you inspire or ensure that the Einstein level of simplification is reached?
link |
00:32:03.280
So can you do code review? Can you look at code? If I gave you the code for the Ford F150
link |
00:32:11.840
and said, here, is this a mess or is this okay? Is it possible to tell? Is it possible to regulate?
link |
00:32:23.040
An experienced developer can look at code and see if it smells. Mixed metaphors deliberately.
link |
00:32:31.680
Yes. The point is that it is hard to generate something that is really obviously clean and
link |
00:32:46.880
can be appreciated. But you can usually recognize when you haven't reached that point.
link |
00:32:52.160
And so I've never looked at the F150 code, so I wouldn't know. But I know what I ought to be
link |
00:33:03.360
looking for. I'll be looking for some tricks that correlate with bugs elsewhere. And I have tried
link |
00:33:12.080
to formulate rules for what good code looks like. And the current version of that is called the C++
link |
00:33:22.480
Core Guidelines. One thing people should remember is there's what you can do in a language and what
link |
00:33:32.240
you should do. In a language, you have lots of things that are necessary in some contexts,
link |
00:33:39.600
but not in others. There's things that exist just because there's 30 year old code out there and
link |
00:33:45.680
you can't get rid of it. But you can have rules that say, when you create it, try and follow these
link |
00:33:51.200
rules. These do not create good programs by themselves, but they limit the damage from mistakes.
link |
00:34:02.480
It limits the possibilities of mistakes. And basically, we are trying to say, what is it that
link |
00:34:08.960
a good programmer does? At the fairly simple level of where you use the language and how you use it.
link |
00:34:16.240
Now, I can put all the rules for chiseling in marble. It doesn't mean that somebody who follows
link |
00:34:24.640
all of those rules can do a masterpiece by Michelangelo. That is, there's something else
link |
00:34:32.160
to writing a good program, just as there is something else to creating an important work of art. That is,
link |
00:34:40.640
there's some kind of inspiration, understanding, gift. But we can approach the sort of technical,
link |
00:34:53.920
the craftsmanship level of it. The famous painters, the famous sculptors were, among other things,
link |
00:35:03.440
superb craftsmen. They could express their ideas using their tools very well. And so these days,
link |
00:35:14.320
I think what I'm doing, what a lot of people are doing, we are still trying to figure out how it is
link |
00:35:20.000
to use our tools very well. For a really good piece of code, you need a spark of inspiration,
link |
00:35:29.280
and you can't, I think, regulate that. You cannot say that I'll take a picture only,
link |
00:35:37.200
I'll buy your picture only if you're at least Van Gogh. There are other things you can regulate,
link |
00:35:45.600
but not the inspiration. I think that's quite beautifully put. It is true that there is as an
link |
00:35:55.200
experienced programmer, when you see code that's inspired, that's like Michelangelo, you know it
link |
00:36:04.640
when you see it. And the opposite of that is code that is messy, code that smells, you know,
link |
00:36:12.240
when you see it. And I'm not sure you can describe it in words, except vaguely through guidelines and
link |
00:36:17.760
so on. Yes, it's easier to recognize ugly than to recognize beauty in code. And the reason is
link |
00:36:27.040
that sometimes beauty comes from something that's innovative and unusual. And you have to sometimes
link |
00:36:34.000
think reasonably hard to appreciate that. On the other hand, the messes have things that are
link |
00:36:41.040
in common. And you can have static checkers and dynamic checkers that find
link |
00:36:52.080
a large number of the most common mistakes. You can catch a lot of sloppiness mechanically. I'm
link |
00:37:02.400
a great fan of static analysis in particular, because you can check for not just the language
link |
00:37:09.920
rules, but for the usage of language rules. And I think we will see much more static analysis
link |
00:37:16.880
in the coming decade. Can you describe what static analysis is? You represent a piece of code
link |
00:37:25.840
so that you can write a program that goes over that representation and look for things that are
link |
00:37:33.760
right and not right. So, for instance, you can analyze a program to see if
link |
00:37:46.000
resources are leaked. That's one of my favorite problems. It's not actually all that hard in
link |
00:37:54.240
modern C++, but you can do it. If you are writing at the C level, you have to have a malloc and a
link |
00:38:00.320
free. And they have to match. If you have them in a single function, you can usually do it very
link |
00:38:08.880
easily. If there's a malloc here, there should be a free there. On the other hand, in between can be
link |
00:38:16.320
Turing complete code and then it becomes impossible. If you pass that pointer to the
link |
00:38:22.000
memory out of a function and then want to make sure that the free is done somewhere else,
link |
00:38:31.600
now it gets really difficult. And so for static analysis, you can run through a program and you
link |
00:38:38.000
can try and figure out if there's any leaks. And what you will probably find is that you will find
link |
00:38:47.120
some leaks and you'll find quite a few places where your analysis can't be complete. It might
link |
00:38:54.240
depend on runtime. It might depend on the cleverness of your analyzer and it might take a
link |
00:39:02.880
long time. Some of these programs run for a long time. But if you combine such analysis
link |
00:39:11.120
with a set of rules that says how people could use it, you can actually see why the rules are
link |
00:39:17.120
violated. And that stops you from getting into the impossible complexities. You don't want to
link |
00:39:25.040
solve the halting problem. So static analysis is looking at the code without running the code.
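The malloc and free matching problem described above can be sketched in a few lines. This is only an illustration, not code from the conversation: the names `duplicate_c`, `FreeDeleter`, `OwnedCString`, `duplicate_owned` and `length_of_copy` are hypothetical. The first function hands a raw pointer out of the function, which is the case a checker finds hard to follow; the second states the ownership in the return type, so the matching free happens automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// C style: the caller must remember to call free() on the result.
// A static checker has to follow this pointer across function boundaries.
char* duplicate_c(const char* text) {
    assert(text != nullptr);
    char* copy = static_cast<char*>(std::malloc(std::strlen(text) + 1));
    if (copy != nullptr) {
        std::strcpy(copy, text);
    }
    return copy;  // ownership leaves the function here
}

// Deleter so that unique_ptr releases malloc'ed memory with free().
struct FreeDeleter {
    void operator()(char* p) const { std::free(p); }
};
using OwnedCString = std::unique_ptr<char, FreeDeleter>;

// Same allocation, but the return type says who owns the memory.
OwnedCString duplicate_owned(const char* text) {
    return OwnedCString(duplicate_c(text));
}

std::size_t length_of_copy(const char* text) {
    OwnedCString copy = duplicate_owned(text);
    return copy ? std::strlen(copy.get()) : 0;
}  // copy is freed here, on every path out of the function
```

A rule such as "do not pass owning raw pointers around" gives an analyzer a simple local property to check instead of a whole-program question.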
link |
00:39:31.040
Yes. And thereby it's almost not a production code, but it's almost like an education tool
link |
00:39:38.880
of how the language should be used. It guides you like it at its best, right? It would
link |
00:39:45.440
guide you in how you write future code as well. And you learn together.
link |
00:39:50.320
Yes. So basically you need a set of rules for how you use the language. Then you need a static
link |
00:39:56.400
analysis that catches your mistakes when you violate the rules or when your code ends up
link |
00:40:05.120
doing things that it shouldn't, despite the rules, because there is the language rules.
link |
00:40:09.200
We can go further. And again, it's back to my idea that I'd much rather find errors before
link |
00:40:16.000
I start running the code. If nothing else, once the code runs, if it catches an error at run time,
link |
00:40:23.280
I have to have an error handler. And one of the hardest things to write in code is error handling
link |
00:40:30.160
code, because you know something went wrong. Do you know really exactly what went wrong?
link |
00:40:36.640
Usually not. How can you recover when you don't know what the problem was? You can't be 100% sure
link |
00:40:42.960
what the problem was in many, many cases. And this is part of it. So yes, we need good languages,
link |
00:40:52.480
we need good type systems, we need rules for how to use them, we need static analysis. And the
link |
00:41:02.240
ultimate for static analysis is of course program proof, but that still doesn't scale to the kind
link |
00:41:08.320
of systems we deploy. Then we start needing testing and the rest of the stuff.
link |
00:41:15.200
So C++ is an object oriented programming language that creates, especially with its newer versions,
link |
00:41:22.960
as we'll talk about, higher and higher levels of abstraction. So how do you design?
link |
00:41:30.400
Let's even go back to the origin of C++. How do you design something with so much abstraction
link |
00:41:35.040
that's still efficient and is still something that you can manage, do static analysis on,
link |
00:41:45.200
you can have constraints on, they can be reliable, all those things we've talked about.
link |
00:41:50.480
To me, there's a slight tension between high level abstraction and efficiency.
link |
00:41:59.440
That's a good question. I could probably have a year's course just trying to answer it.
link |
00:42:06.080
Yes, there's a tension between efficiency and abstraction, but you also get the interesting
link |
00:42:13.200
situation that you get the best efficiency out of the best abstraction. And my main tool
link |
00:42:21.600
for efficiency for performance actually is abstraction. So let's go back to how C++
link |
00:42:28.320
got there. You said it was object oriented programming language. I actually never said that.
link |
00:42:35.040
It's always quoted, but I never did. I said C++ supports object oriented programming and other
link |
00:42:42.880
techniques. And that's important because I think that the best solution to most complex,
link |
00:42:51.520
interesting problems requires ideas and techniques from things that have been called object oriented,
link |
00:43:02.880
data abstraction, functional, traditional C style code, all of the above. And so when I was designing
link |
00:43:14.960
C++, I soon realized I couldn't just add features. If you just add what looks pretty or what people
link |
00:43:24.560
ask for or what you think is good, one by one, you're not going to get a coherent whole. What
link |
00:43:32.080
you need is a set of guidelines that guides your decisions. Should this feature be in or should
link |
00:43:40.560
this feature be out? How should a feature be modified before it can go in and such?
link |
00:43:48.640
And in the book I wrote about that, The Design and Evolution of C++, there's a whole bunch of rules
link |
00:43:56.080
like that. Most of them are not language technical. They're things like don't violate static type
link |
00:44:04.400
system because I like static type system for the obvious reason that I like things to be reliable
link |
00:44:12.480
on reasonable amounts of hardware. But one of these rules is the zero overhead principle.
link |
00:44:21.280
The what kind of principle?
link |
00:44:22.000
The zero overhead principle. It basically says that if you have an abstraction,
link |
00:44:29.600
it should not cost anything compared to writing the equivalent code at a lower level.
link |
00:44:38.960
So if I have, say, a matrix multiply, it should be written in such a way that you could not drop to
link |
00:44:50.000
the C level of abstraction and use arrays and pointers and such and run faster.
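As a rough illustration of the two levels being compared, here is a small matrix abstraction next to the equivalent C-style loop over raw pointers. The `Matrix` template and `multiply_raw` are hypothetical names made up for this sketch. It shows only that both forms express the same computation; it makes no performance measurement and does not use the temporary elimination or loop fusion techniques mentioned in the conversation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size square matrix; name and layout are illustrative.
template <std::size_t N>
class Matrix {
public:
    double& operator()(std::size_t r, std::size_t c) {
        assert(r < N && c < N);
        return data_[r * N + c];
    }
    double operator()(std::size_t r, std::size_t c) const {
        assert(r < N && c < N);
        return data_[r * N + c];
    }

private:
    std::array<double, N * N> data_{};  // all elements start at zero
};

// Abstraction level: reads like the mathematics.
template <std::size_t N>
Matrix<N> operator*(const Matrix<N>& a, const Matrix<N>& b) {
    Matrix<N> result;
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t k = 0; k < N; ++k)
            for (std::size_t j = 0; j < N; ++j)
                result(i, j) += a(i, k) * b(k, j);
    return result;
}

// C level: arrays and pointers, with the size passed by hand.
void multiply_raw(const double* a, const double* b, double* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            out[i * n + j] = sum;
        }
    }
}
```

Because `N` is a compile-time constant in the template version, the compiler sees the loop bounds, which is one of the ways an abstraction can give the optimizer more to work with rather than less.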
link |
00:44:54.800
And so people have written such matrix multiplications, and they've actually gotten
link |
00:45:01.920
code that ran faster than Fortran because once you had the right abstraction, you can eliminate
link |
00:45:08.640
temporaries and you can do loop fusion and other good stuff like that. That's quite hard to do by
link |
00:45:16.560
hand and in a lower level language. And there's some really nice examples of that.
link |
00:45:21.600
And the key here is that that matrix multiplication, the matrix abstraction,
link |
00:45:29.120
allows you to write code that's simple and easy. You can do that in any language.
link |
00:45:34.000
But with C++, it has the features so that you can also have this thing run faster than if you hand
link |
00:45:39.840
coded it. Now, people have given that lecture many times, I and others, and a very common
link |
00:45:47.680
question after the talk where you have demonstrated that you can outperform Fortran for
link |
00:45:52.800
dense matrix multiplication, people come up and says, yeah, but that was C++.
link |
00:45:57.680
If I rewrote your code in C, how much faster would it run? The answer is much slower.
link |
00:46:06.080
This happened the first time actually back in the 80s with a friend of mine called Doug McIlroy,
link |
00:46:11.920
who demonstrated exactly this effect. And so the principle is you should give programmers the tools
link |
00:46:22.080
so that the abstractions can follow the zero overhead principle. Furthermore, when you put in a language
link |
00:46:28.560
feature in C++ or a standard library feature, you try to meet this. It doesn't mean it's absolutely
link |
00:46:35.680
optimal, but it means if you hand code it with the usual facilities in the language, in C++, in C,
link |
00:46:45.040
you should not be able to better it. Usually you can do better if you use embedded assembler for
link |
00:46:53.360
machine code for some of the details to utilize part of a computer that the compiler doesn't know
link |
00:47:00.000
about. But you should get to that point before you beat the abstraction. So that's a beautiful
link |
00:47:06.880
ideal to reach for. And we meet it quite often. Quite often. So where's the magic of that coming
link |
00:47:14.640
from? There's some of it is the compilation process. So the implementation of C++, some of it
link |
00:47:20.560
is the design of the feature itself, the guidelines. So I think it's important that you
link |
00:47:27.280
think about the guidelines. So I've recently and often talked to Chris Lattner, so Clang.
link |
00:47:36.320
What, just out of curiosity, is your relationship in general with the different implementations of
link |
00:47:44.160
C++ as you think about you and committee and other people in C++, think about the design of
link |
00:47:50.480
features or design of previous features. In trying to reach the ideal of zero overhead,
link |
00:47:59.840
does the magic come from the design, the guidelines, or from the implementations?
link |
00:48:06.480
And not all. You go for programming technique,
link |
00:48:13.840
programming language features, and implementation techniques. You need all three.
link |
00:48:18.000
And how can you think about all three at the same time?
link |
00:48:22.640
It takes some experience, takes some practice, and sometimes you get it wrong. But after a while,
link |
00:48:28.160
you sort of get it right. I don't write compilers anymore. But Brian Kernighan pointed out that one
link |
00:48:37.840
of the reasons C++ succeeded was some of the craftsmanship I put into the early compilers.
link |
00:48:49.760
And of course, I did the language design. Of course, I wrote a fair amount of code using
link |
00:48:54.080
this kind of stuff. And I think most of the successes involve progress in all three areas
link |
00:49:02.720
together. A small group of people can do that. Two, three people can work together to do something
link |
00:49:10.400
like that. It's ideal if it's one person that has all the skills necessary. But nobody has all the
link |
00:49:16.160
skills necessary in all the fields where C++ is used. So if you want to approach my ideal in, say,
link |
00:49:23.840
concurrent programming, you need to know about algorithms for concurrent programming. You need to
link |
00:49:30.240
know the trickery of lock free programming. You need to know something about compiler techniques.
link |
00:49:36.960
And then you have to know some of the application areas where this is, like some forms of graphics
link |
00:49:46.720
or some forms of what we call web server kind of stuff. And that's very hard to get into a single
link |
00:49:57.440
head. But small groups can do it too. So is there differences in your view, not saying which is
link |
00:50:06.800
better or so on, but differences in the different implementations of C++? Why are there several
link |
00:50:13.680
sort of maybe naive questions for me? GCC, clang, so on? This is a very reasonable question. When
link |
00:50:23.680
I designed C++, most languages had multiple implementations. Because if you run on an IBM,
link |
00:50:35.520
if you run on a Sun, if you run on a Motorola, there was just many, many companies and they each
link |
00:50:41.440
have their own compilation structure and their own compilers. It was just fairly common that
link |
00:50:47.200
there was many of them. And I wrote Cfront assuming that other people would write compilers
link |
00:50:54.720
for C++ if successful. And furthermore, I wanted to utilize all the backend infrastructures that
link |
00:51:04.320
were available. I soon realized that my users were using 25 different linkers. I couldn't write my
link |
00:51:10.240
own linker. Yes, I could, but I couldn't write 25 linkers and also get any work done on the language.
link |
00:51:20.080
And so it came from a world where there was many linkers, many optimizers, many
link |
00:51:27.120
compiler front ends, not to start, but many operating systems. The whole world was not an
link |
00:51:36.080
x86 and a Linux box or something, whatever is the standard today. In the old days, they said a VAX.
link |
00:51:45.040
So basically, I assumed there would be lots of compilers. It was not a decision that there should
link |
00:51:51.520
be many compilers. It was just a fact. That's the way the world is. And yes, many compilers
link |
00:52:00.400
emerged. And today, there's at least four front ends, Clang, GCC, Microsoft, and EDG,
link |
00:52:13.600
the Edison Design Group. They supply a lot of the independent organizations and the embedded
link |
00:52:21.440
systems industry. And there's lots and lots of backends. We have to think about how many dozen
link |
00:52:29.040
backends there are. Because different machines have different things, especially in the embedded
link |
00:52:35.760
world, the machines are very different, the architectures are very different. And so having
link |
00:52:43.920
a single implementation was never an option. Now, I also happen to dislike monocultures.
link |
00:52:53.120
Monocultures.
link |
00:52:54.320
They are dangerous. Because whoever owns the monoculture can go stale. And there's no
link |
00:53:01.920
competition. And there's no incentive to innovate. There's a lot of incentive to put barriers in the
link |
00:53:09.360
way of change. Because hey, we own the world. And it's a very comfortable world for us. And who are
link |
00:53:15.680
you to mess with that? So I really am very happy that there's four front ends for C++. Clang's
link |
00:53:26.400
great. But GCC was great. But then it got somewhat stale. Clang came along. And GCC is much better
link |
00:53:36.320
now. Microsoft is much better now. So at least a low number of front ends puts a lot of pressure on
link |
00:53:51.040
standards compliance and also on performance and error messages and compile time speed,
link |
00:53:57.760
all this good stuff that we want.
link |
00:53:59.360
Do you think, crazy question, there might come along, do you hope there might come along
link |
00:54:08.800
implementation of C++ written, given all its history, written from scratch?
link |
00:54:16.400
So written today from scratch?
link |
00:54:18.960
Well, Clang and the LLVM is more or less written from scratch.
link |
00:54:24.880
But there's been C++ 11, 14, 17, 20. You know, there's been a lot of
link |
00:54:30.960
I think sooner or later somebody's going to try again. There has been attempts to write
link |
00:54:36.480
new C++ compilers and some of them has been used and some of them has been absorbed into
link |
00:54:42.400
others and such. Yeah, it'll happen.
link |
00:54:45.200
So what are the key features of C++? And let's use that as a way to sort of talk about
link |
00:54:52.960
the evolution of C++, the new features. So at the highest level, what are the features
link |
00:54:59.360
that were there in the beginning? What features got added?
link |
00:55:03.200
Let's first get a principle or an aim in place. C++ is for people who want to use hardware
link |
00:55:13.600
really well and then manage the complexity of doing that through abstraction.
link |
00:55:18.720
And so the first facility you have is a way of manipulating the machines at a fairly low
link |
00:55:27.120
level. That looks very much like C. It has loops, it has variables, it has pointers like
link |
00:55:36.560
machine addresses, it can access memory directly, it can allocate stuff in the absolute minimum
link |
00:55:45.040
of space needed on the machine. There's a machine facing part of C++ which is roughly
link |
00:55:52.560
equivalent to C. I said C++ could beat C and it can. It doesn't mean I dislike C. If I
link |
00:55:59.120
disliked C, I wouldn't have built on it. Furthermore, after Dennis Ritchie, I'm probably the major
link |
00:56:07.760
contributor to modern C. I had lunch with Dennis most days for 16 years and we never
link |
00:56:18.160
had a harsh word between us. So these C versus C++ fights are for people who don't quite
link |
00:56:26.960
understand what's going on. Then the other part is the abstraction. The key is the class.
link |
00:56:34.800
There, the key is the class which is a user defined type. My idea for the class is that
link |
00:56:42.480
you should be able to build a type that's just like the built-in types in the way you
link |
00:56:48.400
use them, in the way you declare them, in the way you get the memory and you can do
link |
00:56:54.480
just as well. So in C++ there's an int as in C. You should be able to build an abstraction,
link |
00:57:03.680
a class which we can call capital int that you can use exactly like an integer and run
link |
00:57:11.360
just as fast as an integer. There's the idea right there. And of course you probably don't
link |
00:57:18.080
want to use the int itself but it has happened. People have wanted integers that were range
link |
00:57:25.600
checked so that you couldn't overflow and such, especially for very safety critical
link |
00:57:29.840
applications like the fuel injection for a marine diesel engine for the largest ships.
link |
00:57:37.040
This is a real example by the way. This has been done. They built themselves an integer
link |
00:57:43.360
that was just like integer except that couldn't overflow. If there was an overflow you went
link |
00:57:49.200
into the error handling. And then you built more interesting types. You can build a matrix
link |
00:57:56.880
which you need to do graphics or you could build a gnome for a video game.
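A minimal sketch of the range-checked integer idea, assuming the class is called `Int` as in the conversation. Everything else here (the member names, the choice of `std::overflow_error` as the error handling path, the helper `add_overflows`) is an assumption for illustration, not the marine engine code that was mentioned. Only addition is shown.

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>

// Hypothetical range-checked integer: used like int, but overflow is caught.
class Int {
public:
    constexpr Int(int v = 0) : value_(v) {}  // implicit, so Int x = 40; works
    constexpr int value() const { return value_; }

    friend Int operator+(Int a, Int b) {
        // Check before adding, because signed overflow is undefined behavior.
        if (b.value_ > 0 && a.value_ > std::numeric_limits<int>::max() - b.value_) {
            throw std::overflow_error("Int: addition overflow");
        }
        if (b.value_ < 0 && a.value_ < std::numeric_limits<int>::min() - b.value_) {
            throw std::overflow_error("Int: addition underflow");
        }
        return Int(a.value_ + b.value_);
    }

    friend bool operator==(Int a, Int b) { return a.value_ == b.value_; }

private:
    int value_;
};

// Returns true if a + b ends up in the error handling path.
inline bool add_overflows(Int a, Int b) {
    try {
        (void)(a + b);
        return false;
    } catch (const std::overflow_error&) {
        return true;
    }
}

// Usage sketch: ordinary arithmetic reads like int; overflow is reported.
inline void int_usage_example() {
    Int fuel = 40;
    fuel = fuel + 2;
    assert(fuel == Int(42));
    assert(add_overflows(Int(std::numeric_limits<int>::max()), 1));
}
```

The object holds a single `int` and nothing else, so it occupies the same space as the built-in type; the cost is the extra comparison per addition.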
link |
00:58:04.400
And all these are classes and they appear just like the built in types.
link |
00:58:07.760
Exactly.
link |
00:58:08.240
In terms of efficiency and so on. So what else is there?
link |
00:58:11.120
And flexibility.
link |
00:58:12.320
So I don't know, for people who are not familiar with object oriented programming there's inheritance.
link |
00:58:20.400
There's a hierarchy of classes. You can just like you said create a generic vehicle that can turn
link |
00:58:27.040
left.
link |
00:58:27.600
So what people found was that you don't actually know. How do I say this? A lot of types are
link |
00:58:40.320
related. That is the vehicles, all vehicles are related. Bicycles, cars, fire engines, tanks. They
link |
00:58:52.960
have some things in common and some things that differ. And you would like to have the common
link |
00:58:57.600
things common and having the differences specific. And when you didn't want to know about
link |
00:59:04.160
the differences, just turn left. You don't have to worry about it. That's how you get the traditional
link |
00:59:12.640
object oriented programming coming out of Simula adopted by Smalltalk and C++ and all the other
link |
00:59:19.520
languages. The other kind of obvious similarity between types comes when you have something like
link |
00:59:25.840
a vector. Fortran gave us the vector as called array of doubles. But the minute you have a
link |
00:59:35.760
vector of doubles, you want a vector of double precision doubles and for short doubles for
link |
00:59:42.720
graphics. And why should you not have a vector of integers while you're at it or a vector of
link |
00:59:50.640
vectors and a vector of vectors of chess pieces? Now you have a board, right? So this is you
link |
01:00:01.040
express the commonality as the idea of a vector and the variations come through parameterization.
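The board example can be written down directly with the standard `std::vector` template. The `Piece` enumeration and the helpers `make_empty_board`, `place` and `sum` are hypothetical names for this sketch; the point is that one definition of "vector" serves doubles, integers and vectors of chess pieces alike.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Piece { Empty, Pawn, Rook, Knight, Bishop, Queen, King };

// A vector of vectors of chess pieces: now you have a board.
using Board = std::vector<std::vector<Piece>>;

Board make_empty_board(std::size_t size = 8) {
    return Board(size, std::vector<Piece>(size, Piece::Empty));
}

void place(Board& board, std::size_t row, std::size_t col, Piece piece) {
    assert(row < board.size() && col < board[row].size());
    board[row][col] = piece;
}

// One generic function, parameterized on the element type.
template <typename T>
T sum(const std::vector<T>& values) {
    T total{};
    for (const T& v : values) {
        total += v;
    }
    return total;
}
```

Calling `sum` on a `std::vector<int>` and on a `std::vector<double>` makes the compiler generate two separate functions, each as if it had been written by hand for that element type.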
link |
01:00:10.080
And so here we get the two fundamental ways of abstracting or of having similarities of
link |
01:00:17.360
types in C++. There's the inheritance and there's a parameterization. There's the object oriented
link |
01:00:23.120
programming and there's the generic programming. With the templates for the generic programming.
link |
01:00:27.920
Yep. So you've presented it very nicely, but now you have to make all that happen and make it
link |
01:00:37.040
efficient. So generic programming with templates, there's all kinds of magic going on, especially
link |
01:00:43.280
recently that you can help catch up on. But it feels to me like you can do way more than what
link |
01:00:50.160
you just said with templates. You can start doing this kind of metaprogramming, this kind of...
link |
01:00:55.760
You can do metaprogramming also. I didn't go there in that explanation. We're trying to be
link |
01:01:02.320
very basic, but go back on to the implementation. If you couldn't implement this efficiently,
link |
01:01:08.160
if you couldn't use it so that it became efficient, it has no place in C++ because
link |
01:01:14.400
it will violate the zero overhead principle. So when I had to get object oriented programming
link |
01:01:22.560
inheritance, I took the idea of virtual functions from Simula. Virtual functions is a Simula term,
link |
01:01:31.360
class is a Simula term. If you ever use those words, say thanks to Kristen Nygaard and Ole-Johan
link |
01:01:38.640
Dahl. And I did the simplest implementation I knew of, which was basically a jump table.
link |
01:01:47.520
So you get the virtual function table, the function goes in, does an indirection through
link |
01:01:54.080
a table and get the right function. That's how you pick the right thing there.
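The vehicle example from earlier in the conversation gives a compact illustration of virtual versus nonvirtual functions. The class names and the returned strings are made up for this sketch. The language standard does not prescribe a table, but the jump table described above is how compilers commonly implement the virtual call.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

class Vehicle {
public:
    virtual ~Vehicle() = default;

    // Virtual: the call is typically resolved through the object's table.
    virtual std::string turn_left() const = 0;

    // Nonvirtual: an ordinary direct call, no indirection asked for.
    int wheels() const { return wheels_; }

protected:
    explicit Vehicle(int wheels) : wheels_(wheels) {}

private:
    int wheels_;
};

class Bicycle : public Vehicle {
public:
    Bicycle() : Vehicle(2) {}
    std::string turn_left() const override { return "bicycle leans left"; }
};

class FireEngine : public Vehicle {
public:
    FireEngine() : Vehicle(6) {}
    std::string turn_left() const override { return "fire engine swings left"; }
};

// The caller does not know or care which kind of vehicle it is turning.
std::vector<std::string> turn_all_left(
    const std::vector<std::unique_ptr<Vehicle>>& fleet) {
    std::vector<std::string> log;
    for (const auto& vehicle : fleet) {
        assert(vehicle != nullptr);
        log.push_back(vehicle->turn_left());
    }
    return log;
}
```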
link |
01:01:58.000
And I thought that was trivial. It's close to optimal and it was obvious. It turned out the
link |
01:02:06.000
Simula had a more complicated way of doing it and therefore was slower. And it turns out that most
link |
01:02:12.400
languages have something that's a little bit more complicated, sometimes more flexible,
link |
01:02:16.880
but you pay for it. And one of the strengths of C++ was that you could actually do this object
link |
01:02:22.880
oriented stuff and your overhead compared to ordinary functions, there's no indirection. It's
link |
01:02:30.800
sort of in 5, 10, 25% just the call. It's down there. It's not two. And that means you can
link |
01:02:40.400
afford to use it. Furthermore, in C++, you have the distinction between a virtual function and
link |
01:02:46.960
a nonvirtual function. If you don't want any overhead, if you don't need the indirection that
link |
01:02:53.040
gives you the flexibility in object oriented programming, just don't ask for it. So the idea
link |
01:03:00.640
is that you only use virtual functions if you actually need the flexibility. So it's not zero
link |
01:03:06.640
overhead, but it's zero overhead compared to any other way of achieving the flexibility.
link |
01:03:11.360
Now, on to parameterization. Basically, the compiler looks at the template, say the vector,
link |
01:03:25.040
and it looks at the parameter, and then combines the two and generates a piece of code that is
link |
01:03:34.400
exactly as if you've written a vector of that specific type. So that's the minimal overhead.
link |
01:03:42.400
If you have many template parameters, you can actually combine code that the compiler couldn't
link |
01:03:47.920
usually see at the same time and therefore get code that is faster than if you had handwritten
link |
01:03:56.080
the stuff, unless you are very, very clever. So the thing is, parameterized code, the compiler
link |
01:04:04.800
fills stuff in during the compilation process, not during runtime. That's right. And furthermore,
link |
01:04:12.320
it gives all the information it's gotten, which is the template, the parameter, and the context
link |
01:04:20.160
of use. It combines the three and generates good code. But it can generate, now, it's a little
link |
01:04:30.480
outside of what I'm even comfortable thinking about, but it can generate a lot of code. Yes.
link |
01:04:36.560
And how do you, I remember being both amazed at the power of that idea, and
link |
01:04:45.440
how ugly the debugging looked? Yes. Debugging can be truly horrid.
link |
01:04:51.520
Come back to this, because I have a solution. Anyway, the debugging was ugly.
link |
01:04:58.320
The code generated by C++ has always been ugly, because there's these inherent optimizations.
link |
01:05:06.880
A modern C++ compiler has front end, middle end, and back end.
link |
01:05:10.720
Even Cfront, back in 83, had front end and back end optimizations. I actually took the code,
link |
01:05:20.320
generated an internal representation, munched that representation to generate good code.
link |
01:05:27.680
So people say, it's not a compiler, it generates C. The reason it generated C was I wanted to use
link |
01:05:33.200
C's code generators, which were really good at back end optimizations.
link |
01:05:44.080
But I needed front end optimizations, and therefore, the C I generated was optimized C.
link |
01:05:51.280
The way a really good handcrafted optimizer human could generate it, and it was not meant
link |
01:06:00.560
for humans. It was the output of a program, and it's much worse today. And with templates,
link |
01:06:06.160
it gets much worse still. So it's hard to combine simple debugging with the optimal code,
link |
01:06:16.960
because the idea is to drag in information from different parts of the code to generate good code,
link |
01:06:25.680
machine code. And that's not readable. So what people often do for debugging
link |
01:06:34.240
is they turn the optimizer off. And so you get code that when something in your source code
link |
01:06:42.720
looks like a function call, it is a function call. When the optimizer is turned on, it may disappear,
link |
01:06:50.480
the function call, it may inline. And so one of the things you can do is you can actually get code
link |
01:06:58.400
that is smaller than the function call, because you eliminate the function preamble and return.
link |
01:07:06.320
And there's just the operation there. One of the key things when I did
link |
01:07:13.280
templates was I wanted to make sure that if you have, say, a sort algorithm, and you give it a
link |
01:07:20.080
sorting criteria, if that sorting criteria is simply comparing things with less than,
link |
01:07:31.360
the code generated should be the less than, not an indirect function call to a comparison
link |
01:07:40.560
object, which is what it is in the source code. But we really want down to the single instruction.
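A small sketch of a sort parameterized on its comparison criterion, using the standard `std::sort`, `std::less` and `std::greater`. The wrapper name `sort_values` is hypothetical. Because the comparison is a template parameter, its type is known at compile time, which is what lets an optimizer inline it down to a plain comparison; whether a given compiler does so depends on the optimization settings.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// The comparison object is a template parameter, not a function pointer,
// so the compiler knows exactly which operation is being called.
template <typename T, typename Compare = std::less<T>>
void sort_values(std::vector<T>& values, Compare compare = Compare{}) {
    std::sort(values.begin(), values.end(), compare);
    assert(std::is_sorted(values.begin(), values.end(), compare));
}
```

Calling `sort_values(v)` sorts with less than; calling `sort_values(v, std::greater<int>{})` sorts a vector of int in descending order, and each call produces its own instantiation.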
link |
01:07:47.120
But anyway, turn off the optimizer, and you can debug. The first level of debugging can be done,
link |
01:07:54.240
and I always do without the optimization on, because then I can see what's going on.
link |
01:07:59.360
And then there's this idea of concepts that puts some, now I've never even,
link |
01:08:09.360
I don't know if it was ever available in any form, but it puts some constraints
link |
01:08:14.640
on the stuff you can parameterize, essentially.
link |
01:08:18.240
Let me try and explain this. So yes, it wasn't there 10 years ago. We have had versions of it
link |
01:08:27.440
that actually work for the last four or five years. It was a design by Gabi Dos Reis, Andrew
link |
01:08:37.120
Sutton and me. We were professors and postdocs in Texas at the time. And the implementation by
link |
01:08:46.720
Andrew Sutton has been available for that time. And it is part of C++20. And there's a standard
link |
01:08:59.040
library that uses it. So this is becoming really very real. It's available in Clang and GCC. GCC
link |
01:09:10.640
for a couple of years, and I believe Microsoft is soon going to do it. We expect all of C++20
link |
01:09:17.120
to be available in all the major compilers in 20. But this kind of stuff is available now.
link |
01:09:26.720
I'm just saying that because otherwise people might think I was talking about science fiction.
link |
01:09:31.920
And so what I'm going to say is concrete. You can run it today.
link |
01:09:37.040
And there's production uses of it. So the basic idea is that when you have a generic component,
link |
01:09:47.200
like a sort function, the sort function will require at least two parameters. One is the
link |
01:09:54.560
data structure with a given type and a comparison criteria. And these things are related, but
link |
01:10:03.680
obviously you can't compare things if you don't know what the type of things you compare.
link |
01:10:10.160
And so you want to be able to say, I'm going to sort something and it is to be sortable.
link |
01:10:16.880
What does it mean to be sortable? You look it up in the standard. It has to have a
link |
01:10:20.960
it has to be a sequence with a beginning and an end. There has to be random access to that sequence.
link |
01:10:27.200
And there has to be the element types has to be comparable by default.
link |
01:10:34.800
Which means less than operator can operate on.
link |
01:10:37.040
Yes.
link |
01:10:37.600
Less than logical operator can operate.
link |
01:10:39.120
Basically what concepts are, they're compile time predicates. They're predicates you can ask,
link |
01:10:45.360
are you a sequence? Yes, I have a beginning and end. Are you a random access sequence? Yes,
link |
01:10:52.800
I have a subscripting and plus. Is your element type something that has a less than? Yes,
link |
01:11:00.560
I have a less than. And so basically that's the system. And so instead of saying,
link |
01:11:06.960
I will take a parameter of any type, it'll say, I'll take something that's sortable.
link |
01:11:11.440
And it's well defined. And so we say, okay, you can sort with less than, I don't want less than,
link |
01:11:17.920
I want greater than or something I invent. So you have two parameters, the sortable thing and the
link |
01:11:24.720
comparison criteria. And the comparison criteria will say, well, I can, you can write it saying it
link |
01:11:33.920
should operate on the element type. And then you can say, well, I can sort with less than,
link |
01:11:41.200
and it has the comparison operations. So that's just simply the fundamental thing. It's compile
link |
01:11:49.200
time predicates. Do you have the properties I need? So it specifies the requirements of the code
link |
01:11:56.320
on the parameters that it gets. It's very similar to types actually. But operating in the space of
link |
01:12:05.280
concepts. Concepts. The word concept was used by Alex Stepanov, who is sort of the father of generic
link |
01:12:15.200
programming in the context of C++. There's other places that use that word, but the way we call
link |
01:12:23.520
it generic programming is Alex's. And he called them concepts because he said they are the sort
link |
01:12:29.040
of the fundamental concepts of an area. So they should be called concepts. And we've had
link |
01:12:34.720
concepts all the time. If you look at the K&R book about C, C has arithmetic types and it has
link |
01:12:45.760
integral types. It says so in the book. And then it lists what they are and they have certain
link |
01:12:52.480
properties. The difference today is that we can actually write a concept that will ask a type,
link |
01:12:59.200
are you an integral type? Do you have the properties necessary to be an integral type?
link |
01:13:05.200
Do you have plus, minus, divide and such? So maybe the story of concepts, because I thought
link |
01:13:15.200
it might be part of C++11, C++0x or whatever it was at the time. What was the, why didn't it,
link |
01:13:25.680
what, like what we'll, we'll talk a little bit about this fascinating process of standards,
link |
01:13:30.560
because I think it's really interesting for people. It's interesting for me,
link |
01:13:34.000
but why did it take so long? What shapes did the idea of concepts take?
link |
01:13:41.760
What were the challenges? Back in 87 or thereabouts. 1987?
link |
01:13:49.120
Well, 1987 or thereabouts when I was designing templates, obviously I wanted to express the
link |
01:13:54.960
notion of what is required by a template of its arguments. And so I looked at this and basically
link |
01:14:03.920
for templates, I wanted three properties. I wanted to be very flexible. It had to be able to express
link |
01:14:14.000
things I couldn't imagine because I know I can't imagine everything. And I've been suffering from
link |
01:14:20.480
languages that try to constrain you to only do what the designer thought good. Didn't want to
link |
01:14:27.600
do that. Secondly, it had to run faster, as fast or faster than handwritten code. So basically,
link |
01:14:35.920
if I have a vector of T and I take a vector of char, it should run as fast as if you built a vector
link |
01:14:43.520
of char yourself without parameterization. And thirdly, I wanted to be able to express
link |
01:14:52.480
the constraints of the arguments, have proper type checking of the interfaces.
link |
01:15:01.680
And neither I nor anybody else at the time knew how to get all three. And I thought for C++,
link |
01:15:09.360
I must have the two first. Otherwise, it's not C++. And it bothered me for another couple of
link |
01:15:17.040
decades that I couldn't solve the third one. I mean, I was the one that put function argument
link |
01:15:23.600
type checking into C. I know the value of good interfaces. I didn't invent that idea. It's very
link |
01:15:29.840
common, but I did it. And I wanted to do the same for templates, of course, and I couldn't.
link |
01:15:37.600
So it bothered me. Then we tried again, 2002, 2003. Gaby Dos Reis and I started analyzing the
link |
01:15:47.840
problem, explained possible solutions. It was not a complete design. A group in University of Indiana,
link |
01:15:57.280
an old friend of mine, they started a project at Indiana and we thought we could get
link |
01:16:11.360
a good system of concepts in another two or three years that would have made C++11 into C++
link |
01:16:22.000
06 or 07. Well, it turns out that I think we got a lot of the fundamental ideas wrong. They were
link |
01:16:33.280
too conventional. They didn't quite fit C++ in my opinion. Didn't serve implicit conversions very
link |
01:16:41.920
well. It didn't serve mixed type arithmetic, mixed type computations very well. A lot of
link |
01:16:51.120
stuff came out of the functional community and that community didn't deal with multiple types
link |
01:17:03.200
in the same way as C++ does, had more constraints on what you could express and didn't have the
link |
01:17:12.480
draconian performance requirements. And basically we tried. We tried very hard. We had some
link |
01:17:19.760
successes, but it just in the end wasn't, didn't compile fast enough, was too hard to use and
link |
01:17:31.440
didn't run fast enough unless you had optimizers that were beyond the state of the art. They still
link |
01:17:40.080
are. So we had to do something else. Basically it was the idea that a set of parameters has
link |
01:17:49.120
defined a set of operations and you go through an indirection table just like for virtual functions
link |
01:17:55.760
and then you try to optimize the indirection away to get performance. And we just couldn't
link |
01:18:03.360
do all of that. But get back to the standardization. We are standardizing C++ under ISO rules,
link |
01:18:12.720
which are very open process. People come in, there's no requirements for education or experience.
link |
01:18:20.160
So you started to develop C++ and there's a whole, when was the first standard established? What is
link |
01:18:28.960
that like? The ISO standard, is there a committee that you're referring to? There's a group of
link |
01:18:34.960
people. What was that like? How often do you meet? What's the discussion?
link |
01:18:39.280
I'll try and explain that. So sometime in early 1989, two people, one from IBM, one from HP,
link |
01:18:52.720
turned up in my office and told me, we would like to standardize C++. This was a new idea to me and
link |
01:19:02.080
when I pointed out that it wasn't finished yet and it wasn't ready for formal standardization
link |
01:19:09.760
and such. And they say, no, Bjarne, you haven't gotten it. You really want to do this.
link |
01:19:16.400
Our organizations depend on C++. We cannot depend on something that's owned by another
link |
01:19:23.760
corporation that might be a competitor. Of course we could rely on you, but you might get run over
link |
01:19:31.040
by a bus. We really need to get this out in the open. It has to be standardized under formal rules
link |
01:19:41.840
and we are going to standardize it under ISO rules and you really want to be part of it because
link |
01:19:51.120
basically otherwise we'll do it ourselves. And we know you can do it better. So through a combination
link |
01:20:00.800
of arm twisting and flattery, it got started. So in late 89, there was a meeting in DC at the,
link |
01:20:15.600
actually no, it was not ISO then, it was ANSI, the American National Standards Institute.
link |
01:20:23.200
We met there. We were lectured on the rules of how to do an ANSI standard. There was about 25 of us
link |
01:20:30.480
there, which apparently was a new record for that kind of meeting. And some of the old C guys that
link |
01:20:38.800
had been standardizing C were there. So we got some expertise in. So the way this works is that
link |
01:20:45.440
it's an open process. Anybody can sign up if they pay the minimal fee, which is about a thousand
link |
01:20:52.720
dollars. It was less then, it's a little bit more now. I think it's $1,280. It's not going to kill you.
link |
01:21:01.680
And we have three meetings a year. This is fairly standard. We tried two meetings a year for a
link |
01:21:10.880
couple of years that didn't work too well. So three one week meetings a year and you meet
link |
01:21:20.160
and you have technical discussions, and then you bring proposals forward for votes. The votes are
link |
01:21:28.320
done one person per, one vote per organization. So you can't have say IBM come in with 10 people
link |
01:21:39.040
and dominate things. That's not allowed. And these are organizations that extensively use
link |
01:21:44.160
C++. Yes. Or individuals. Or individuals. I mean, it's a bunch of people in the room
link |
01:21:53.280
deciding the design of a language based on which a lot of the world's systems run.
link |
01:22:00.400
Right. Well, I think most people would agree it's better than if I decided it
link |
01:22:06.240
or better than if a single organization like AT&T decides it. I don't know if everyone agrees to
link |
01:22:13.200
that, by the way. Bureaucracies have their critics too. Yes. Look, standardization is not pleasant.
link |
01:22:23.360
It's horrifying. It's like democracy. Exactly. As Churchill says, democracy is the worst way,
link |
01:22:31.200
except for the others. Right. And it's, I would say the same with formal standardization.
link |
01:22:36.480
But anyway, so we meet and we have these votes and that determines what the standard is.
link |
01:22:45.040
A couple of years later, we extended this so it became worldwide. We have standard organizations
link |
01:22:53.280
that are active in currently 15 to 20 countries and another 15 to 20 are sort of looking and voting
link |
01:23:08.800
based on the rest of the work on it. And we meet three times a year. Next week I'll be in Cologne,
link |
01:23:15.680
Germany, spending a week doing standardization and we'll vote out the committee draft of C++20,
link |
01:23:25.440
which goes to the national standards committees for comments and requests for changes and
link |
01:23:34.000
improvements. Then we do that and there's a second set of votes where hopefully everybody
link |
01:23:39.600
votes in favor. This has happened several times. The first time we finished, we started in the
link |
01:23:47.040
first technical meeting was in 1990. The last was in 98. We voted it out. That was the standard
link |
01:23:55.760
that people used until 11 or a little bit past 11. And it was an international standard. All the
link |
01:24:04.000
countries voted in favor. It took longer with 11. I'll mention why, but all the nations voted in
link |
01:24:13.440
favor. And we work on the basis of consensus. That is, we do not want something that passes 60-40
link |
01:24:24.400
because then we're going to get dialects and opponents and people complain too much. They
link |
01:24:30.240
all complain too much, but basically it has no real effect. The standards have been obeyed. They
link |
01:24:37.280
have been working to make it easier to use many compilers, many computers and all of that kind of
link |
01:24:44.880
stuff. It was traditional with ISO standards to take 10 years. We did the first one in eight,
link |
01:24:54.080
brilliant. And we thought we were going to do the next one in six because now we are good at it.
link |
01:25:00.400
Right. It took 13. Yeah. It was named 0x. It was named 0x. Hoping that you would at least get it
link |
01:25:10.720
within the single, within the aughts, the single digits. I thought we would get, I thought we'd
link |
01:25:15.760
get six, seven or eight. The confidence of youth. That's right. Well, the point is that this was
link |
01:25:21.920
sort of like a second system effect. That is, we now knew how to do it. And so we're going to do
link |
01:25:28.160
it much better. And we've got more ambitious and it took longer. Furthermore, there is this tendency
link |
01:25:35.680
because it's a 10 year cycle or it doesn't matter. Just before you're about to ship,
link |
01:25:45.200
somebody has a bright idea. And so we really, really must get that in. We did that successfully
link |
01:25:57.360
with the STL. We got the standard library that gives us all the STL stuff. That basically,
link |
01:26:05.680
I think it saved C++. It was beautiful. And then people tried it with other things
link |
01:26:11.520
and it didn't work so well. They got things in, but it wasn't as dramatic and it took longer and
link |
01:26:17.520
longer and longer. So after C++ 11, which was a huge improvement and what, basically what most
link |
01:26:26.720
people are using today, we decided never again. And so how do you avoid those slips? And the
link |
01:26:36.400
answer is that you ship more often. So that if you have a slip on a 10 year cycle, by the time
link |
01:26:46.320
you know it's a slip, there's 11 years till you get it. Now with a three year cycle, there is
link |
01:26:52.960
about three or four years till you get it. Like the delay between feature freeze and shipping. So
link |
01:27:02.640
you always get one or two years more. And so we shipped 14 on time, we shipped 17 on time,
link |
01:27:10.880
and we ship, we will ship 20 on time. It'll happen. And furthermore, this gives a predictability
link |
01:27:21.680
that allows the implementers, the compiler implementers, the library implementers,
link |
01:27:26.320
they have a target and they deliver on it. 11 took two years before most compilers were good
link |
01:27:34.640
enough. 14, most compilers were actually getting pretty good in 14. 17, everybody shipped in 17.
link |
01:27:45.360
We are going to have at least almost everybody ship almost everything in 20. And I know this
link |
01:27:53.200
because they're shipping in 19. Predictability is good. Delivery on time is good.
link |
01:28:01.040
And so yeah. That's great. That's how it works.
link |
01:28:05.920
There's a lot of features that came in in C++ 11. There's a lot of features at the birth of C++
link |
01:28:13.200
that were amazing and ideas with concepts in 2020. What to you is the most,
link |
01:28:20.240
just to you personally, beautiful or just you sit back and think, wow, that's just nice and clean
link |
01:28:32.640
feature of C++? I have written two papers for the History of Programming Languages Conference,
link |
01:28:41.680
which basically asked me such questions. And I'm writing a third one, which I will deliver
link |
01:28:47.520
at the History of Programming Languages Conference in London next year. So I've been thinking about
link |
01:28:53.440
that. And there is one clear answer. Constructors and destructors. The way a constructor can
link |
01:29:00.320
establish the environment for the use of a type for an object and the destructor that cleans up
link |
01:29:08.400
any messes at the end of it. That is key to C++. That's why we don't have to use garbage
link |
01:29:15.120
collection. That's how we can get predictable performance. That's how you can get the minimal
link |
01:29:22.640
overhead in many, many cases, and have really clean types. It's the idea of constructor destructor
link |
01:29:31.520
pairs. Sometimes it comes out under the name RAII. Resource acquisition is initialization,
link |
01:29:40.480
which is the idea that you grab resources in the constructor and release them in destructor.
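A minimal sketch of the constructor/destructor pairing (RAII) described here, assuming a hypothetical `File` wrapper around the C standard library's `std::fopen` and `std::fclose`; the class and the `append_line` helper are illustrations, not code from the conversation. The constructor grabs the resource, the destructor releases it, and no garbage collector is involved.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>
#include <type_traits>

// Hypothetical RAII wrapper around a C file handle.
class File {
public:
    // Acquire the resource in the constructor; fail loudly if that is impossible.
    File(const std::string& path, const char* mode)
        : handle_(std::fopen(path.c_str(), mode)) {
        if (!handle_) throw std::runtime_error("cannot open " + path);
    }

    // Release the resource in the destructor.
    ~File() { std::fclose(handle_); }

    // Copying is disabled so that two objects never close the same handle.
    File(const File&) = delete;
    File& operator=(const File&) = delete;

    std::FILE* get() const noexcept { return handle_; }

private:
    std::FILE* handle_;
};

// The file is closed when f goes out of scope, whether the function returns
// normally or leaves through the exception below.
void append_line(const std::string& path, const std::string& line) {
    File f(path, "a");
    if (std::fputs((line + "\n").c_str(), f.get()) == EOF) {
        throw std::runtime_error("write failed: " + path);
    }
}

static_assert(!std::is_copy_constructible_v<File>, "File handles are not copyable");
```

The cleanup cost is one `std::fclose` call at a known point in the program, which is the predictable performance the discussion refers to.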
link |
01:29:46.560
It's also the best example of why I shouldn't be in advertising. I get the best idea and I call it
link |
01:29:53.200
resource acquisition is initialization. Not the greatest naming I've ever heard.
link |
01:29:59.520
So it's types, abstraction of types.
link |
01:30:11.040
You said, I want to create my own types. So types is an essential part of C++ and making them
link |
01:30:18.000
efficient is the key part. And to you, this is almost getting philosophical, but the construction
link |
01:30:27.760
and the destruction, the creation of an instance of a type and the freeing of resources from that
link |
01:30:36.400
instance of a type is what defines the object. It's almost like birth and death is what defines
link |
01:30:45.200
human life. That's right. By the way, philosophy is important. You can't do good language design
link |
01:30:53.600
without philosophy because what you are determining is what people can express and how.
link |
01:30:59.200
This is very important. By the way, constructors destructors came into C++ in 79 in about the
link |
01:31:08.160
second week of my work with what was then called C with Classes. It is a fundamental idea.
link |
01:31:15.120
Next comes the fact that you need to control copying because once you control, as you said,
link |
01:31:21.200
birth and death, you have to control taking copies, which is another way of creating an object.
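To make the idea of controlling copies concrete, together with the move operations that come up next in the conversation, here is a sketch of a hypothetical `Buffer` type that defines the full set of key operations: construction, destruction, copy and move. The class and its members are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical type owning a heap-allocated array of int.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) {}  // birth: acquire
    ~Buffer() { delete[] data_; }                                       // death: release

    // Copy: creates a second, independent object with its own storage.
    Buffer(const Buffer& other) : data_(new int[other.size_]), size_(other.size_) {
        std::copy(other.data_, other.data_ + size_, data_);
    }
    Buffer& operator=(const Buffer& other) {
        Buffer tmp(other);  // may throw; *this is untouched if it does
        swap(tmp);
        return *this;
    }

    // Move: steals the storage and leaves the source empty but destructible.
    Buffer(Buffer&& other) noexcept
        : data_(std::exchange(other.data_, nullptr)),
          size_(std::exchange(other.size_, 0)) {}
    Buffer& operator=(Buffer&& other) noexcept {
        Buffer tmp(std::move(other));
        swap(tmp);
        return *this;
    }

    void swap(Buffer& other) noexcept {
        std::swap(data_, other.data_);
        std::swap(size_, other.size_);
    }

    std::size_t size() const noexcept { return size_; }
    int& operator[](std::size_t i) { return data_[i]; }

private:
    int* data_;
    std::size_t size_;
};

static_assert(std::is_copy_constructible_v<Buffer>);
static_assert(std::is_nothrow_move_constructible_v<Buffer>);
```

Copying a `Buffer` costs an allocation and an element-by-element copy, whereas moving it only exchanges a pointer and a size, which is why the two are controlled separately.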
link |
01:31:29.200
And finally, you have to be able to move things around so you get the move operations. And that's
link |
01:31:35.680
the set of key operations you can define on a C++ type. And so to you, those things are just
link |
01:31:45.440
a beautiful part of C++ that is at the core of it all. Yes. You mentioned that you hope there
link |
01:31:54.240
will be one unified set of guidelines in the future for how to construct a programming language.
link |
01:32:00.000
So perhaps not one programming language, but a unification of how we build programming languages,
link |
01:32:08.480
if you remember such statements. I have some trouble remembering it, but I know the origin
link |
01:32:13.840
of that idea. So maybe you can talk about sort of C++ has been improving. There's been a lot
link |
01:32:19.360
of programming languages. Do you, where is the arc of history taking us? Do you hope that there
link |
01:32:25.200
is a unification about the languages with which we communicate in the digital space?
link |
01:32:32.560
Well, I think that languages should be designed not by clobbering language features together and
link |
01:32:42.400
doing slightly different versions of somebody else's ideas, but through the creation of a set of
link |
01:32:53.120
principles, rules of thumb, whatever you call them. I made them for C++. And we're trying to
link |
01:33:02.560
teach people in the standards committee about these rules, because a lot of people come in
link |
01:33:07.120
and say, I've got a great idea. Let's put it in the language. And then you have to ask, why does
link |
01:33:12.720
it fit in the language? Why does it fit in this language? It may fit in another language and not
link |
01:33:18.240
here, or it may fit here and not the other language. So you have to work from a set of
link |
01:33:23.520
principles and you have to develop that set of principles. And one example that I sometimes
link |
01:33:33.920
remember is I was sitting down with some of the designers of Common Lisp and we were talking about
link |
01:33:43.600
languages and language features. And obviously we didn't agree about anything because, well,
link |
01:33:50.880
Lisp is not C++ and vice versa. It's too many parentheses. But suddenly we started making
link |
01:33:58.160
progress. I said, I had this problem and I developed it according to these ideas. And
link |
01:34:06.560
they said, why? We had that problem, different problem, and we developed it with the same kind
link |
01:34:11.680
of principles. And so we worked through large chunks of C++ and large chunks of Common Lisp
link |
01:34:21.440
and figured out we actually had similar sets of principles of how to do it. But the constraints
link |
01:34:29.840
on our designs were very different and the aims for the usage was very different. But there was
link |
01:34:37.600
commonality in the way you reason about language features and the fundamental principles you are
link |
01:34:45.200
trying to do. So do you think that's possible? So there, just like there is perhaps a unified
link |
01:34:52.240
theory of physics, of the fundamental forces of physics, that I'm sure there is commonalities
link |
01:35:00.880
among the languages, but there's also people involved that help drive the development of these
link |
01:35:06.960
languages. Do you have a hope or an optimism that there will be a unification? If you think about
link |
01:35:16.880
physics and Einstein towards a simplified language, do you think that's possible?
link |
01:35:24.560
Let's remember sort of modern physics, I think, started with Galileo in the 1300s. So they've had
link |
01:35:32.640
700 years to get going. Modern computing started in about 49. We've got, what is it, 70 years. They
link |
01:35:43.920
have 10 times. Furthermore, they are not as bothered with people using physics the way
link |
01:35:52.640
we are worried about programming is done by humans. So each have problems and constraints
link |
01:36:01.680
the others have, but we are very immature compared to physics. So I would look at sort of the
link |
01:36:09.680
philosophical level and look for fundamental principles. Like you don't leak resources,
link |
01:36:18.080
you shouldn't. You don't take errors at runtime that you don't need to. You don't violate some
link |
01:36:29.280
kind of type system. There's many kinds of type systems, but when you have one, you don't break it,
link |
01:36:35.760
etc., etc. There will be quite a few, and it will not be the same for all languages. But I think
link |
01:36:44.560
if we step back at some kind of philosophical level, we would be able to agree on sets of
link |
01:36:52.000
principles that applied to sets of problem areas. And within an area of use, like in C++'s case,
link |
01:37:05.280
what used to be called systems programming, the area between the hardware and the fluffier parts
link |
01:37:12.480
of the system, you might very well see a convergence. So these days you see Rust having
link |
01:37:19.200
adopted RAII and sometimes accusing me of having borrowed it 20 years before they discovered it.
link |
01:37:27.120
But we're seeing some kind of convergence here instead of relying on garbage collection all the
link |
01:37:38.080
time. The garbage collection languages are doing things like the dispose patterns and such that
link |
01:37:46.160
imitate some of the construction destruction stuff. And they're trying not to use the garbage
link |
01:37:52.480
collection all the time and things like that. So there's a convergence. But I think we have to step
link |
01:37:58.320
back to the philosophical level, agree on principles, and then we'll see some conversions,
link |
01:38:04.320
convergences. And it will be application domain specific.
link |
01:38:10.720
So a crazy question, but I work a lot with machine learning, with deep learning. I'm not sure if you
link |
01:38:16.560
touch that world that much, but you could think of programming as a thing that takes some input.
link |
01:38:24.480
Programming is the task of creating a program and a program takes some input and produces some
link |
01:38:29.120
output. So machine learning systems train on data in order to be able to take an input and produce
link |
01:38:37.600
output. But they're messy, fuzzy things, much like we as children grow up. We take some input,
link |
01:38:48.640
we make some output, but we're noisy. We mess up a lot. We're definitely not reliable. Biological
link |
01:38:53.760
systems are a giant mess. So there's a sense in which machine learning is a kind of way of
link |
01:39:01.120
programming, but just fuzzy. It's very, very, very different than C++. Because C++ is just like you
link |
01:39:11.360
said, it's extremely reliable, it's efficient, you can measure it, you can test it in a bunch of
link |
01:39:18.240
different ways. With biological systems or machine learning systems, you can't say much except sort
link |
01:39:26.080
of empirically saying that 99.8% of the time, it seems to work. What do you think about this fuzzy
link |
01:39:34.400
kind of programming? Do you even see it as programming? Is it totally another kind of world?
link |
01:39:41.760
I think it's a different kind of world. And it is fuzzy. And in my domain, I don't like fuzziness.
link |
01:39:48.640
That is, people say things like they want everybody to be able to program. But I don't
link |
01:39:56.560
want everybody to program my airplane controls or the car controls. I want that to be done by
link |
01:40:06.400
engineers. I want that to be done with people that are specifically educated and trained for doing
link |
01:40:13.520
building things. And it is not for everybody. Similarly, a language like C++ is not for
link |
01:40:20.400
everybody. It is generated to be a sharp and effective tool for professionals, basically,
link |
01:40:30.240
and definitely for people who aim at some kind of precision. You don't have people doing
link |
01:40:37.680
calculations without understanding math. Counting on your fingers is not going to cut it if you want
link |
01:40:44.560
to fly to the moon. And so there are areas where an 84% accuracy rate, 16% false positive rate,
link |
01:40:56.560
is perfectly acceptable and where people will probably get no more than 70. You said 98%. What
link |
01:41:09.360
I have seen is more like 84. And by really a lot of blood, sweat, and tears, you can get up to 92.5.
link |
01:41:16.320
So this is fine if it is, say, prescreening stuff before a human looks at it. It is not good enough
link |
01:41:27.920
for life threatening situations. And so there's lots of areas where the fuzziness is perfectly
link |
01:41:36.000
acceptable and good and better than humans, cheaper than humans.
link |
01:41:40.400
But it's not the kind of engineering stuff I'm mostly interested in. I worry a bit about
link |
01:41:48.000
machine learning in the context of cars. You know much more about this than I do.
link |
01:41:53.200
I worry too.
link |
01:41:54.160
But I'm sort of an amateur here. I've read some of the papers, but I've not ever done it. And the
link |
01:42:02.880
idea that scares me the most is the one I have heard, and I don't know how common it is, that
link |
01:42:14.640
you have this AI system, machine learning, all of these trained neural nets. And when there's
link |
01:42:24.960
something that's too complicated, they ask the human for help. But the human is reading a book or
link |
01:42:32.560
asleep, and he has 30 seconds or three seconds to figure out what the problem was that the AI
link |
01:42:41.040
system couldn't handle and do the right thing. This is scary. I mean, how do you do the cutting
link |
01:42:48.400
work between the machine and the human? It's very, very difficult. And for the designer of
link |
01:42:58.000
one of the most reliable, efficient, and powerful programming languages, C++, I can understand why
link |
01:43:05.120
that world is actually unappealing. It is for most engineers. To me, it's extremely
link |
01:43:11.920
appealing because we don't know how to get that interaction right. But I think it's possible. But
link |
01:43:18.080
it's very, very hard. It is. And I was stating a problem, not a solution. That is impossible.
link |
01:43:24.320
I mean, I would much rather never rely on the human. If you're driving a nuclear reactor,
link |
01:43:29.120
or an autonomous vehicle, it's much better to design systems written in C++ that
link |
01:43:35.920
never ask a human for help. Let's just get one fact in. Yeah. All of this AI stuff is on top of C++.
link |
01:43:47.760
So that's one reason I have to keep a weather eye out on what's going on in that field. But
link |
01:43:53.360
I will never become an expert in that area. But it's a good example of how you separate
link |
01:43:58.400
different areas of applications and you have to have different tools, different principles. And
link |
01:44:05.920
then they interact. No major system today is written in one language. And there are good
link |
01:44:11.200
reasons for that. When you look back at your life's work, what is a moment, what is an
link |
01:44:20.800
event or creation that you're really proud of, where you say, damn, I did pretty good there?
link |
01:44:29.040
Is it as obvious as the creation of C++? It's obvious. I've spent a lot of time with C++. And
link |
01:44:37.920
there's a combination of a few good ideas, a lot of hard work, and a bit of luck.
link |
01:44:43.120
And I've tried to get away from it a few times, but I get dragged in again, partly because I'm
link |
01:44:50.800
most effective in this area and partly because what I do has much more impact if I do it in
link |
01:44:58.400
the context of C++. I have four and a half million people that pick it up tomorrow if I
link |
01:45:05.120
get something right. If I did it in another field, I would have to start learning, then I have to
link |
01:45:13.840
build it and then we'll see if anybody wants to use it. One of the things that has kept me going
link |
01:45:21.760
for all of these years is one, the good things that people do with it and the interesting things
link |
01:45:29.280
they do with it. And also, I get to see a lot of interesting stuff and talk to a lot of interesting
link |
01:45:36.160
people. I mean, if it has just been statements on paper on a screen, I don't think I could have kept
link |
01:45:46.400
going. But I get to see the telescopes up on Mauna Kea and I actually went to see how Ford built
link |
01:45:54.400
cars and I got to JPL and see how they do the Mars rovers. There's so much cool stuff going on. And
link |
01:46:05.440
most of the cool stuff is done by pretty nice people and sometimes in very nice places.
link |
01:46:10.480
Cambridge, Sophia, Silicon Valley. There's more to it than just code. But code is central.
link |
01:46:25.360
On top of the code are the people in very nice places. Well, I think I speak for millions of
link |
01:46:32.480
people, Bjarne, in saying thank you for creating this language that so many systems are built on
link |
01:46:40.800
top of that make a better world. So thank you and thank you for talking today. Yeah, thanks.
link |
01:46:47.360
And we'll make it even better. Good.