
Travis Oliphant: NumPy, SciPy, Anaconda, Python & Scientific Programming | Lex Fridman Podcast #224



link |
00:00:00.000
The following is a conversation with Travis Oliphant,
link |
00:00:03.600
one of the most impactful programmers
link |
00:00:05.520
and data scientists ever.
link |
00:00:07.900
He created NumPy, SciPy, and Anaconda.
link |
00:00:12.760
NumPy formed the foundation
link |
00:00:14.500
of tensor based machine learning in Python,
link |
00:00:17.060
SciPy formed the foundation
link |
00:00:18.880
of scientific programming in Python,
link |
00:00:20.960
and Anaconda, specifically with Conda,
link |
00:00:23.980
made Python more accessible to a much larger audience.
link |
00:00:27.620
Travis's life work across a large number of programming
link |
00:00:31.200
and entrepreneurial efforts has and will continue
link |
00:00:34.760
to have immeasurable impact on millions of lives
link |
00:00:38.440
by empowering scientists and engineers
link |
00:00:41.360
in big companies, small companies,
link |
00:00:43.600
and open source communities to take on difficult problems
link |
00:00:47.200
and solve them with the power of programming.
link |
00:00:50.520
Plus, he's a truly kind human being,
link |
00:00:53.440
which is something that when combined with vision
link |
00:00:56.000
and ambition makes for a great leader
link |
00:00:58.400
and a great person to chat with.
link |
00:01:01.160
To support this podcast,
link |
00:01:02.320
please check out our sponsors in the description.
link |
00:01:04.880
This is the Lex Fridman Podcast,
link |
00:01:06.960
and here is my conversation with Travis Oliphant.
link |
00:01:11.520
What was the first computer program you've ever written?
link |
00:01:14.480
Do you remember?
link |
00:01:15.320
Whoa, that's a good question.
link |
00:01:16.920
I think it was in fourth grade.
link |
00:01:18.380
Just a simple loop in BASIC.
link |
00:01:20.920
BASIC. BASIC, yeah, on an Atari 800,
link |
00:01:23.320
Atari 400, I think, or maybe it was an Atari 800.
link |
00:01:26.840
It was a part of a class,
link |
00:01:28.300
and we were just writing BASIC loops to print things out.
link |
00:01:32.560
Did you use go to statements?
link |
00:01:34.920
Yes, yes, we used go to statements.
link |
00:01:38.000
I remember in the early days,
link |
00:01:39.560
that's when I first realized
link |
00:01:41.160
there's like principles to programming,
link |
00:01:43.320
when I was told that don't use go to statements.
link |
00:01:45.720
Those are bad software engineering principles,
link |
00:01:48.080
like it goes against what great, beautiful code is.
link |
00:01:52.040
I was like, oh, okay, there's rules to this game.
link |
00:01:54.800
I didn't see that until high school
link |
00:01:56.240
when I took an AP computer science course.
link |
00:01:58.360
I did a lot of other kinds of just programming in TI,
link |
00:02:02.200
but finally, when I took an AP computer science course
link |
00:02:04.160
in Pascal.
link |
00:02:05.720
Wow.
link |
00:02:06.560
That's, yeah, it was Pascal.
link |
00:02:07.440
That's when I, oh, there are these principles.
link |
00:02:09.760
Not C or C++?
link |
00:02:11.320
No, I didn't take C until the next year in college.
link |
00:02:14.660
I had a course in C, but I haven't done much in Pascal,
link |
00:02:18.100
just that AP computer science course.
link |
00:02:20.160
Now, sorry for the romanticized question,
link |
00:02:23.480
but when did you first fall in love with programming?
link |
00:02:26.720
Oh, man, good question.
link |
00:02:27.880
I think actually when I was 10,
link |
00:02:30.280
my dad got us a TI Timex Sinclair,
link |
00:02:33.460
and he was excited about the spreadsheet capability,
link |
00:02:37.200
but I made him also get BASIC,
link |
00:02:39.560
the add-on, so we could actually program in BASIC,
link |
00:02:41.840
and just being able to write instructions
link |
00:02:44.520
and have the computer do something.
link |
00:02:45.960
Then we got a TI-99/4A when I was about 12,
link |
00:02:50.080
and I would just, it had sprites and graphics and music.
link |
00:02:52.960
You could actually program it to do music.
link |
00:02:55.320
That's when I really sort of fell in love with programming.
link |
00:02:58.600
So this is a full, like a real computer
link |
00:03:01.060
with like, with memory and storage,
link |
00:03:04.120
processors and whatnot,
link |
00:03:05.240
because you say TI. Yeah, the Timex Sinclair
link |
00:03:07.360
was one of the very first, it was a cheap, cheap,
link |
00:03:09.680
like, I think it was, well, it was still expensive,
link |
00:03:12.760
but it was 2K of memory.
link |
00:03:14.440
We got the 16K add on pack,
link |
00:03:16.760
but yeah, it had memory, and you could program it.
link |
00:03:19.000
You had the, in order to store your programs,
link |
00:03:20.920
you had to attach a tape drive.
link |
00:03:22.880
Remember that old, the sound that would play
link |
00:03:24.400
when the modems would convert digital bits
link |
00:03:29.440
to audio stored on a tape drive.
link |
00:03:31.920
Still remember that sound, but that was the storage.
link |
00:03:34.760
And what was the programming language, do you remember?
link |
00:03:36.480
It was BASIC. It was BASIC.
link |
00:03:37.320
And then they had a VisiCalc,
link |
00:03:38.980
and so a little bit of spreadsheet programming
link |
00:03:40.600
in VisiCalc, but mostly just some BASIC.
link |
00:03:42.760
Do you remember what kind of things drew you to programming?
link |
00:03:46.340
Was it working with data, was it video games?
link |
00:03:50.360
Games, math, mathy stuff?
link |
00:03:52.600
Yeah, I've always loved math,
link |
00:03:54.800
and a lot of people think they don't like math
link |
00:03:58.080
because I think when they're exposed to it early,
link |
00:04:00.440
it's about memory.
link |
00:04:02.080
When you're exposed to math early,
link |
00:04:03.260
if you have a good short term memory,
link |
00:04:04.280
you can remember your times tables.
link |
00:04:05.920
And I do have a reasonably, I mean, not perfect,
link |
00:04:08.600
but a reasonably long little short term memory buffer.
link |
00:04:12.480
And so I did great at times tables.
link |
00:04:14.320
I said, oh, I'm good at math.
link |
00:04:15.840
But I started to really like math,
link |
00:04:17.360
just the problem solving aspect.
link |
00:04:20.320
And so computing was problem solving applied.
link |
00:04:25.040
And so that's always kind of been the draw,
link |
00:04:28.280
kind of coupled with the mathematics.
link |
00:04:30.480
Did you ever see the computer as like an extension
link |
00:04:33.920
of your mind, like something able to achieve?
link |
00:04:36.520
Not till later.
link |
00:04:37.760
Okay.
link |
00:04:38.600
Yeah, not then.
link |
00:04:39.440
It's just like a little set of puzzles
link |
00:04:40.880
that you can play with and you can play with math puzzles.
link |
00:04:43.520
Yeah, it was too rudimentary early on.
link |
00:04:46.120
Like it was sort of, yeah, it was a lot of work
link |
00:04:49.160
to actually take a thought you'd have
link |
00:04:51.440
and actually get it implemented.
link |
00:04:53.120
And that's still work, but it's getting easier.
link |
00:04:56.020
And so yeah, I would say that's definitely
link |
00:04:58.240
what's attracting me to Python
link |
00:04:59.560
is that that was more real, right?
link |
00:05:02.140
I could think in Python.
link |
00:05:04.840
Speaking of foreign language,
link |
00:05:05.800
I only speak one other language fluently besides English,
link |
00:05:08.400
which is Spanish.
link |
00:05:09.220
And I remember the day when I would dream in Spanish
link |
00:05:11.720
and you start to think in that language.
link |
00:05:13.440
And then you actually, I do definitely believe
link |
00:05:15.340
that language limits or expands your thinking.
link |
00:05:19.640
There's some languages that actually lead you
link |
00:05:21.600
to certain thought processes.
link |
00:05:23.860
Yeah, like, so I speak Russian fluently
link |
00:05:27.280
and that's certainly a language that leads you
link |
00:05:30.960
down certain thought processes.
link |
00:05:33.240
Well, yeah, I mean, there's a history
link |
00:05:36.220
of the two world wars of millions of people starving
link |
00:05:41.220
to death or near to death throughout its history
link |
00:05:44.180
of suffering, of injustice, like this promise sold
link |
00:05:48.020
to the people and then the carpet
link |
00:05:50.900
or whatever is swept from under them.
link |
00:05:53.340
And it's like broken promises.
link |
00:05:54.660
And all of that pain and melancholy is in the language,
link |
00:05:58.100
the sad songs, the sad hopeful songs,
link |
00:06:01.700
the over romanticized, like, I love you, I hate you,
link |
00:06:05.260
the sort of the swings between all the various spectrums
link |
00:06:09.980
of emotion, so that's all within the language.
link |
00:06:13.740
The way it's twisted, there's a strong culture
link |
00:06:18.020
of rhyming poetry, so like the bards,
link |
00:06:20.380
like the sync, there's a musicality to the language too.
link |
00:06:24.740
Did Dostoevsky write in Russian?
link |
00:06:27.380
Yeah, so like Dostoevsky, Tolstoy, all the,
link |
00:06:32.100
all the.
link |
00:06:32.940
The ones that I know about, which are translated
link |
00:06:34.660
and I'm curious how the translations.
link |
00:06:36.340
So Dostoevsky did not use the musicality
link |
00:06:40.860
of the language too much.
link |
00:06:42.180
So it actually translates pretty well
link |
00:06:44.180
because it's so philosophically dense
link |
00:06:46.540
that the story does a lot of the work,
link |
00:06:48.460
but there's a bunch of things that are untranslatable.
link |
00:06:51.140
Certainly the poetry is not translatable.
link |
00:06:53.580
I actually have a few conversations coming up offline
link |
00:06:57.940
and also in this podcast with people
link |
00:06:59.980
who've translated Dostoevsky.
link |
00:07:01.940
And people who work in this field
link |
00:07:06.340
know how difficult that is.
link |
00:07:07.340
Sometimes you can spend months thinking
link |
00:07:10.660
about a single sentence, right?
link |
00:07:12.340
In context, like, cause there's just the magic
link |
00:07:15.220
captured by that sentence and how do you translate
link |
00:07:17.860
just in the right way?
link |
00:07:18.940
Because those words can be really powerful.
link |
00:07:22.380
There's a famous line,
link |
00:07:24.300
"Beauty will save the world," from Dostoevsky.
link |
00:07:27.140
You know, there's so many ways to translate that.
link |
00:07:29.500
And you're right, the language gives you the tools
link |
00:07:32.700
with which to tell the story,
link |
00:07:34.140
but it also leads your mind down certain trajectories
link |
00:07:37.260
and paths to where over time,
link |
00:07:39.660
as you think in that language,
link |
00:07:41.140
you become a different human being.
link |
00:07:42.740
Yes. Yeah.
link |
00:07:43.740
Yeah, that's a fascinating reality, I think.
link |
00:07:45.860
I know people have explored that,
link |
00:07:47.020
but it's just rediscovered.
link |
00:07:49.740
Well, we don't, we live in our own like little pockets.
link |
00:07:52.340
Like this is the sad thing is I feel like unfortunately,
link |
00:07:56.860
given time and given getting older,
link |
00:07:59.140
I'll never know China, the Chinese world,
link |
00:08:03.620
because I don't truly know the language.
link |
00:08:05.780
Same with Japanese, I don't truly know Japanese
link |
00:08:08.300
and Portuguese and Brazil,
link |
00:08:10.340
that whole South American continent.
link |
00:08:12.060
Like, yeah, I'll go to Brazil and Argentina,
link |
00:08:14.460
but will I truly understand the people
link |
00:08:17.100
if I don't understand the language?
link |
00:08:18.500
It's sad because I wonder how much,
link |
00:08:23.500
how many geniuses we're missing
link |
00:08:25.220
because so much of the scientific world,
link |
00:08:28.540
so much of the technical world is in English,
link |
00:08:31.460
and so much of it might be lost
link |
00:08:33.140
because it's just we don't have the common language.
link |
00:08:36.100
I completely agree.
link |
00:08:36.940
I'm very much in that vein of there's a lot of genius
link |
00:08:40.620
out there that we miss,
link |
00:08:41.780
and it's sort of fortunate when it bubbles up
link |
00:08:45.060
into something that we can understand or process,
link |
00:08:48.700
there's a lot we miss.
link |
00:08:50.420
So I tend to lean towards really loving democratization
link |
00:08:54.060
or things that empower people
link |
00:08:55.420
and being very resistant to authoritarian structures.
link |
00:09:00.140
Fundamentally for that reason,
link |
00:09:01.900
well, several reasons, but it just hurts us.
link |
00:09:04.740
We're soft.
link |
00:09:06.420
So speaking of languages that empower you,
link |
00:09:09.020
so Python was the first language for me
link |
00:09:11.820
that I really enjoyed thinking in, as you said.
link |
00:09:16.780
Sounds like you shared my experience too.
link |
00:09:18.500
So when did you first,
link |
00:09:19.620
do you remember when you first kind of connected with Python,
link |
00:09:21.900
maybe even fell in love with Python?
link |
00:09:23.740
It's a good question.
link |
00:09:24.580
It was a process.
link |
00:09:25.500
It took about a year.
link |
00:09:26.540
I first encountered Python in 1997.
link |
00:09:29.500
I was a graduate student studying biomedical engineering
link |
00:09:31.740
at the Mayo Clinic.
link |
00:09:32.980
And I had previously,
link |
00:09:34.700
I'd been involved in taking information from satellites.
link |
00:09:39.340
I was an electrical engineering student
link |
00:09:41.340
used to taking information
link |
00:09:42.660
and trying to get something out of it,
link |
00:09:44.060
doing some data processing, getting information out of it.
link |
00:09:46.140
And I'd done that in MATLAB.
link |
00:09:47.660
I'd done that in Perl.
link |
00:09:49.140
I'd done that in scripting on a VMS.
link |
00:09:52.540
There's actually a VAX VMS system,
link |
00:09:54.260
they had their own little scripting tools around Fortran.
link |
00:09:57.980
Done a lot of that.
link |
00:09:58.820
And then as a graduate student,
link |
00:10:00.820
I was looking for something and encountered Python.
link |
00:10:04.420
And because Python had an array,
link |
00:10:06.140
had two things that made me not filter it away.
link |
00:10:09.100
Because I was filtering a bunch of stuff,
link |
00:10:10.380
there was Yorick, I looked at Yorick,
link |
00:10:11.700
I looked at a few other languages that are out there
link |
00:10:14.420
at the time in 1997, but it had arrays.
link |
00:10:17.700
There's a library called Numeric
link |
00:10:19.060
that had just been written in 95,
link |
00:10:20.860
like not very, not too much earlier.
link |
00:10:23.740
By an MIT alum, Jim Hugunin.
link |
00:10:26.980
You know, and I went back and read the mailing list
link |
00:10:29.100
to see the history of how it grew.
link |
00:10:30.300
And there was a very interesting,
link |
00:10:31.220
it's fascinating to do that actually,
link |
00:10:32.380
to see how this emergent cooperation,
link |
00:10:36.020
unstructured cooperation happens in the open source world
link |
00:10:39.500
that led to a lot of this collective programming,
link |
00:10:43.300
which is something maybe we might get into a little later,
link |
00:10:45.140
but what that looks like.
link |
00:10:46.100
What gap did Numeric fill?
link |
00:10:48.340
Numeric filled the gap of having an array object.
link |
00:10:50.260
There was no array object.
link |
00:10:51.580
There was no array.
link |
00:10:52.420
There was a one dimensional byte concept,
link |
00:10:55.380
but there was no n dimensional,
link |
00:10:57.580
two, three, four dimensional tensor they call it now.
link |
00:11:00.700
I'm still in the category that a tensor is another thing
link |
00:11:03.260
and it's just an ndarray we should call it,
link |
00:11:05.220
but kind of lost that battle.
link |
00:11:08.340
There's many battles in this world,
link |
00:11:10.140
some of which we win, some we lose.
link |
00:11:12.060
That's exactly right.
link |
00:11:13.620
So, but it had no math to it.
link |
00:11:17.180
So Numeric had math and a basic way to think in arrays.
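As a small illustration of what an n-dimensional array object with built-in math provides, here is a minimal sketch using modern NumPy rather than the original Numeric library. The variable names and the sample arithmetic are illustrative assumptions, not anything taken from the conversation.

```python
import numpy as np

# A 3-dimensional array: 2 blocks, each with 3 rows and 4 columns.
volume = np.arange(24, dtype=float).reshape(2, 3, 4)

# Math applies to the whole array at once, with no explicit loops.
scaled = volume * 0.5 + 1.0

# The same arithmetic on plain nested lists needs one loop per dimension.
nested = volume.tolist()
scaled_by_hand = [
    [[x * 0.5 + 1.0 for x in row] for row in block]
    for block in nested
]
```

The array version states the intent in one expression, which is roughly the "thinking in arrays" idea described here.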
link |
00:11:20.820
So I was looking for that,
link |
00:11:21.820
and it had complex numbers,
link |
00:11:24.980
a lot of programming languages.
link |
00:11:26.380
And you can see it because,
link |
00:11:28.100
if you're just a computer scientist,
link |
00:11:29.500
you think, ah, complex numbers are just two floats.
link |
00:11:32.060
So you can, people can build that on.
link |
00:11:34.980
But in practice, a complex number
link |
00:11:36.740
is one of the significant algebras
link |
00:11:38.980
that helps connect a lot of physical
link |
00:11:40.740
and mathematical ideas,
link |
00:11:42.260
particularly FFT for an electrical engineer.
link |
00:11:45.100
And it's a really important concept
link |
00:11:48.160
and not having it means you have to develop it
link |
00:11:50.820
several times and those times may not share an approach.
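To make the point about complex numbers and the FFT concrete, here is a minimal sketch that relies on Python's built-in complex type and NumPy's `np.fft.fft`. The signal length, the chosen frequency bin, and the variable names are assumptions made purely for illustration.

```python
import numpy as np

# Complex numbers are a built-in type in Python, with their own arithmetic.
z = 3 + 4j
magnitude = abs(z)             # modulus of z
squared_norm = z * z.conjugate()

# A complex exponential that completes exactly k cycles over n samples.
n = 64
k = 5
t = np.arange(n)
signal = np.exp(2j * np.pi * k * t / n)

# The FFT concentrates all of that signal's energy in frequency bin k.
spectrum = np.fft.fft(signal)
peak_bin = int(np.argmax(np.abs(spectrum)))
```

Because the complex type is shared by the language and the array library, the signal, the transform, and the result all use one representation instead of ad hoc pairs of floats.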
link |
00:11:54.300
One of the common things in programming,
link |
00:11:55.700
one of the things programming enables is abstractions.
link |
00:11:59.060
But when you have shared abstractions, it's even better.
link |
00:12:01.180
It sort of gets to the level of language
link |
00:12:02.980
of actually we all think of this the same way,
link |
00:12:05.540
which is both powerful and dangerous, right?
link |
00:12:07.940
Because powerful in that we now can quickly
link |
00:12:11.180
make bigger and higher level things
link |
00:12:13.340
on top of those abstractions; dangerous
link |
00:12:14.800
because it also limits us as to the things
link |
00:12:17.100
we maybe left behind in producing that abstraction,
link |
00:12:20.500
which is at the heart of programming today
link |
00:12:21.900
and actually building around the programming world.
link |
00:12:24.420
I think it's a fascinating philosophical topic.
link |
00:12:26.580
Yeah, they will continue for many years, I think.
link |
00:12:28.380
They'll continue for many years.
link |
00:12:29.220
As we build more and more and more abstractions.
link |
00:12:31.260
Yes, I often think about, you know,
link |
00:12:32.340
we have a world that's built on these abstractions
link |
00:12:35.060
that were they the only ones possible?
link |
00:12:37.500
Certainly not, but they led to,
link |
00:12:39.860
you know, it's very hard to do it differently.
link |
00:12:42.300
Like there's an inertia that's very hard to,
link |
00:12:44.980
you know, push out, push away from.
link |
00:12:47.740
That has implications for things like,
link |
00:12:49.640
you know, the Julia language,
link |
00:12:50.720
which you have heard of, I'm sure.
link |
00:12:52.680
And I've met the creators and I liked Julia.
link |
00:12:55.700
It's a really cool language,
link |
00:12:56.580
but they struggle kind of against the,
link |
00:12:59.300
just the tide of like this inertia of people using Python.
link |
00:13:03.420
And, you know, there's strategies to approach that,
link |
00:13:05.820
but nonetheless, it's a phenomenon.
link |
00:13:07.580
And sometimes, so I love complex numbers
link |
00:13:09.580
and I love arrays, so I looked at Python.
link |
00:13:12.260
And then I had the experience, I did some stuff in Python
link |
00:13:15.260
and I was just doing my PhD.
link |
00:13:16.380
So I was out, my focus was on,
link |
00:13:19.700
I was actually doing a combination of MRI and ultrasound
link |
00:13:22.180
and looking at a phenomenon called elastography,
link |
00:13:24.740
which is you push waves into the body
link |
00:13:27.020
and observe those waves, like you can actually measure them.
link |
00:13:30.300
And then you do mathematical inversion
link |
00:13:32.780
to see what the elasticity is.
link |
00:13:35.220
And so that's the problem I was solving
link |
00:13:36.820
is how to do that with both ultrasound and MRI.
link |
00:13:39.780
I needed some tool to do that with.
link |
00:13:41.380
So I was starting to use Python in 97.
link |
00:13:44.260
In 98, I went back, looked at what I'd written
link |
00:13:47.340
and realized I could still understand it,
link |
00:13:49.560
which is not the experience I'd had
link |
00:13:50.900
when doing Perl in 95, right?
link |
00:13:53.660
I'd done the same thing and then I looked back
link |
00:13:55.620
and I'd forgotten what I was even saying.
link |
00:13:58.360
Now, you know, I'm not saying, so that made me go,
link |
00:14:00.700
hey, this may work, I like this.
link |
00:14:02.400
This is something I can retain
link |
00:14:04.980
without becoming an expert per se.
link |
00:14:07.620
And so that led me to go, I'm gonna push more into this.
link |
00:14:10.380
And then that 98 was kind of when I started
link |
00:14:14.820
to fall in love with Python, I would say.
link |
00:14:18.300
A few peculiar things about Python.
link |
00:14:20.900
So maybe compare it to Perl,
link |
00:14:22.940
compare it to some of the other languages.
link |
00:14:24.500
So there's no braces.
link |
00:14:26.320
Yeah.
link |
00:14:27.160
So space is used, indentation, I should say,
link |
00:14:31.960
is used as part of the language.
link |
00:14:33.980
Yeah, right.
link |
00:14:35.540
So did you, I mean, that's quite a leap.
link |
00:14:39.980
Were you comfortable with that leap
link |
00:14:41.180
or were you just very open minded?
link |
00:14:42.740
It's a good question.
link |
00:14:43.580
I was open minded, so I was cognizant of the concern.
link |
00:14:48.040
And it definitely has, it has specific challenges.
link |
00:14:52.060
You know, cutting and pasting.
link |
00:14:53.520
For example, when you're cutting and pasting code,
link |
00:14:55.460
and if your editors aren't supportive of that,
link |
00:14:57.220
if you're putting it into a terminal,
link |
00:14:58.980
and particularly in the past when terminals
link |
00:15:01.020
didn't necessarily have the intelligence to manage it.
link |
00:15:03.140
Now, IPython and Jupyter Notebooks
link |
00:15:05.100
handle that just fine, so there's really no problem.
link |
00:15:06.820
But in the past, it created some challenges,
link |
00:15:08.740
formatting challenges, also mixed tabs and spaces.
link |
00:15:12.460
If editors weren't, you weren't clear
link |
00:15:14.740
on what was happening, you would have these issues.
link |
00:15:16.860
So there were really concrete reasons about it
link |
00:15:19.180
that I heard and understood.
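The mixed tabs and spaces problem can be reproduced in a few lines. This is a minimal sketch assuming Python 3, which rejects indentation whose meaning depends on the tab width; the sample source string is invented for illustration.

```python
# The second line is indented with a tab, the third with eight spaces.
# In many editors the two lines look identically indented.
source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(source, "<example>", "exec")
    outcome = "compiled"
except TabError as err:
    # Python 3 refuses to guess how wide a tab is.
    outcome = type(err).__name__

# Converting the tab to spaces makes the same code compile cleanly.
clean_source = source.expandtabs(8)
compile(clean_source, "<example>", "exec")
clean_outcome = "compiled"
```

Editors that insert spaces consistently avoid the issue entirely, which matches the observation that it was mostly a tooling problem.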
link |
00:15:20.400
I never really encountered a problem with it personally.
link |
00:15:23.960
Like, it was occasional annoyances,
link |
00:15:26.480
but I really liked the fact
link |
00:15:28.420
that it didn't have all these extra characters, right?
link |
00:15:31.060
That these extra characters didn't show up
link |
00:15:33.100
in my visual field when I was just trying
link |
00:15:35.420
to process understanding a snippet of code.
link |
00:15:38.000
Yeah, there's a cleanness to it.
link |
00:15:39.260
But, I mean, the idea is supposed to be
link |
00:15:41.140
that Perl also has a cleanness to it
link |
00:15:43.300
because of the minimalism of how many characters
link |
00:15:46.500
it takes to express a certain thing.
link |
00:15:48.420
So it's very compact.
link |
00:15:49.820
But what you realize with that compactness comes,
link |
00:15:53.560
there's a culture that prizes compactness,
link |
00:15:57.100
and so the code gets more and more compact
link |
00:15:58.900
and less and less readable to a point where it's like,
link |
00:16:03.600
like, to be a good programmer in Perl,
link |
00:16:05.420
you write code that's basically unreadable.
link |
00:16:07.820
There's a culture, like.
link |
00:16:09.100
Correct, and you're proud of it.
link |
00:16:10.860
Yeah, you're proud of it.
link |
00:16:12.460
Right, exactly, and it's like, feels good.
link |
00:16:14.140
And it's really selective.
link |
00:16:16.660
It means you have to be an expert in Perl to understand it.
link |
00:16:20.380
Whereas Python allowed you not to have to be an expert.
link |
00:16:22.980
You didn't have to take all this brain energy.
link |
00:16:24.740
You could leverage, what I say,
link |
00:16:25.660
you could leverage your English language center,
link |
00:16:28.180
which you're using all the time.
link |
00:16:29.980
I've wondered about other languages,
link |
00:16:31.180
particularly non Latin based languages.
link |
00:16:34.680
Latin based languages with the characters are at least similar.
link |
00:16:37.220
I think people have an easier time,
link |
00:16:38.620
but I don't know what it's like to be a Japanese
link |
00:16:41.300
or a Chinese person trying to learn different syntax.
link |
00:16:46.900
Like, what would computer programming look like in that?
link |
00:16:49.740
I haven't looked at that at all,
link |
00:16:50.780
but it certainly doesn't,
link |
00:16:52.020
you know, leveraging your Chinese language center,
link |
00:16:54.300
I'm not sure Python or any programming does that.
link |
00:16:57.060
But that was a big deal.
link |
00:16:58.140
The fact that it was accessible, I could be a scientist.
link |
00:17:00.340
What I really liked is many programming languages
link |
00:17:02.900
really demand a lot of you, and you can get a lot,
link |
00:17:04.740
you know, you do a lot if you learn it.
link |
00:17:07.200
But Python enables you to do a lot
link |
00:17:08.900
without demanding a lot of you.
link |
00:17:11.180
There's nuance to that statement,
link |
00:17:13.100
but it certainly was, it's more accessible.
link |
00:17:15.340
So more people could actually, as a scientist,
link |
00:17:18.040
as somebody who, or an engineer,
link |
00:17:19.860
who was trying to solve another problem
link |
00:17:21.460
besides pure programming,
link |
00:17:23.300
I could still use this language and get things done
link |
00:17:26.000
and be happy about it.
link |
00:17:27.340
And I was also comfortable in C at that time.
link |
00:17:30.100
And MATLAB, you did a little bit of that.
link |
00:17:31.340
And MATLAB, I did a lot before that, exactly.
link |
00:17:33.180
So I was comfortable in,
link |
00:17:34.900
those three languages were really the tools I used
link |
00:17:37.580
during my studies and schooling.
link |
00:17:40.540
But to your point about language helping you think,
link |
00:17:42.620
one of the big things about MATLAB was it was,
link |
00:17:44.580
and APL before it, I don't know if you remember APL.
link |
00:17:48.300
APL is actually the predecessor of array based programming,
link |
00:17:51.660
which I think is really an underappreciated,
link |
00:17:54.160
if I talk to people who are just steeped
link |
00:17:55.340
in computer programming, computer science,
link |
00:17:57.640
like most of the people that Microsoft has hired
link |
00:17:59.460
in the past, for example,
link |
00:18:01.140
Microsoft as a company generally did not understand
link |
00:18:03.900
array based programming.
link |
00:18:05.220
Like culturally, they didn't understand it.
link |
00:18:06.620
So they kept missing the boat,
link |
00:18:08.560
kept missing the understanding of what this was.
link |
00:18:11.580
They've gotten better,
link |
00:18:12.740
but there's still a whole culture of folks
link |
00:18:14.420
that doesn't, programming, that's systems programming
link |
00:18:17.980
or web programming or lists and maps.
link |
00:18:20.380
And what about an n dimensional array?
link |
00:18:22.520
Oh yeah, that's just an implementation detail.
link |
00:18:24.700
Well, you can think that,
link |
00:18:26.700
but then actually if you have that as a construct,
link |
00:18:28.800
you actually think differently.
link |
00:18:29.860
APL was the first language to understand that.
link |
00:18:31.660
And it was in the sixties, right?
link |
00:18:33.460
The challenge of APL is APL had very dense,
link |
00:18:36.780
not only glyphs, like new characters, new glyphs,
link |
00:18:39.340
but they even had a new keyboard
link |
00:18:40.480
because to produce those glyphs,
link |
00:18:42.340
this is back in the early days in computing
link |
00:18:43.980
when the QWERTY keyboard maybe wasn't as established,
link |
00:18:47.980
like, well, we can have a new keyboard, no big deal.
link |
00:18:50.780
But it was a big deal and it didn't catch on.
link |
00:18:52.860
And the language APL, very much like Perl,
link |
00:18:56.500
as people would pride themselves on how much,
link |
00:18:58.620
could they write the game of life
link |
00:18:59.740
in 30 characters of APL.
link |
00:19:03.100
APL has characters that mean summation
link |
00:19:06.060
and they have adverbs,
link |
00:19:08.180
they would have adjectives and these things called adverbs,
link |
00:19:10.060
which are like methods, like reduction,
link |
00:19:12.220
reduction would be an adverb on an add operator, right?
link |
00:19:15.320
So, but doing, using these tools you could construct
link |
00:19:18.660
and then you start to think at that level,
link |
00:19:20.880
you think in n dimensions is something I like to say,
link |
00:19:22.900
and you start to think differently about data at that point.
link |
00:19:25.500
Now you're, it really helps.
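The idea of reduction as an "adverb" applied to an add operator survives in NumPy as methods on its universal functions. Here is a minimal sketch using `np.add.reduce`, `np.multiply.reduce`, and `np.add.accumulate`; the sample array is an assumption chosen for illustration.

```python
import numpy as np

a = np.arange(1, 7).reshape(2, 3)   # [[1, 2, 3], [4, 5, 6]]

# "reduce" modifies the add operator: fold addition along an axis.
column_sums = np.add.reduce(a, axis=0)
total = np.add.reduce(a, axis=None)  # fold over every axis at once

# The same modifier works on other operators, such as multiply.
row_products = np.multiply.reduce(a, axis=1)

# "accumulate" is a related modifier that keeps the running results.
running_sums = np.add.accumulate(a, axis=1)
```

The operator and the modifier are chosen independently, so a handful of each covers many whole-array computations.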
link |
00:19:27.500
Yeah, I mean, outside of programming,
link |
00:19:30.100
if you really internalize linear algebra as a course,
link |
00:19:33.700
I mean, it's philosophically allows you
link |
00:19:35.580
to think of the world differently.
link |
00:19:37.220
It's almost like liberating, you don't have to,
link |
00:19:39.700
you don't have to think about the individual numbers
link |
00:19:42.100
in the n dimensional array.
link |
00:19:44.240
You could think of it as an object in itself
link |
00:19:46.140
and all of a sudden this world can open up.
link |
00:19:48.500
You're saying MATLAB and APL were like the early C,
link |
00:19:52.660
I don't know if many languages got that right ever.
link |
00:19:54.980
No, no, no they didn't.
link |
00:19:56.860
Even still.
link |
00:19:57.700
Even still, I would say.
link |
00:19:58.820
I mean, NumPy is an inheritor of the traditions
link |
00:20:02.540
of, I would say, APL. J was another version that,
link |
00:20:06.580
what it did is not have the glyphs,
link |
00:20:08.340
just have short characters,
link |
00:20:09.700
but still a Latin keyboard could type them.
link |
00:20:11.740
And then Numeric inherited from that
link |
00:20:14.540
in terms of let's add arrays plus broadcasting
link |
00:20:17.660
plus methods, reduction,
link |
00:20:19.700
even some of the language like rank is a concept
link |
00:20:21.780
that was in Python and is still in Python
link |
00:20:24.660
for the number of dimensions, right?
link |
00:20:27.180
That's different than say the rank of a matrix
link |
00:20:29.460
which people think of as well.
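The two meanings of "rank", along with broadcasting, can be seen side by side in a short sketch. It assumes modern NumPy, where the number of dimensions is exposed as `ndim` and the linear algebra rank as `np.linalg.matrix_rank`; the sample arrays are illustrative.

```python
import numpy as np

m = np.ones((3, 3))

# "Rank" in the array sense: the number of dimensions.
number_of_dimensions = m.ndim

# "Rank" in the linear algebra sense: the number of independent rows.
matrix_rank = int(np.linalg.matrix_rank(m))

# Broadcasting: a (3, 1) column and a (4,) row combine into a (3, 4) table
# without either operand being copied out to full size by hand.
column = np.arange(3).reshape(3, 1)
row = np.arange(4)
table = column + row
```

A 3 by 3 matrix of ones has two dimensions but only one independent row, so the two notions of rank give different answers for the same object.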
link |
00:20:31.140
So it came from that tradition,
link |
00:20:33.060
but NumPy is a very pragmatic, practical tool.
link |
00:20:37.980
NumPy inherited from Numeric
link |
00:20:39.260
and we can get to where NumPy came from
link |
00:20:40.820
which is the current array,
link |
00:20:43.340
at least current as of 2015, 2017.
link |
00:20:46.100
Now there's a ton of them over the past two or three years.
link |
00:20:49.320
We can get into that too.
link |
00:20:50.320
So if we just linger on the early days
link |
00:20:52.780
of what was your favorite feature of Python?
link |
00:20:56.220
Do you remember like what?
link |
00:20:58.020
So it's so interesting to linger on like the,
link |
00:21:02.260
what really makes you connect with a language?
link |
00:21:06.300
I'm not sure it's obvious to introspect that.
link |
00:21:09.400
No, it isn't.
link |
00:21:10.240
And I've thought about that at some length.
link |
00:21:12.860
I think definitely the fact that I could read it later,
link |
00:21:16.460
that I could use it productively
link |
00:21:18.140
without becoming an expert.
link |
00:21:19.820
Other languages I had to put more effort into.
link |
00:21:22.180
That's like an empirical observation.
link |
00:21:23.940
Like you're not analyzing any one aspect of the language.
link |
00:21:26.500
It just seems time after time when you look back,
link |
00:21:29.460
it's somehow readable.
link |
00:21:30.580
It's somehow readable.
link |
00:21:31.420
Then it was sort of, I could take executable English
link |
00:21:35.380
and translate it to Python more easily.
link |
00:21:36.820
Like I didn't have to go, there was no translation layer.
link |
00:21:39.760
As an engineer or as a scientist,
link |
00:21:41.580
I could think about what I wanted to do.
link |
00:21:43.240
And then the syntax wasn't that far behind it, right?
link |
00:21:46.780
Now there are some warts there still.
link |
00:21:49.220
It wasn't perfect.
link |
00:21:50.600
Like there's some areas where I'm like,
link |
00:21:51.440
ah, it'd be better if this were different
link |
00:21:52.820
or if this were different.
link |
00:21:54.380
Some of those things got added to the language too.
link |
00:21:56.580
I was really grateful for some of the early pioneers
link |
00:21:58.580
in the Python ecosystem back,
link |
00:22:00.220
because Python got written in 91.
link |
00:22:01.900
That's when the first version came out.
link |
00:22:03.140
But Guido was very open to users.
link |
00:22:06.540
And one of the sets of users were people like Jim Hugunin
link |
00:22:08.660
and David Ascher and Paul Dubois and Konrad Hinsen.
link |
00:22:13.460
These were people that were on the main list.
link |
00:22:15.380
And they were just asking for things like,
link |
00:22:16.860
hey, we really should have complex numbers in this language.
link |
00:22:19.220
So let's, you know, there's a j, there's a 1j, right?
link |
00:22:22.540
And the fact that they went the engineering route of J
link |
00:22:24.340
is interesting.
link |
00:22:26.660
I don't think that's entirely favoring engineers.
link |
00:22:28.620
I think it's because i is so often used
link |
00:22:30.460
as the index of a for loop.
link |
00:22:32.100
So I think that's actually why.
link |
00:22:34.260
Probably, I mean, there's a pragmatic aspect.
link |
00:22:36.740
But the fact that complex numbers were there, I love that.
link |
00:22:39.100
The fact that I could write in the array constructs
link |
00:22:41.460
and that reduction was there,
link |
00:22:42.820
very simple to write summations and broadcasting was there.
link |
00:22:46.540
I could do addition of whole arrays.
link |
00:22:49.440
So that was cool.
link |
00:22:50.380
Those are some things I loved about it.
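As a small illustration of the complex-number support described above, the sketch below uses only built-in Python and the standard `cmath` module. The particular values are arbitrary and chosen for readability.

```python
import cmath

z = 3 + 4j                     # `j` marks the imaginary part; `1j` is the imaginary unit
magnitude = abs(z)             # modulus of z
unit_squared = 1j * 1j         # the imaginary unit squared

# A loop index named `i` does not collide with the imaginary unit `1j`.
total = sum(i * 1j for i in range(4))

# Rotating 1 by a quarter turn in the complex plane lands (numerically) on 1j.
rotated = cmath.exp(1j * cmath.pi / 2)

print(z.real, z.imag, magnitude)
print(unit_squared, total, rotated)
```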
link |
00:22:52.660
I don't know what to start talking to you about
link |
00:22:54.820
because you've created so many incredible projects
link |
00:22:57.860
that basically changed the whole landscape of programming.
link |
00:23:00.180
But okay, let's start with,
link |
00:23:02.380
let's go chronologically with SciPy.
link |
00:23:06.060
You created SciPy over two decades ago now?
link |
00:23:09.100
Yes, yes, I love to talk about SciPy.
link |
00:23:11.140
SciPy was really my baby.
link |
00:23:12.980
What is it?
link |
00:23:14.420
What was its goal?
link |
00:23:15.420
What is its goal?
link |
00:23:16.420
How does it work?
link |
00:23:17.260
Yeah, fantastic.
link |
00:23:18.100
So SciPy was effectively, here I am using Python
link |
00:23:21.580
to do stuff that I previously used MATLAB to do.
link |
00:23:24.980
And I was using Numeric, which is an array library
link |
00:23:26.860
that made a lot of it possible.
link |
00:23:28.300
But there's things that were missing.
link |
00:23:29.900
Like I didn't have an ordinary differential equation solver
link |
00:23:32.100
I could just call, right?
link |
00:23:33.460
I didn't have integration.
link |
00:23:35.260
Hey, I wanted to integrate this function.
link |
00:23:37.180
Okay, well, I don't have just a function
link |
00:23:38.780
I can call to do that.
link |
00:23:40.580
These are things I remember being critical things
link |
00:23:42.540
that I was missing.
link |
00:23:43.700
Optimization.
link |
00:23:44.580
I just wanna pass a function to an optimizer
link |
00:23:46.780
and have it tell me what the optimal value is.
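The interface being asked for here, "pass a function in, get a number back", can be sketched in pure Python. This is a minimal illustration, not SciPy's implementation: the helper names `integrate` and `minimize` are hypothetical, and the methods (composite Simpson's rule and golden-section search) are textbook stand-ins chosen for brevity. SciPy today provides routines of this kind in its `scipy.integrate` and `scipy.optimize` modules.

```python
import math

def integrate(f, a, b, n=1000):
    """Composite Simpson's rule for f on [a, b]; n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        # Interior points alternate between weight 4 (odd k) and weight 2 (even k).
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def minimize(f, lo, hi, tol=1e-8):
    """Golden-section search for the minimizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            # The minimum lies in [lo, d]; reuse c as the new upper probe.
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            # The minimum lies in [c, hi]; reuse d as the new lower probe.
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2

area = integrate(math.sin, 0.0, math.pi)                      # exact answer is 2
x_best = minimize(lambda x: (x - 2.0) ** 2 + 1.0, 0.0, 5.0)   # minimum is at x = 2
print(area, x_best)
```

The point of the design is that the caller only supplies a plain Python function; the numerical method stays hidden behind the call.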
link |
00:23:50.100
Those are things I'm like, well,
link |
00:23:51.100
why don't we just write a library that adds these tools?
link |
00:23:54.340
And I started to post on the mailing list
link |
00:23:55.740
and there'd previously been, people have discussed,
link |
00:23:58.100
I remember Konrad Hinsen saying,
link |
00:23:59.140
wouldn't it be great if we had this optimizer library
link |
00:24:00.980
or David Ascher would say this stuff.
link |
00:24:02.580
And I'm an ambitious, ambitious is the wrong word,
link |
00:24:06.940
an eager one, and probably with more time than sense.
link |
00:24:11.340
I was a poor graduate student.
link |
00:24:13.620
My wife thinks I'm working on my PhD and I am,
link |
00:24:15.860
but part of the PhD that I loved
link |
00:24:17.220
was the fact that it's exploratory.
link |
00:24:19.180
You're not just taking orders,
link |
00:24:21.540
fulfilling a list of things to do,
link |
00:24:23.500
you're trying to figure out what to do.
link |
00:24:25.740
And so I thought, well, I'm writing tools
link |
00:24:27.900
for my own use in a PhD,
link |
00:24:29.140
so I'll just start this project.
link |
00:24:32.140
And so in 99, 98 was when I first started
link |
00:24:34.940
to write libraries for Python.
link |
00:24:36.620
Definitely when I fell in love with Python, in 98,
link |
00:24:38.260
I thought, oh, well, there's just a few things missing.
link |
00:24:39.740
Like, oh, I need a reader to read DICOM files.
link |
00:24:42.700
I was in medical imaging and DICOM was a format
link |
00:24:44.580
that I want to be able to load that into Python.
link |
00:24:46.940
Okay, how do I write a reader for that?
link |
00:24:48.180
So I wrote something called, it was an IO package, right?
link |
00:24:51.700
And that was my very first extension module, which is C.
link |
00:24:55.140
So I wrote C code to extend Python
link |
00:24:57.060
so that in Python I could write things more easily.
link |
00:24:59.660
That combination kind of hooked me.
link |
00:25:02.260
It was the idea that I could,
link |
00:25:03.300
here's this powerful tool I can use as a scripting language
link |
00:25:05.700
and a high level language to think about,
link |
00:25:07.460
but that I can extend easily, easily in C,
link |
00:25:11.420
easily for me because I knew enough C.
link |
00:25:13.780
And then Guido had written a link.
link |
00:25:15.260
I mean, the only, the hard part of extending Python
link |
00:25:17.220
was the way memory management works,
link |
00:25:19.500
and you have to do reference counting.
link |
00:25:21.060
And so there's a tracking of reference counting
link |
00:25:23.820
you have to do manually.
link |
00:25:25.500
And if you don't, you have memory leaks.
link |
00:25:27.500
And so that's hard.
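The bookkeeping described here happens in C, where an extension author adjusts counts by hand with the `Py_INCREF` and `Py_DECREF` macros. The counts themselves can be observed from Python, which is what the sketch below does. It assumes CPython, since `sys.getrefcount` reports an implementation detail, and the `Image` class is a hypothetical stand-in for an object an extension module might return.

```python
import sys

class Image:
    """Hypothetical stand-in for an object an extension module might hand back."""

obj = Image()
base = sys.getrefcount(obj)     # baseline; the call itself may hold a temporary reference

holders = [obj, obj]            # two more references, one per list slot
after_add = sys.getrefcount(obj)

holders.clear()                 # both references are released again
after_clear = sys.getrefcount(obj)

# Only differences from the baseline are meaningful here.
print(after_add - base, after_clear - base)
```

In C, forgetting the release step that `holders.clear()` performs automatically here is what produces the memory leaks mentioned above.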
link |
00:25:29.020
Plus then C, you know, it's just much more,
link |
00:25:31.020
you have to put more effort into it.
link |
00:25:32.180
It's not just, I have to now think about pointers
link |
00:25:34.700
and I have to think about stuff that is different.
link |
00:25:37.620
I have to kind of,
link |
00:25:38.460
you're like putting a new cartridge in your brain.
link |
00:25:40.620
Like, okay, I'm thinking about MRI.
link |
00:25:42.380
Now I'm thinking about programming.
link |
00:25:43.580
And there are distinct modules
link |
00:25:45.340
you end up having to think about.
link |
00:25:46.620
So it's harder.
link |
00:25:47.460
And when I was just in Python,
link |
00:25:48.300
I could just think about MRI and high level writing,
link |
00:25:51.500
but I could do that.
link |
00:25:52.340
And that kind of, I liked it.
link |
00:25:54.020
I found that to be enjoyable and fun.
link |
00:25:55.780
And so I ended up, oh,
link |
00:25:57.220
well, let me just add a bunch of stuff to Python
link |
00:25:59.020
to do integration.
link |
00:26:00.580
Well, and the cool thing is,
link |
00:26:01.660
is that the power of the internet,
link |
00:26:03.060
just looking around and I found,
link |
00:26:04.300
oh, there's this Netlib,
link |
00:26:06.300
which has hundreds of Fortran routines
link |
00:26:08.860
that people have written in the 60s and the 70s and the 80s
link |
00:26:12.260
in Fortran 77. Fortunately, it wasn't Fortran 66.
link |
00:26:14.900
So it had been ported to Fortran 77.
link |
00:26:18.100
And Fortran 77 is actually a really great language.
link |
00:26:21.660
Fortran 90 probably is my favorite Fortran
link |
00:26:24.100
because it's also, it's got complex numbers,
link |
00:26:26.100
got arrays and it's pretty high level.
link |
00:26:27.700
Now, the problem with it
link |
00:26:28.980
is you'd never want to write a program in Fortran 90
link |
00:26:31.020
or Fortran 77,
link |
00:26:32.260
but it's totally fine to write a subroutine in, right?
link |
00:26:34.900
And so, and then Fortran kind of got a little off course
link |
00:26:37.660
when they tried to compete with C++.
link |
00:26:39.060
But at the time,
link |
00:26:40.580
I just want libraries to do something like,
link |
00:26:42.340
oh, here's an ordinary differential equation.
link |
00:26:43.940
Here's integration.
link |
00:26:44.900
Here's Runge-Kutta integration.
link |
00:26:46.780
Already done.
link |
00:26:47.620
I don't have to think about that algorithm.
link |
00:26:48.780
I mean, you could,
link |
00:26:49.620
but it's nice to have somebody who's already done one
link |
00:26:51.020
and tested it.
link |
00:26:51.860
And so I sort of started this journey in 98, really.
link |
00:26:55.060
If you look back at the mailing list,
link |
00:26:55.980
there's sort of this productive era of me
link |
00:26:59.660
writing an extension module
link |
00:27:01.100
to connect Runge-Kutta integration to Python
link |
00:27:04.580
and making an ordinary differential equation solver.
link |
00:27:06.660
And then releasing that as a package.
link |
00:27:09.140
So we could call ODEPACK, I think I called it then.
link |
00:27:11.820
QUADPACK.
link |
00:27:12.660
And then I just made these packages.
link |
00:27:14.420
Eventually that became multipack
link |
00:27:16.260
because they're originally modular.
link |
00:27:17.580
You can install them separately.
link |
00:27:19.140
But a massive problem in Python
link |
00:27:20.700
was actually just getting your stuff installed.
link |
00:27:23.420
At the time, releasing software for me,
link |
00:27:25.820
like today it's people think, what does that mean?
link |
00:27:27.580
Well, then it meant some poorly written webpage.
link |
00:27:30.780
I had some bad webpage up and I put a tarball,
link |
00:27:33.100
just a gzipped tarball of source code.
link |
00:27:35.780
That was the release.
link |
00:27:37.140
But okay, can we just stay on that?
link |
00:27:39.180
Because the community aspect
link |
00:27:43.060
of creating the package and sharing that, that's rare.
link |
00:27:47.820
That, to have, to both have the, at that time,
link |
00:27:50.940
so like the raw.
link |
00:27:51.780
Yeah, it was pretty early, yeah.
link |
00:27:52.740
Oh, well, not rare.
link |
00:27:54.660
Maybe you can correct me on this,
link |
00:27:57.020
but it seems like in the scientific community,
link |
00:27:59.660
so many people, you were basically solving the problems
link |
00:28:02.420
you needed to solve to process the particular application,
link |
00:28:07.100
the data that you need.
link |
00:28:08.540
And to also have the mind
link |
00:28:10.900
that I'm going to make this usable for others, that's.
link |
00:28:15.340
I would say I was inspired.
link |
00:28:16.500
I'd been inspired by Linux,
link |
00:28:18.060
been inspired by Linus and him making his code available.
link |
00:28:21.820
And I was starting to use Linux at the time.
link |
00:28:23.260
And I went, this is cool.
link |
00:28:24.460
So I'd kind of been previously primed that way.
link |
00:28:27.060
And generally I was into science
link |
00:28:29.180
because I liked the sharing notion.
link |
00:28:30.980
I liked the idea of, hey, let's,
link |
00:28:32.660
if collectively we build knowledge and share it,
link |
00:28:34.780
we can all be better off.
link |
00:28:35.740
Okay, so you were energized by that idea.
link |
00:28:37.420
So I was energized by that idea already, right?
link |
00:28:39.540
And I can't deny that I was.
link |
00:28:40.940
I sort of had this very,
link |
00:28:42.900
I liked that part of science, that part of sharing.
link |
00:28:45.700
And then all of a sudden, oh, wait, here's something.
link |
00:28:47.300
And here's something I could do.
link |
00:28:49.940
And then I slowly over years learned how to share better
link |
00:28:52.780
so that you could actually engage more people faster.
link |
00:28:55.100
One of the key things was actually giving people a binary
link |
00:28:57.100
they could install, right?
link |
00:28:58.980
So that it wasn't just your source code, good luck.
link |
00:29:01.460
Compile this and then.
link |
00:29:02.660
It's compiled, ready to install, just, you know.
link |
00:29:05.180
So in fact, a lot of the journey from 98,
link |
00:29:07.380
even through 2012 when I started Anaconda was about that.
link |
00:29:10.780
Like it's why, you know, it's really the key
link |
00:29:13.260
as to why a scientist with dreams of doing MRI research
link |
00:29:17.460
ended up starting a software company
link |
00:29:19.500
that installs software.
link |
00:29:22.260
I work with a few folks now that don't program
link |
00:29:26.700
like on the creative side and the video side,
link |
00:29:28.580
the audio side.
link |
00:29:29.620
And because my whole life is running on scripts,
link |
00:29:32.500
I have to try to get them,
link |
00:29:34.020
I'm having the task of teaching them
link |
00:29:35.900
how to do Python enough to run the scripts.
link |
00:29:39.220
And so I've been actually facing this,
link |
00:29:40.820
whether it's with Anaconda or something else, the task of
link |
00:29:44.220
how do I minimally explain basically to my mom
link |
00:29:46.780
how to write a Python script.
link |
00:29:48.900
And it's an interesting challenge.
link |
00:29:50.500
I have to, it's a to do item for me to figure out like,
link |
00:29:53.020
what is the minimal amount of information I have to teach?
link |
00:29:56.340
What are the tools you use that one, you enjoy it,
link |
00:29:59.700
two, you're effective at it.
link |
00:30:00.540
And they're related, those are two related questions.
link |
00:30:02.540
And then the debugging, like the iterative process
link |
00:30:05.500
of running the script to figure out what the error is,
link |
00:30:07.820
maybe even for some people to do the fix yourself.
link |
00:30:11.580
So do you compile it?
link |
00:30:12.660
Do you, like how do you distribute that code to them?
link |
00:30:15.620
And it's interesting because I think
link |
00:30:18.540
it's exactly what you're talking about.
link |
00:30:20.100
If you increase the circle of empathy,
link |
00:30:24.260
the circle of people that are able to use your programs,
link |
00:30:28.900
you increase its, like, effectiveness and its power.
link |
00:30:32.900
And so you have to think, can I write scripts?
link |
00:30:37.020
Can I write programs that can be used by medical engineers,
link |
00:30:40.140
by all kinds of people that don't know programming
link |
00:30:43.900
and actually maybe plant a seed,
link |
00:30:46.900
have them catch the bug of programming
link |
00:30:48.380
so that they start on a journey.
link |
00:30:50.180
That's a huge responsibility.
link |
00:30:51.500
And ultimately it has to do with the Amazon one click buy.
link |
00:30:55.340
Like how frictionless can you make the early steps?
link |
00:30:58.780
Frictionless is actually really key.
link |
00:31:00.380
To grow any community, at any friction point,
link |
00:31:03.020
you're just gonna lose some people, right?
link |
00:31:05.180
Now sometimes you may wanna intentionally do that.
link |
00:31:09.060
If you're early enough on, you need a lot of help.
link |
00:31:11.620
You need people who have the skills.
link |
00:31:13.340
You might actually, it's helpful.
link |
00:31:14.740
You don't necessarily have too many users
link |
00:31:16.820
as opposed to contributors if you're early on.
link |
00:31:20.340
Anyway, there's, SciPy started in 98,
link |
00:31:23.100
but it really emerged as this collection of modules
link |
00:31:25.740
that I was just putting on the net.
link |
00:31:27.340
People were downloading and I think I got 100 users, right?
link |
00:31:31.580
By the end of that year.
link |
00:31:32.660
But the fact that I got 100 users and more than that,
link |
00:31:35.660
people started to email me with fixes.
link |
00:31:39.420
And that was actually intoxicating, right?
link |
00:31:41.300
That was the, here I'm writing papers
link |
00:31:44.220
and I'm giving conferences and I get people to say hello,
link |
00:31:46.180
but yeah, good job.
link |
00:31:47.420
But mostly it was, you're viewed with,
link |
00:31:49.860
it's competitive, right?
link |
00:31:51.540
You publish a paper and people are like,
link |
00:31:52.900
oh, it wasn't my paper.
link |
00:31:55.900
I was starting to see that sense of academic life
link |
00:31:59.220
where it was so much,
link |
00:32:00.180
I thought there was this cooperative effort,
link |
00:32:01.460
but it sounds like we're here just to one up each other.
link |
00:32:04.940
And it's not true across the board,
link |
00:32:07.700
but a lot of that's there.
link |
00:32:08.580
But here in this world,
link |
00:32:09.660
I was getting responses from people all over the world.
link |
00:32:13.700
I remember Pearu Peterson in Estonia, right?
link |
00:32:16.060
Was one of the first people.
link |
00:32:17.340
And he sent me back this makefile,
link |
00:32:18.740
cause the first thing it is, yeah, your build thing stinks
link |
00:32:21.220
and here's a better makefile.
link |
00:32:23.020
Now it was a complex makefile.
link |
00:32:24.380
I don't think I ever understood that makefile, actually,
link |
00:32:26.580
but it worked and it did a lot more.
link |
00:32:29.220
And so I said, thanks, this is cool.
link |
00:32:30.980
And that was my first kind of engagement
link |
00:32:32.500
with community development.
link |
00:32:35.100
But the process was, he sent me a patch file.
link |
00:32:37.660
I had to upload a new tarball.
link |
00:32:39.900
And I just found, I really love that.
link |
00:32:41.580
And the style back then was here's a mailing list.
link |
00:32:43.660
It's very, it wasn't as,
link |
00:32:45.740
there certainly weren't the tools that are available today.
link |
00:32:47.660
It was very early on, but I really started to,
link |
00:32:49.940
that's the whole year.
link |
00:32:50.780
I think I did about seven packages that year, right?
link |
00:32:54.580
And then by the end of the year,
link |
00:32:55.540
I collected them into a thing called multipack.
link |
00:32:57.840
So in 99, there was this thing called multipack.
link |
00:32:59.780
And that's when a high school student,
link |
00:33:01.820
no, he was a high school student at the time,
link |
00:33:03.060
guy named Robert Kern,
link |
00:33:04.780
took that package and made a Windows installer, right?
link |
00:33:09.700
And then of course, a massive increase of usage.
link |
00:33:12.700
So by the way, most of this development was under Linux.
link |
00:33:15.860
Yes, yes, it was on Linux.
link |
00:33:17.380
I was a Linux developer doing it on a Unix box.
link |
00:33:20.240
I mean, at the time I was actually getting into,
link |
00:33:23.020
I had a new hard drive,
link |
00:33:24.060
did some kernel programming to make the hard drive work.
link |
00:33:26.500
I mean, not programming, but modification to the kernel
link |
00:33:28.780
so I could actually get a hard drive working.
link |
00:33:31.180
I love that aspect of it.
link |
00:33:32.320
I was also in, at school, I was building a cluster.
link |
00:33:36.100
I took Mac computers and put Yellow Dog Linux on them.
link |
00:33:40.940
At the Mayo Clinic, they were just,
link |
00:33:42.140
they had all these Macs that were older,
link |
00:33:43.520
they were just getting rid of.
link |
00:33:44.740
And so I kind of got permission to go grab them together.
link |
00:33:46.820
I put about 24 of them together in a cluster, in a cabinet,
link |
00:33:50.340
and put Yellow Dog Linux on them all.
link |
00:33:51.700
And I wrote a C++ program to do MRI simulation.
link |
00:33:56.240
That was what I was doing at the same time
link |
00:33:58.900
for my day job, so to speak.
link |
00:34:01.400
So I was loving the whole process.
link |
00:34:03.460
And the same time I was,
link |
00:34:04.300
oh, I need an ordinary differential equation solver.
link |
00:34:06.260
That's why ordinary differential equations were key
link |
00:34:08.160
was because that's the heart of the Bloch equation
link |
00:34:09.820
for simulating MRI, is an ODE solver.
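To make the connection between the Bloch equation and an ODE solver concrete, here is a minimal pure-Python sketch. It is not the C++ simulator or the Fortran-backed solver described in the conversation. It assumes the rotating-frame form of the Bloch equations with no RF field and a constant off-resonance frequency, uses a hand-written classical Runge-Kutta step, and all names and parameter values (`bloch_rhs`, `rk4_step`, `T1`, `T2`, `D_OMEGA`) are illustrative.

```python
import math

def bloch_rhs(m, d_omega, t1, t2, m0=1.0):
    """Rotating-frame Bloch equations, off-resonance d_omega (rad/s), no RF field."""
    mx, my, mz = m
    return (
        d_omega * my - mx / t2,     # precession plus transverse (T2) decay
        -d_omega * mx - my / t2,
        (m0 - mz) / t1,             # longitudinal (T1) recovery toward m0
    )

def rk4_step(f, m, dt):
    """One classical fourth-order Runge-Kutta step for dm/dt = f(m)."""
    k1 = f(m)
    k2 = f(tuple(mi + 0.5 * dt * ki for mi, ki in zip(m, k1)))
    k3 = f(tuple(mi + 0.5 * dt * ki for mi, ki in zip(m, k2)))
    k4 = f(tuple(mi + dt * ki for mi, ki in zip(m, k3)))
    return tuple(
        mi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for mi, a, b, c, d in zip(m, k1, k2, k3, k4)
    )

T1, T2 = 1.0, 0.1                  # relaxation times in seconds (illustrative values)
D_OMEGA = 2.0 * math.pi * 10.0     # 10 Hz off-resonance, in rad/s
DT, STEPS = 1e-4, 1000             # integrate to t = 0.1 s

m = (1.0, 0.0, 0.0)                # magnetization just after an ideal 90-degree pulse
for _ in range(STEPS):
    m = rk4_step(lambda v: bloch_rhs(v, D_OMEGA, T1, T2), m, DT)

t_end = DT * STEPS
transverse = math.hypot(m[0], m[1])
print(transverse, math.exp(-t_end / T2))      # numerical vs. analytic transverse decay
print(m[2], 1.0 - math.exp(-t_end / T1))      # numerical vs. analytic T1 recovery
```

For this simple case the analytic solution is known, so the printed pairs serve as a sanity check; a general simulation with time-varying fields has no such closed form, which is why a reusable solver matters.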
link |
00:34:12.420
And so that's, but I actually did that,
link |
00:34:15.720
it just happened at the same time.
link |
00:34:16.980
That's why it was kind of what you're working on
link |
00:34:18.540
and what you're interested in, they're coinciding.
link |
00:34:20.500
I was definitely scratching my own itch
link |
00:34:22.380
in terms of building stuff.
link |
00:34:24.060
And which helped in the sense that I was using it for me,
link |
00:34:27.040
so at least I had one user.
link |
00:34:28.540
I had one person who was like, well, no, this is better.
link |
00:34:30.360
I like this interface better.
link |
00:34:31.420
And I had the experience of MATLAB
link |
00:34:33.300
to guide some of what those APIs might look like.
link |
00:34:36.480
But you're just doing it yourself,
link |
00:34:37.720
you're building all this stuff.
link |
00:34:39.000
But with the Windows installer,
link |
00:34:40.060
it was the first time I realized, oh yeah,
link |
00:34:41.460
the binary installer really helps people.
link |
00:34:43.740
And so that led to spending more time
link |
00:34:46.980
on that side of things.
link |
00:34:49.100
So around 2000, so I graduated my PhD in 2000,
link |
00:34:52.780
end of year, end of 2000.
link |
00:34:53.780
So 99 doing a lot of work there,
link |
00:34:56.660
98 doing a lot of work there,
link |
00:34:57.740
99 kind of spending more time on my PhD,
link |
00:35:00.780
helping people use the tools,
link |
00:35:02.420
thinking about where do I want to go from here.
link |
00:35:04.060
There was a company, there was a guy actually,
link |
00:35:05.620
Eric Jones and Travis Vaught.
link |
00:35:07.620
They were two friends who founded a company called Enthought.
link |
00:35:11.080
It's here in Austin, still here.
link |
00:35:13.620
And they, Eric contacted me at the time
link |
00:35:16.060
when I was a graduate student still.
link |
00:35:19.380
And he said, hey, why don't you come down?
link |
00:35:20.860
We want to build a company.
link |
00:35:22.660
We're thinking of a scientific company
link |
00:35:25.720
and we want to take what you're doing
link |
00:35:27.560
and kind of add it to some stuff that he'd done.
link |
00:35:29.460
He'd written some tools.
link |
00:35:31.220
And then Pearu Peterson had done f2py.
link |
00:35:32.820
Let's come together and build,
link |
00:35:34.380
pull this all together and call it SciPy.
link |
00:35:36.740
So that's the origin of the SciPy brand.
link |
00:35:39.480
It came from multipack
link |
00:35:41.380
and a whole bunch of modules I'd written,
link |
00:35:42.580
plus a few things from some other folks
link |
00:35:44.500
and then pulled together in a single installer.
link |
00:35:47.580
SciPy was really a distribution of Python
link |
00:35:49.540
masquerading as a library.
link |
00:35:51.260
How did you think about SciPy in context of Python,
link |
00:35:54.340
in context of Numeric, like what?
link |
00:35:56.180
So we saw SciPy as a way to make an R&D environment
link |
00:35:59.020
for Python, like use Python, depended on Numeric.
link |
00:36:03.380
So Numeric was the array library we depended on.
link |
00:36:05.540
And then from there, extend it with a bunch of modules
link |
00:36:08.260
that allowed for, and at the time,
link |
00:36:10.340
the original vision of SciPy was to have plotting,
link |
00:36:13.180
was to have the REPL environment
link |
00:36:16.140
and kind of really a whole data environment
link |
00:36:19.500
that you could then install and get going with.
link |
00:36:21.020
And that was kind of the thinking.
link |
00:36:23.020
It didn't really evolve that way, right?
link |
00:36:25.020
It sort of had a, for one,
link |
00:36:27.580
it's really hard to do massive scale projects
link |
00:36:31.940
with open source collectives.
link |
00:36:34.300
Actually, there's sort of an intrinsic cooperation limit
link |
00:36:38.500
as to which, too many cooks in the kitchen,
link |
00:36:40.780
you can do amazing infrastructure work.
link |
00:36:42.780
When it comes down to bringing it all together
link |
00:36:44.220
into a single deliverable,
link |
00:36:45.860
that actually requires a little more product management
link |
00:36:49.660
that is not, that doesn't really emerge
link |
00:36:52.820
from the same dynamic.
link |
00:36:53.980
So it struggled, struggled to get almost too many voices.
link |
00:36:57.860
It's hard to have everybody agree.
link |
00:36:59.220
Consensus doesn't really work at that scale.
link |
00:37:02.100
You end up with politics,
link |
00:37:03.260
with the same kind of things that's happened
link |
00:37:05.220
in large organizations trying to decide
link |
00:37:07.100
what to do together.
link |
00:37:09.380
So consensus building was challenging at scale
link |
00:37:12.340
as more people came in, right?
link |
00:37:13.860
Early on, it's fine, because there's nobody there.
link |
00:37:15.700
So it works, but then as you get more successful
link |
00:37:17.740
and more people use it, all of a sudden,
link |
00:37:18.980
oh, there's this scale at which this doesn't work anymore
link |
00:37:22.300
and we have to come up with different approaches.
link |
00:37:23.980
So SciPy came out officially in 2001,
link |
00:37:26.700
was the first release, most of the time.
link |
00:37:28.900
I remember the days of getting that release ready.
link |
00:37:31.060
It was a Windows installer and there were bugs
link |
00:37:33.420
on how the Windows compiler handled complex numbers
link |
00:37:36.300
and you were chasing segmentation faults.
link |
00:37:38.540
And it was, it's a lot of work.
link |
00:37:40.420
There was a lot of effort that had nothing to do
link |
00:37:43.140
with my area of study.
link |
00:37:45.540
And at the same time, I had just gotten an offer.
link |
00:37:47.500
So he wondered if I wanted to come down
link |
00:37:48.780
and help him start that company with his friend.
link |
00:37:51.460
And at the time I was like, I was intrigued,
link |
00:37:53.380
but I was pursuing a path, an academic path.
link |
00:37:56.620
And I had just got an offer to go and teach at my alma mater.
link |
00:37:59.980
So I took that tenure track position.
link |
00:38:02.420
And SciPy, and kind of, then I started to work on SciPy
link |
00:38:05.180
as a professor too.
link |
00:38:07.060
So that's, I left the Mayo Clinic,
link |
00:38:09.540
graduated, wrote my thesis using SciPy,
link |
00:38:11.700
wrote, you know, there's images that were created.
link |
00:38:15.500
Now the plotting tool I used was something
link |
00:38:17.300
from Yorick actually.
link |
00:38:18.660
It was a plotting, a PLT kind of a plotting language
link |
00:38:21.940
that I used.
link |
00:38:22.780
Yorick is a programming language?
link |
00:38:23.940
It was a programming language, had a plotting tool,
link |
00:38:26.340
DISLIN, it had integration to DISLIN.
link |
00:38:28.940
I ended up using DISLIN plus some of the plotting
link |
00:38:31.340
from Yorick linked to from Python.
link |
00:38:33.740
Anyway, it was, people don't plot that way now,
link |
00:38:37.180
but this is before, and SciPy was trying to add plotting.
link |
00:38:40.260
Yeah. Right?
link |
00:38:41.460
It didn't have much success.
link |
00:38:42.580
Really the success of plotting came from John Hunter,
link |
00:38:45.580
who had a similar experience to my experience,
link |
00:38:47.420
my kind of maverick experience as a person
link |
00:38:49.660
just trying to get stuff done and kind of having more time
link |
00:38:51.700
than money maybe, right?
link |
00:38:53.820
And John Hunter created what?
link |
00:38:55.300
Matplotlib.
link |
00:38:56.300
He's the creator of Matplotlib.
link |
00:38:57.140
Yeah, so John Hunter was, you know,
link |
00:38:59.140
he wasn't a student at the time, but he was an,
link |
00:39:00.580
he was working in a quant field and he said,
link |
00:39:02.120
we need better plotting.
link |
00:39:03.500
So he just went out and said, cool, I'll make a new project
link |
00:39:05.540
and we'll call it Matplotlib.
link |
00:39:06.580
And he released in 2001,
link |
00:39:08.260
about the same time that SciPy came out
link |
00:39:09.920
and it was a separate library, separate install,
link |
00:39:12.960
used Numeric; SciPy used Numeric.
link |
00:39:15.540
And so SciPy, you know, in 2001, we released SciPy
link |
00:39:18.980
and then Enthought created a conference called SciPy,
link |
00:39:22.380
which brought people together to talk about the space.
link |
00:39:25.460
And that conference is still ongoing.
link |
00:39:26.700
It's one of the favorite conferences of a lot of people
link |
00:39:28.460
because it's, you know, it's changed over the years,
link |
00:39:30.820
but early on it was, you know, a collection of 50 people
link |
00:39:33.740
who care about, scientists mostly, you know,
link |
00:39:36.700
practicing scientists who want, who care about coding
link |
00:39:39.300
and doing it well and not using MATLAB.
link |
00:39:42.140
And I remember being driven by, you know, I liked MATLAB,
link |
00:39:44.120
but I didn't like the fact that,
link |
00:39:46.420
so I'm not opposed to proprietary software.
link |
00:39:48.060
I'm actually not an open source zealot.
link |
00:39:50.220
I love open source for the, what it brings,
link |
00:39:52.660
but I also see the role for proprietary software.
link |
00:39:54.460
But what I didn't like was the fact that I would develop
link |
00:39:56.580
code and publish it and then effectively telling somebody
link |
00:39:59.940
here to run my code, you have to have
link |
00:40:01.420
this proprietary software.
link |
00:40:02.500
Right, and there's also culture around MATLAB as much,
link |
00:40:05.940
because I've talked to a few folks in,
link |
00:40:08.260
MathWorks creates MATLAB?
link |
00:40:09.820
Yeah.
link |
00:40:10.820
I mean, there's just a culture, they try really hard,
link |
00:40:13.900
but it just, there's this corporate IBM style culture
link |
00:40:16.820
that's like, or whatever.
link |
00:40:18.380
I don't want to say negative things about IBM or whatever,
link |
00:40:20.780
but there's a...
link |
00:40:22.260
No, it's really that connection.
link |
00:40:23.740
It's something I'm in the middle of right now
link |
00:40:24.940
is the business of open source.
link |
00:40:27.000
And how do you connect the ethos of cooperative development
link |
00:40:30.820
with the necessity of creating profits, right?
link |
00:40:34.780
And like right now today, I'm still in the middle of that.
link |
00:40:38.060
That's actually the early days of me exploring this question.
link |
00:40:42.260
Cause I was writing SciPy, I mean, as an aside,
link |
00:40:44.660
I also had, so I had three kids at the time.
link |
00:40:46.540
I have six kids now.
link |
00:40:47.860
I got married early, wanted a family.
link |
00:40:50.860
I had three kids and I remember reading,
link |
00:40:52.620
I read Richard Stallman's post and I was a fan of Stallman.
link |
00:40:55.540
I would read his work, I liked these collective ideas
link |
00:40:58.100
he would have.
link |
00:40:58.940
Certainly the ideas on IP law, I read a lot of his stuff.
link |
00:41:01.740
But then I said, okay, well,
link |
00:41:04.820
how do I make money with this?
link |
00:41:05.780
How do I make a living?
link |
00:41:06.700
How do I pay for my kids?
link |
00:41:07.740
All this stuff was in my mind,
link |
00:41:09.300
young graduate student making no money,
link |
00:41:10.640
thinking I got to get a job.
link |
00:41:12.060
And he said, well, I think just be like me
link |
00:41:14.540
and don't have kids, right?
link |
00:41:15.840
That's just, don't, don't.
link |
00:41:17.080
That's his take on it.
link |
00:41:18.540
That was what he said in that moment, right?
link |
00:41:20.860
That's the thing I read and I went,
link |
00:41:22.420
okay, this is a train I can't get on.
link |
00:41:24.960
There has to be a way to preserve the culture
link |
00:41:26.700
of open source and still be able to make sufficient money
link |
00:41:29.180
to feed your kids.
link |
00:41:30.020
Yes, exactly, there's gotta be.
link |
00:41:31.500
Well, so that actually led me to a study of economics.
link |
00:41:34.500
Because at the time I was ignorant and I really was.
link |
00:41:36.680
I'm actually, I'm embarrassed for the educational system
link |
00:41:39.420
that they could let me and I was valedictorian
link |
00:41:41.300
in my high school class and I did super well in college.
link |
00:41:43.720
And like academically I did great, right?
link |
00:41:47.620
But the fact that I could do that and then be clueless
link |
00:41:49.980
about this key part of life,
link |
00:41:52.740
it led me to go, there's a problem.
link |
00:41:54.400
Like I should have learned this in fifth grade.
link |
00:41:56.660
I should have learned this in eighth grade.
link |
00:41:58.380
Like everybody should come out
link |
00:41:59.220
with a basic knowledge of economics.
link |
00:42:01.700
You're an interesting example because you've created tools
link |
00:42:04.040
that change the lives of probably millions of people
link |
00:42:07.640
and the fact that you don't understand at the time
link |
00:42:10.060
of the creation of those tools, the basic economics
link |
00:42:12.860
of how like to build up a giant system is the problem.
link |
00:42:15.260
Yeah, it's a problem.
link |
00:42:16.100
And so during my PhD at the same time,
link |
00:42:18.260
this is back in 98, 99 at the same time,
link |
00:42:20.720
I was in a library, I was reading books on capitalism,
link |
00:42:23.380
I was reading books on Marxism,
link |
00:42:24.700
I was reading books on what is this thing?
link |
00:42:27.700
What does it mean?
link |
00:42:29.700
And I encountered, basically I encountered a set of writings
link |
00:42:33.140
from people that said they were the inheritors of Adam Smith.
link |
00:42:35.500
Read Adam Smith for the first time, right?
link |
00:42:37.220
Which is The Wealth of Nations
link |
00:42:38.580
and kind of this notion of emergent societies
link |
00:42:42.460
and realized, oh, there's this whole world out here
link |
00:42:45.100
of people and the challenge of economics is also political.
link |
00:42:49.460
Like, cause economics, people, different parties
link |
00:42:53.940
running for office, they want their economic friends.
link |
00:42:58.080
They want their economists to back them up, right?
link |
00:43:00.040
Or to be their magicians, like the magicians
link |
00:43:03.700
in Pharaoh's court, right?
link |
00:43:04.660
The people that kind of say, hey, this is,
link |
00:43:06.260
you should listen to me because I've got the expert
link |
00:43:08.100
who says this.
link |
00:43:09.420
And so it gets really muddled, right?
link |
00:43:11.540
But I was looking at it as a scientist going,
link |
00:43:14.020
what is this space?
link |
00:43:14.860
What does this mean?
link |
00:43:15.680
How does Paris get fed?
link |
00:43:16.940
How does, what is money?
link |
00:43:18.420
How does it work?
link |
00:43:19.420
And I found a lot of writings that I really loved.
link |
00:43:21.580
I found some things that I really loved
link |
00:43:22.860
and I learned from that.
link |
00:43:23.980
It was writings from people like von Mises.
link |
00:43:26.300
He wrote a paper in 1920 that still should be read
link |
00:43:29.060
more than it is.
link |
00:43:29.900
It was the economic calculation problem
link |
00:43:33.060
of the socialist commonwealth.
link |
00:43:34.560
It was basically in response
link |
00:43:35.420
to the Bolshevik revolution in 1917.
link |
00:43:37.140
And his basic argument was it's not gonna work
link |
00:43:40.180
to not have private property.
link |
00:43:41.780
You're not gonna be able to come up with prices.
link |
00:43:43.420
The bureaucrats aren't gonna be able to determine
link |
00:43:45.200
how to allocate resources without a price system.
link |
00:43:47.620
And a price system emerges from people making trades.
link |
00:43:51.700
And they can only make trades if they have authority
link |
00:43:53.860
over the thing they're trading.
link |
00:43:55.460
And that creates information flow
link |
00:43:58.020
that you just don't have if you try to top down it.
link |
00:44:01.300
Right.
link |
00:44:02.140
And it's like, huh, that's a really good point.
link |
00:44:04.780
Yeah, the prices have a signal that's used.
link |
00:44:06.860
And it's important to have that signal
link |
00:44:09.400
when you're trying to build a community
link |
00:44:11.020
of productive people like you would
link |
00:44:12.580
in the software engineering space.
link |
00:44:13.700
Yeah, the prices are actually
link |
00:44:14.860
an important signaling mechanism.
link |
00:44:17.540
Right, and that money is just a bartering tool.
link |
00:44:20.820
Right, so this is the first time I've encountered
link |
00:44:22.540
any of this concept, right, and the fact that,
link |
00:44:24.440
oh, this is actually really critical.
link |
00:44:26.600
Like it's so critical to our prosperity
link |
00:44:29.340
and that we're dangerously not learning about this,
link |
00:44:34.100
not teaching our children about this.
link |
00:44:36.140
So you had the three kids,
link |
00:44:37.260
you had to make some hard decisions.
link |
00:44:38.080
I had to make some money, right, had to figure it out.
link |
00:44:39.880
But I didn't really care.
link |
00:44:40.720
I mean, I've never been driven by money, just need it.
link |
00:44:43.260
Yeah, right, need to eat.
link |
00:44:45.200
So how did that resolve itself in terms of SciPy?
link |
00:44:49.100
So I would say it didn't really resolve itself.
link |
00:44:51.320
It sort of started a journey that I'm continuing on.
link |
00:44:53.420
I'm still on, I would say.
link |
00:44:54.740
I don't think it resolved itself.
link |
00:44:55.660
But I will say I went in eyes wide open.
link |
00:44:59.260
Like I knew that there were problems
link |
00:45:00.940
with giving stuff away and creating the market externalities
link |
00:45:07.900
that the fact that, yeah, people might use it
link |
00:45:09.780
and I might not get paid for it
link |
00:45:10.820
and I'll have to figure something else out to get paid.
link |
00:45:13.060
Like at least I can say I'm not bitter
link |
00:45:14.940
that a lot of people have used stuff that I've written
link |
00:45:17.220
and I haven't necessarily benefited economically from it.
link |
00:45:20.240
I've heard other people be bitter about that
link |
00:45:22.300
when they write or they talk.
link |
00:45:23.300
Like, oh, I should've got more value out of this.
link |
00:45:24.900
And I'm also, I want to create systems
link |
00:45:27.740
that let people like me who might have these desires
link |
00:45:31.060
to do things, let them benefit.
link |
00:45:32.260
So it actually creates more of the same.
link |
00:45:34.700
Not to turn on your bitterness module,
link |
00:45:36.900
but there's some aspect, I wish there was mechanisms for me
link |
00:45:40.940
to reward whoever created SciPy and NumPy
link |
00:45:43.580
because it brought so much joy to my life.
link |
00:45:45.300
I appreciate that.
link |
00:45:46.140
You know what I mean?
link |
00:45:46.980
The tip jar notion was there.
link |
00:45:48.340
I appreciate that.
link |
00:45:49.180
But there should be a very frictionless mechanism.
link |
00:45:51.940
There should be a frictionless mechanism.
link |
00:45:52.760
I totally agree.
link |
00:45:53.600
I would love to talk about some of the ideas I have
link |
00:45:55.220
because I actually came across,
link |
00:45:56.220
I think I've come up with some interesting notions
link |
00:45:58.200
that could work, but they'll require anything that will work
link |
00:46:01.860
takes time to emerge, right?
link |
00:46:03.740
Like things don't just turn overnight.
link |
00:46:04.940
That's definitely one thing I've also understood
link |
00:46:06.340
and learned is any fixes, that's why it's kind of funny.
link |
00:46:10.120
We often give credit to, oh, this president gets elected
link |
00:46:12.940
and oh, look how great things have done.
link |
00:46:14.420
And I saw that when I had a transition at Anaconda
link |
00:46:18.340
when a new CEO came in, right?
link |
00:46:19.520
And it's like the success that's happening,
link |
00:46:22.340
there's an inertia there.
link |
00:46:23.460
Yeah, and sometimes the decision you made
link |
00:46:25.740
like 10 years before is the reason why the success is there.
link |
00:46:28.980
Right, exactly.
link |
00:46:29.820
So we're sort of just running around taking credit
link |
00:46:31.560
for stuff.
link |
00:46:32.400
The credit assignment has like a delay to it
link |
00:46:35.140
that makes the credit assignment basically wrong
link |
00:46:38.320
more than right.
link |
00:46:39.160
Wrong more than right, exactly.
link |
00:46:40.320
And so I'm like, oh, this is, you know,
link |
00:46:42.140
that's the stuff I would read a ton about, you know,
link |
00:46:44.860
early on.
link |
00:46:45.700
So I don't, I feel like I'm with you.
link |
00:46:47.720
Like I want the same thing.
link |
00:46:48.780
I want to be able to, and honestly, not for personally,
link |
00:46:50.900
I've been happy.
link |
00:46:51.740
I've been happy.
link |
00:46:52.720
I feel like I don't have any, I mean,
link |
00:46:53.980
we've done reasonably okay, but I've had to pursue it.
link |
00:46:56.920
Like that's really what started my trajectory from academia
link |
00:47:01.380
is reading that stuff led me to say,
link |
00:47:02.940
oh, entrepreneurship matters.
link |
00:47:05.780
So I love software, but we need more entrepreneurs
link |
00:47:09.180
and I wanna understand that better.
link |
00:47:10.360
So once I kind of had that virus infect my brain,
link |
00:47:16.500
even though I was on a trajectory
link |
00:47:17.580
to go to a tenure track position at a university
link |
00:47:20.640
and I was there for six years,
link |
00:47:22.780
I was kind of already out the door when I started.
link |
00:47:26.060
And we can get into that, but.
link |
00:47:27.660
Well, can I just ask you a quick question on,
link |
00:47:30.340
is there some design principles
link |
00:47:32.740
that were in your mind around SciPy?
link |
00:47:34.740
Like, is there some key ideas
link |
00:47:36.460
that were just like sticking to you
link |
00:47:38.060
that this is the fundamental ideas?
link |
00:47:40.300
Yeah, I would say so.
link |
00:47:41.140
I would think it's basically accessibility to scientists,
link |
00:47:43.680
like give them, give scientists and engineers tools
link |
00:47:46.980
that they don't have to think a lot about programming.
link |
00:47:48.380
So give them really good building blocks,
link |
00:47:50.300
give them functions that they wanna call
link |
00:47:51.860
and sort of just the right length of spelling.
link |
00:47:55.860
There's one tradition in programming where it's like,
link |
00:47:59.500
make very, very long names, right?
link |
00:48:01.880
And you can see it in some programming languages
link |
00:48:03.700
where the names get, take half the screen.
link |
00:48:06.460
And in the Fortran world, characters had to be six letters
link |
00:48:11.540
early on, right?
link |
00:48:12.380
And that's way too much, too little.
link |
00:48:14.340
But I was like, I liked to have names
link |
00:48:16.820
that were informative but short.
link |
00:48:18.940
So even though Python, well this is a different conversation,
link |
00:48:22.340
but documentation is doing some work there.
link |
00:48:25.860
So when you look at great scientific libraries
link |
00:48:29.180
and functions, there's a richness of documentation
link |
00:48:32.700
that helps you get into the details.
link |
00:48:34.820
The first glance at a function gives you the intuition
link |
00:48:37.620
of all it needs to do by looking at the headers and so on.
link |
00:48:40.540
But to get the depths of all the complexities involved,
link |
00:48:43.420
all the options involved,
link |
00:48:44.740
documentation does some of the work.
link |
00:48:45.580
Documentation is essential, yeah.
link |
00:48:47.380
So that was actually a, so we thought about several things.
link |
00:48:50.520
One is we wanted plotting.
link |
00:48:51.940
We wanted interactive environment.
link |
00:48:53.580
We wanted good documentation.
link |
00:48:54.860
These are things we knew we wanted.
link |
00:48:56.780
The reality is those took about 10 years to evolve, right?
link |
00:49:00.460
Given the fact that we didn't have a big budget,
link |
00:49:02.060
it was all volunteer labor.
link |
00:49:03.100
It was sort of, when Enthought got created
link |
00:49:06.980
and they started to try to find projects,
link |
00:49:10.060
people would pay for pieces
link |
00:49:11.080
and they were able to fund some of it.
link |
00:49:13.740
Not nearly enough to keep up with what was necessary.
link |
00:49:15.780
And no criticism, just simply the reality.
link |
00:49:18.860
I mean, it's hard to start a business
link |
00:49:21.180
and then do consulting and then also
link |
00:49:23.220
promote an open source project that's still fairly new.
link |
00:49:26.180
SciPy is fairly niche.
link |
00:49:27.780
We stayed connected all while I was a student,
link |
00:49:30.140
sorry, a professor.
link |
00:49:30.980
I went to BYU and started to teach.
link |
00:49:32.340
Electrical engineering, all the applied math courses.
link |
00:49:35.060
I loved teaching signal processing,
link |
00:49:36.980
probability theory, electromagnetism.
link |
00:49:39.180
I was, if you look at Rate My Professors,
link |
00:49:40.940
which my kids loved to do,
link |
00:49:42.500
I wasn't, I got some bad reviews because people.
link |
00:49:46.900
What was the criticism?
link |
00:49:48.580
I would speak too high of a level.
link |
00:49:50.920
Like I definitely had a calibration problem
link |
00:49:52.640
coming out of graduate work
link |
00:49:54.980
where I hate to be condescending to people.
link |
00:49:56.980
Like I really have a ton of respect for people fundamentally.
link |
00:49:59.300
Like my fundamental thing is I respect people.
link |
00:50:02.060
Sometimes that can lead to a,
link |
00:50:03.900
I was thinking they had more knowledge than they did.
link |
00:50:07.640
And so I would just speak at a very high level,
link |
00:50:10.100
assume they got it.
link |
00:50:11.060
But they need to rise to the standard that you set.
link |
00:50:14.340
I mean, that's one of the,
link |
00:50:15.260
some of the greatest teachers do that.
link |
00:50:17.180
And I agree.
link |
00:50:18.020
And that was kind of what was inspiring me.
link |
00:50:19.760
But you also have to,
link |
00:50:22.160
I cannot say I was articulate
link |
00:50:24.820
with some of the greatest teachers, right?
link |
00:50:26.300
I was, like one classic example,
link |
00:50:28.540
when I first taught at BYU,
link |
00:50:30.420
my very first class, it was overheads,
link |
00:50:31.980
transparencies, overheads.
link |
00:50:34.100
Before projectors were really that common,
link |
00:50:35.940
I taught transparencies.
link |
00:50:37.100
I'm writing my notes out.
link |
00:50:38.260
I go in, room's half dark.
link |
00:50:40.540
I'm just blaring through these transparencies.
link |
00:50:42.780
Here it is, here it is, here it is.
link |
00:50:44.900
And I did give a quiz after two weeks.
link |
00:50:47.480
No one knew anything.
link |
00:50:48.900
Nothing I had taught had gotten anywhere.
link |
00:50:50.940
And I realized, okay, I'm not, this is not working.
link |
00:50:54.140
So I put away the transparencies
link |
00:50:56.380
and I turned around and just started using the chalkboard.
link |
00:50:58.860
And what it did is it slowed me down, right?
link |
00:51:00.980
The chalkboard just slowed me down
link |
00:51:02.260
and gave people time to process and to think.
link |
00:51:04.440
And then that made me focus.
link |
00:51:06.080
My writing wasn't great on the chalkboard,
link |
00:51:07.900
but I really love that part of like the teaching.
link |
00:51:10.520
So that entered SciPy's world in terms of,
link |
00:51:12.500
we always understood that there's a didactic aspect
link |
00:51:14.860
of SciPy, kind of how do you take the knowledge
link |
00:51:17.740
and then produce it?
link |
00:51:18.640
The challenge we had was the scope.
link |
00:51:21.020
Like ultimately SciPy was everything, right?
link |
00:51:23.420
And so 2001, when it first came out,
link |
00:51:25.600
people were starting to use it.
link |
00:51:26.800
Oh, this is cool, this is a tool we actually use.
link |
00:51:29.580
At the same time, 2001 timeframe,
link |
00:51:31.400
there was a little bit of like the Hubble Space Telescope,
link |
00:51:33.940
the folks at Hubble that started to say,
link |
00:51:35.400
hey, Python, we're gonna use Python
link |
00:51:36.620
for processing images from Hubble.
link |
00:51:38.720
And so Perry Greenfield was a good friend
link |
00:51:40.820
in running that program.
link |
00:51:42.420
And he had called me before I left for BYU and said,
link |
00:51:45.060
you know, we wanna do this,
link |
00:51:47.020
but Numeric actually has some challenges in terms of,
link |
00:51:50.020
you know, it's not, the array doesn't have enough types.
link |
00:51:52.700
We need more operations.
link |
00:51:54.280
You know, broadcasting needs to be a little more settled.
link |
00:51:56.660
They wanted record arrays.
link |
00:51:57.960
They wanted, you know, record arrays are like a data frame,
link |
00:52:00.600
but a little bit different,
link |
00:52:02.220
but they wanted more structured data.
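To make the record array idea concrete, here is a minimal sketch using present-day NumPy rather than the Numeric or NumArray code of that era. The field names and values are invented for illustration and are not taken from any Hubble pipeline.

```python
import numpy as np

# Hypothetical observation table: field names and values are invented.
obs_dtype = np.dtype(
    [("target", "U10"), ("exposure_s", "f8"), ("filter_id", "i4")]
)

observations = np.rec.array(
    [("M31", 120.0, 2), ("M42", 45.5, 1), ("M87", 300.0, 2)],
    dtype=obs_dtype,
)

# Each field is reachable as an attribute and behaves like a typed array.
long_exposures = observations[observations.exposure_s > 100.0]
total_exposure = observations.exposure_s.sum()
```

Each row mixes a string, a float, and an integer, which is the "more structured data" a single homogeneous array type cannot hold.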
link |
00:52:03.820
So he had called me even early on then,
link |
00:52:06.020
and he said, you know, what,
link |
00:52:06.860
would you wanna work on something to make this work?
link |
00:52:08.300
And I said, yeah, I'm interested, but I'm going here,
link |
00:52:10.140
and I, you know, we'll see if I have time.
link |
00:52:12.100
So in the meantime, while I was teaching
link |
00:52:13.340
and SciPy was emerging, and I had a student,
link |
00:52:15.660
I was constantly, while I was teaching,
link |
00:52:16.840
trying to figure a way to fund this stuff.
link |
00:52:18.840
So I had a graduate student, my only graduate student,
link |
00:52:21.660
a Chinese fellow, Liu Hongze is his name, great guy.
link |
00:52:26.260
He wrote a bunch of stuff for iterative linear algebra,
link |
00:52:29.900
like got into writing some of the iterative
link |
00:52:31.380
linear algebra tools that are currently there in SciPy,
link |
00:52:34.340
and they've gotten better since,
link |
00:52:36.040
but this is in 2005, kept working on SciPy,
link |
00:52:39.260
but Perry had started working on a replacement
link |
00:52:43.060
to Numeric called NumArray.
link |
00:52:45.300
And in 2004, a package called ndimage,
link |
00:52:49.020
it was an image processing library
link |
00:52:50.740
that was written for NumArray,
link |
00:52:53.220
and it had in it a morphology tool.
link |
00:52:55.580
I don't know if you know what morphology is.
link |
00:52:56.740
It's opening, dilation, closing, you know,
link |
00:52:58.540
there was sort of this, as a medical imaging student,
link |
00:53:01.660
I knew what it was,
link |
00:53:02.500
because it was used in segmentation a lot.
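For readers who have not met morphology, the operations named here can be sketched with the `scipy.ndimage` module as it exists in SciPy today. The tiny test image is made up, and the default cross-shaped structuring element is assumed.

```python
import numpy as np
from scipy import ndimage

# Invented binary image: a 5x5 square with a one-pixel hole,
# plus an isolated speck in the top-right corner.
image = np.zeros((9, 9), dtype=bool)
image[2:7, 2:7] = True
image[4, 4] = False      # hole inside the square
image[0, 8] = True       # isolated noise pixel

# Dilation grows every foreground region by one pixel.
dilated = ndimage.binary_dilation(image)

# Opening (erosion then dilation) removes small specks.
opened = ndimage.binary_opening(image)

# Closing (dilation then erosion) fills small holes.
closed = ndimage.binary_closing(image)
```

In segmentation work, opening is typically used to drop noise from a mask and closing to fill small gaps in it.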
link |
00:53:04.420
And in fact, I'd wanted to do something like that
link |
00:53:06.460
in Python, in SciPy, but just had never gotten around to it.
link |
00:53:10.220
So when it came out, but it worked only on NumArray,
link |
00:53:14.180
and SciPy needed Numeric,
link |
00:53:16.420
and so we effectively had the beginning of this split.
link |
00:53:20.040
And Numeric and NumArray didn't share data,
link |
00:53:22.500
they were just two, so you could have a gigabyte
link |
00:53:24.420
of NumArray data and a gigabyte of Numeric data,
link |
00:53:26.540
and they wouldn't share it.
link |
00:53:27.380
And so you had these,
link |
00:53:28.500
then you had these scientific libraries written on top.
link |
00:53:31.300
I got really bugged by that.
link |
00:53:32.940
I got really like, oh man, this is not good,
link |
00:53:35.060
we're not cooperating now,
link |
00:53:36.300
we're sort of redoing each other's work,
link |
00:53:37.980
and we're just this young community.
link |
00:53:40.380
So that's what led me, even though I knew it was risky,
link |
00:53:43.940
because my, you know, I was on a tenure track position,
link |
00:53:47.140
2004 I got reviewed.
link |
00:53:48.540
They said, hey, things are going okay,
link |
00:53:49.540
you're doing well, papers coming out,
link |
00:53:51.540
but you're kind of spending a lot of time
link |
00:53:52.460
doing this open source stuff, maybe do a little less of that,
link |
00:53:54.780
and a little more of the paper writing and grant writing,
link |
00:53:57.260
which was naive, but it was definitely the thinking.
link |
00:54:00.860
It still goes on.
link |
00:54:01.700
Still goes on.
link |
00:54:03.060
You're basically creating a thing
link |
00:54:05.120
which enables science in the 21st century.
link |
00:54:08.300
Right.
link |
00:54:09.340
Maybe don't emphasize that so much in your three-year tenure.
link |
00:54:11.980
Right.
link |
00:54:13.460
It illustrates some of the challenges.
link |
00:54:14.860
Yes.
link |
00:54:15.700
It does, and it's, people mean well.
link |
00:54:18.220
Yes.
link |
00:54:19.060
Like, but we've gotten broken in a bunch of ways.
link |
00:54:22.340
Certain things, programming,
link |
00:54:23.660
understanding the role of software engineering,
link |
00:54:25.500
programming in society is a little bit lacking.
link |
00:54:27.860
Exactly.
link |
00:54:28.700
Now, I was in an electrical engineering position.
link |
00:54:30.020
Right.
link |
00:54:30.860
That's even worse there.
link |
00:54:33.140
Yeah, it was very, they were very focused,
link |
00:54:34.700
and so, you know, good people, and I had a great time,
link |
00:54:37.300
I loved my time, I loved my teaching,
link |
00:54:38.940
I loved all the things I did there.
link |
00:54:40.460
The problem was, the split was happening
link |
00:54:42.540
in this community that I loved, right?
link |
00:54:43.940
I saw people, and I went, oh my gosh,
link |
00:54:45.460
this is gonna be, this is not great,
link |
00:54:47.780
and so I happened, you know, fate,
link |
00:54:50.020
I had a class I had signed up for,
link |
00:54:52.620
it's a, I was trying to build an MRI system,
link |
00:54:54.860
so I had a kind of a radio, instead of a radio,
link |
00:54:58.300
a digital radio class, it was a digital MRI class.
link |
00:55:01.820
And I had people sign up, two people signed up,
link |
00:55:04.020
then they dropped, and so I had nobody in this class.
link |
00:55:06.660
So, and I didn't have any other courses to teach,
link |
00:55:08.820
and I thought, oh, I've got some time,
link |
00:55:10.940
and I'll just write, I'll just write a replace,
link |
00:55:13.100
a merger of Numeric and NumArray.
link |
00:55:14.820
Like, I'll basically take the Numeric code base,
link |
00:55:16.980
add the features NumArray was adding,
link |
00:55:19.220
and then kind of come up with a single array library
link |
00:55:21.180
that everybody can use.
link |
00:55:22.460
So that's where NumPy came from,
link |
00:55:24.140
was my thinking, hey, I can do this,
link |
00:55:26.500
and who else is going to?
link |
00:55:27.860
Because at that point, I'd been around the community
link |
00:55:29.260
long enough, and I'd written enough C code,
link |
00:55:30.820
I knew, I knew the structures, and I,
link |
00:55:33.260
in fact, my first contribution to Numeric
link |
00:55:35.060
had been writing the C API documentation
link |
00:55:38.580
that went in the first documentation for NumPy,
link |
00:55:41.080
for Numeric, sorry, this is Paul Dubois,
link |
00:55:43.020
David Ascher, Konrad Hinsen, and myself.
link |
00:55:45.100
I got credit because I wrote this chapter,
link |
00:55:47.580
which is all the C API of Numeric, all the C stuff.
link |
00:55:51.260
So I said, I'm probably the one to do it,
link |
00:55:53.380
and nobody else is gonna do this.
link |
00:55:54.760
So it was sort of, out of a sense of duty and passion,
link |
00:55:58.340
knowing that, eh, I don't think my academic,
link |
00:56:01.460
I don't think the department here is gonna appreciate this,
link |
00:56:03.940
but it's the right thing to do.
link |
00:56:06.020
It was like.
link |
00:56:06.860
Can we just linger on that moment?
link |
00:56:08.660
Yeah, yeah.
link |
00:56:09.500
Because the importance of the way you thought
link |
00:56:11.740
and the action you took, I feel is understated
link |
00:56:16.360
and is rare and I would love to see so much more of it
link |
00:56:19.900
because what happens as the tools become more popular,
link |
00:56:24.820
there's a split that happens.
link |
00:56:27.180
And it's a truly heroic and impactful action
link |
00:56:30.940
to in those early, in that early split,
link |
00:56:33.580
to step up and it's like great leaders throughout history,
link |
00:56:37.820
like get, what is the brave heart,
link |
00:56:39.660
like get on a horse and rally the troops
link |
00:56:42.500
because I think that can have, make a big difference.
link |
00:56:46.060
We have TensorFlow versus PyTorch
link |
00:56:48.180
in the machine learning community.
link |
00:56:49.100
We have the same problem today.
link |
00:56:50.380
Yeah, I wonder.
link |
00:56:51.780
It's actually bigger.
link |
00:56:52.620
I wonder if it's possible in the early days
link |
00:56:56.620
to rally the troops.
link |
00:56:58.220
It is possible, especially in the early days.
link |
00:57:00.020
The longer it goes, the harder, right?
link |
00:57:01.620
The more energy in the factions, the harder.
link |
00:57:03.940
But in the early days, it is possible
link |
00:57:05.700
and it's extremely helpful
link |
00:57:07.660
and there's a willingness there,
link |
00:57:09.100
but the challenge is there's just not a willingness
link |
00:57:11.740
to fund it.
link |
00:57:12.980
There's not a willingness to, you know,
link |
00:57:14.880
like I was literally walking into a field
link |
00:57:17.540
saying I'm going to do this
link |
00:57:18.620
and here I am, like, you know,
link |
00:57:20.140
I have five kids at home now.
link |
00:57:23.740
Pressure builds.
link |
00:57:24.820
Sometimes my wife hears these stories
link |
00:57:26.220
and she's like, you did what?
link |
00:57:29.020
I thought we were going to,
link |
00:57:29.860
I thought you were actually on a path
link |
00:57:31.460
to make sure we had resources and money, but,
link |
00:57:34.100
but again, there's a, there's an aspect,
link |
00:57:36.420
I'm a very hopeful person.
link |
00:57:37.860
I'm an optimistic person by nature.
link |
00:57:39.680
I love people.
link |
00:57:41.120
I learned that about myself later on.
link |
00:57:43.140
And part of my, my religious beliefs
link |
00:57:47.220
actually lead to that.
link |
00:57:48.380
And it's why I hold them dear
link |
00:57:49.880
because it's actually how I feel about,
link |
00:57:51.300
that's what leads me to these attitudes,
link |
00:57:53.420
sort of this hopefulness and this sense of,
link |
00:57:55.900
yeah, it may not work out for me financially
link |
00:57:58.600
or maybe, but that's not the ultimate gain.
link |
00:58:00.600
Like that's a thing, but it's not,
link |
00:58:02.940
that's not the scorecard for me.
link |
00:58:05.540
And so I just wanted to be helpful
link |
00:58:07.060
and I knew, and partly because these SciPy conferences,
link |
00:58:09.280
because the maintenance conversations,
link |
00:58:10.860
I knew there was a lot of need for this, right?
link |
00:58:13.300
And so I had this, it wasn't like I was alone
link |
00:58:15.460
in terms of no feedback.
link |
00:58:16.460
I had these people who knew, but it was crazy.
link |
00:58:19.440
Like people who at the time said,
link |
00:58:20.700
yeah, we didn't think you'd be able to do it.
link |
00:58:22.340
We thought it was crazy.
link |
00:58:23.160
And also instructive, like practically speaking,
link |
00:58:26.720
that you had a cool feature
link |
00:58:28.700
that you were chasing the morphology, like the.
link |
00:58:30.820
Yes.
link |
00:58:31.660
Like it's not just like.
link |
00:58:32.500
There's an end result.
link |
00:58:33.460
It's not some visionary thing.
link |
00:58:35.140
I'm going to unite the community.
link |
00:58:36.820
You were like. Correct.
link |
00:58:38.060
You were actually practically,
link |
00:58:39.520
this is what one person actually could do
link |
00:58:42.100
and actually build.
link |
00:58:43.220
Cause that is important.
link |
00:58:44.220
Cause you can get over your skis.
link |
00:58:47.460
You can definitely get over your skis.
link |
00:58:49.060
And I had, in fact, this almost got me over my skis, right?
link |
00:58:52.140
I would say, well, in retrospect, I hate looking back.
link |
00:58:56.140
I can tell you all the flaws with NumPy, right?
link |
00:58:58.540
When I go into it, there's lots of stuff that I'm like,
link |
00:59:00.700
oh man, that's embarrassing.
link |
00:59:01.660
That was wrong.
link |
00:59:02.500
I wish I had somebody stop me with a wet fish there.
link |
00:59:04.300
Like I needed, like what I'd wished I'd had
link |
00:59:07.020
was somebody with more experience and certainly library
link |
00:59:10.460
writing and array library.
link |
00:59:11.540
There's like, I wish I had me.
link |
00:59:12.780
I could go back in time and go do this, do that.
link |
00:59:14.520
There's a more important thing.
link |
00:59:15.480
Cause there's things we did that are still there
link |
00:59:18.100
that are problematic, that created challenges for later.
link |
00:59:20.940
And I didn't know it at the time.
link |
00:59:22.460
Didn't understand how important that was.
link |
00:59:24.420
And in many cases, didn't know what to do.
link |
00:59:26.460
Like there was pieces of the design of NumPy.
link |
00:59:29.060
I didn't know what to do until five years ago.
link |
00:59:31.340
Now I know what they should have been.
link |
00:59:32.860
But I didn't know at the time and nobody,
link |
00:59:33.960
and I couldn't get the help.
link |
00:59:35.380
Anyway, so I wrote it.
link |
00:59:36.660
It took about, it took four months to write
link |
00:59:38.780
the first version, then about 14 months to make it usable.
link |
00:59:43.360
But it was, it wasn't, it was that first four months
link |
00:59:45.860
of intense writing, coding, getting something out the door
link |
00:59:49.320
that worked that was, it was, it was definitely challenging.
link |
00:59:52.380
And then the big thing I did was create a new type object
link |
00:59:54.900
called dtype.
link |
00:59:56.100
That was probably the contribution.
link |
00:59:58.780
And then the fact that I added broad, not just broadcasting,
link |
01:00:01.900
but advanced indexing so that you could do masked indexing
link |
01:00:06.500
and indirect indexing instead of just slicing.
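[Editor's illustration] The indexing styles named here can be sketched in a few lines. This is a minimal example using present-day NumPy, not code from the original release; the array values and variable names are made up for illustration.

```python
import numpy as np

a = np.arange(10) * 10           # [0, 10, 20, ..., 90]

sliced = a[2:5]                  # plain slicing: elements at positions 2, 3, 4
masked = a[a > 60]               # masked (boolean) indexing: keep where the mask is True
picked = a[[0, 3, 3, 9]]         # indirect (integer-array) indexing: pick by position

# Broadcasting: a (3, 1) column and a (4,) row combine into a (3, 4) grid.
grid = np.arange(3).reshape(3, 1) + np.arange(4)

print(sliced, masked, picked, grid.shape)
```

Slicing returns a view onto the original data, while the masked and indirect forms return new arrays.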
link |
01:00:09.940
So for people who don't know, and maybe you can elaborate,
link |
01:00:13.020
NumPy, I guess the vision in the narrowest sense
link |
01:00:17.660
is to have this object that represents
link |
01:00:21.460
n dimensional arrays.
link |
01:00:23.180
And like at any level of abstraction you want,
link |
01:00:26.300
but basically it could be a black box
link |
01:00:28.220
that you can investigate in ways that you would naturally
link |
01:00:30.940
want to investigate such objects.
link |
01:00:33.340
Yes, exactly.
link |
01:00:34.180
So you could do math on it easily.
link |
01:00:35.740
Math on it easily, yeah.
link |
01:00:37.180
So it had an associated library of math operations
link |
01:00:39.860
and effectively SciPy became an even larger set
link |
01:00:43.220
of math operations.
link |
01:00:44.940
So the key for me was I was going to write NumPy
link |
01:00:48.020
and then move SciPy to depend on NumPy.
link |
01:00:50.340
In fact, early on, one of the initial proposals
link |
01:00:52.980
was that we would just write SciPy
link |
01:00:54.540
and it would have the Numeric object inside of it.
link |
01:00:56.660
And it'd be SciPy.array or something.
link |
01:00:59.780
That turned out to be problematic because Numeric
link |
01:01:02.180
already had a little mini library of linear algebra
link |
01:01:04.820
and some functions, and it had enough momentum,
link |
01:01:08.020
enough users that nobody wanted to,
link |
01:01:10.340
they wanted backward compatibility.
link |
01:01:12.060
One of the big challenges of NumPy
link |
01:01:13.740
was I had to be backward compatible
link |
01:01:14.980
with both Numeric and Numarray
link |
01:01:16.980
in order to allow both of those communities to come together.
link |
01:01:19.300
There was a ton of work in creating
link |
01:01:21.140
that backward compatibility
link |
01:01:22.580
that also created echoes in today's object.
link |
01:01:25.420
Like some of the complexity in today's object
link |
01:01:27.180
is actually from that goal of backward compatibility
link |
01:01:30.060
to these other communities,
link |
01:01:31.380
which if you didn't have that, you'd do something different,
link |
01:01:34.620
which is instructive because a lot of things are there.
link |
01:01:37.740
You think, what is that there for?
link |
01:01:38.940
It's like, well, it's a remnant.
link |
01:01:41.380
It's an artifact of its historical existence.
link |
01:01:45.220
By the way, I love the empathy
link |
01:01:46.780
and the lack of ego behind that
link |
01:01:48.460
because I feel, you see that in the split
link |
01:01:51.420
in the JavaScript framework, for example,
link |
01:01:53.340
the arbitrary branching.
link |
01:01:54.860
Right.
link |
01:01:56.980
I think in order to unite people,
link |
01:01:59.020
you have to kind of put your ego aside
link |
01:02:00.620
and truly listen to others.
link |
01:02:02.260
You do.
link |
01:02:03.100
What do you love about Numarray?
link |
01:02:04.820
What do you love about Numeric?
link |
01:02:06.020
Like actually get a sense,
link |
01:02:07.460
we were talking about languages earlier,
link |
01:02:08.860
sort of empathize to the culture,
link |
01:02:11.100
the people that love something about this particular API,
link |
01:02:14.660
some of the naming style
link |
01:02:18.100
or the actual usage patterns
link |
01:02:21.220
and truly understand them
link |
01:02:22.820
and so that you can create that same draw
link |
01:02:26.780
in the united thing. I completely agree.
link |
01:02:28.620
I completely agree.
link |
01:02:29.460
And you have to also have enough passion
link |
01:02:31.780
that you'll do it.
link |
01:02:32.620
It can't be just like a perfunctory,
link |
01:02:34.660
oh yes, I'll listen to you
link |
01:02:36.500
and then I'm not really that excited about it.
link |
01:02:38.380
So it really is an aspect,
link |
01:02:39.620
it's a philosophical, like there's a philia,
link |
01:02:42.260
there's a love of esteeming of others.
link |
01:02:44.260
It's actually at the heart of what,
link |
01:02:47.060
it's sort of a life philosophy for me, right?
link |
01:02:49.220
That I'm constantly pursuing and that helped,
link |
01:02:51.540
absolutely helped.
link |
01:02:52.660
Makes me wonder in a philosophical,
link |
01:02:54.260
like looking at human civilization as one object,
link |
01:02:57.460
it makes me wonder how we can copy and paste Travis's
link |
01:02:59.980
in this book.
link |
01:03:00.820
Well, some aspects, maybe.
link |
01:03:03.300
Some aspects, right, right, exactly.
link |
01:03:05.220
Well, it's a good question.
link |
01:03:07.300
How do we teach this?
link |
01:03:08.140
How do we encourage it?
link |
01:03:09.300
How do we lift it?
link |
01:03:10.140
Because so much of the software world,
link |
01:03:12.700
it's giant communities, right?
link |
01:03:15.140
But it seems like so much is moved by,
link |
01:03:16.820
like little individuals.
link |
01:03:18.180
You talk about like Linus Torvalds.
link |
01:03:21.020
It's like, could you have not,
link |
01:03:23.380
could you have had Linux without him?
link |
01:03:25.980
Could you?
link |
01:03:26.820
Yeah, Guido and Python.
link |
01:03:28.140
Guido and Python.
link |
01:03:28.980
Guido and Python.
link |
01:03:29.820
Well, the SciPy community particularly,
link |
01:03:30.980
it's like I said, we wanted to build this big thing,
link |
01:03:32.820
but ultimately we didn't.
link |
01:03:33.780
What happened is we had mavericks and champions
link |
01:03:36.060
like John Hunter who created Matplotlib.
link |
01:03:37.780
We had Fernando Perez who created IPython.
link |
01:03:39.940
And so we sort of inspired each other,
link |
01:03:42.260
but then it kind of, there's sort of a culture
link |
01:03:43.980
of this selfless giving, the stewardship mentality,
link |
01:03:47.820
as opposed to ownership mentality,
link |
01:03:49.140
but stewardship and community focused,
link |
01:03:54.040
community focused, but intentional work.
link |
01:03:56.620
Like not waiting for everybody else to do the work,
link |
01:03:58.900
but you're doing it for the benefit of others
link |
01:04:00.700
and not worried about what you're gonna get.
link |
01:04:04.020
You're not worried about the credit.
link |
01:04:04.860
You're not worried about what you're gonna get.
link |
01:04:05.860
You're worried about, I later realized
link |
01:04:07.580
that I have to worry a little about credit,
link |
01:04:09.000
not because I want the credit,
link |
01:04:10.300
because I want people to understand
link |
01:04:11.380
what led to the results.
link |
01:04:13.020
Like, I don't, it's not about me.
link |
01:04:15.060
It's I want to understand this is what led to the result.
link |
01:04:17.540
So let's like, I think doing,
link |
01:04:18.980
and this is what had no impact on the result.
link |
01:04:21.100
Like let's promote, just like you said,
link |
01:04:23.420
I want to promote the attributes
link |
01:04:25.100
that help make us better off.
link |
01:04:26.520
How do we make more of Wes McKinney?
link |
01:04:28.820
Like Wes McKinney was critical to the success of Python
link |
01:04:31.620
because of his creation of pandas,
link |
01:04:33.420
which is the roots of that were all the way back
link |
01:04:36.420
in Numeric and Numarray and NumPy,
link |
01:04:40.260
where NumPy created an array of records.
link |
01:04:43.180
Wes started to use that almost like a data frame,
link |
01:04:45.980
except it's an array of records.
link |
01:04:47.840
And data frame, the challenge is,
link |
01:04:49.780
okay, if you want to augment it with another column,
link |
01:04:52.240
you have to insert, you have to do all this memory movement
link |
01:04:54.700
to insert a column.
link |
01:04:55.660
Whereas data frames became,
link |
01:04:57.180
oh, I'm going to have a loose collection of arrays.
link |
01:05:00.460
So it's a record of arrays that is a part of a data frame.
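[Editor's illustration] The contrast between an array of records and a record of arrays can be sketched as follows. This is a simplified example, not how pandas is actually implemented; the field names (`id`, `price`, `qty`) and the plain `dict` standing in for a data frame are assumptions made for illustration.

```python
import numpy as np

# Array of records: one contiguous block, each element holds every field.
records = np.array([(1, 2.5), (2, 3.5)],
                   dtype=[("id", "i4"), ("price", "f8")])

# Adding a column means building a wider dtype and copying all existing data.
wider = np.empty(records.shape, dtype=records.dtype.descr + [("qty", "i4")])
for name in records.dtype.names:
    wider[name] = records[name]
wider["qty"] = [10, 20]

# Record of arrays: a loose collection of columns.
# Adding a column leaves the existing columns untouched.
columns = {"id": records["id"].copy(), "price": records["price"].copy()}
columns["qty"] = np.array([10, 20], dtype="i4")

print(records.dtype.itemsize, wider.dtype.itemsize, sorted(columns))
```

The first layout keeps each row together in memory, which is why inserting a column forces the memory movement described above; the second trades that for cheap column insertion.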
link |
01:05:03.980
And we thought about that back in the memory days,
link |
01:05:05.560
but Wes ended up doing the work to build it.
link |
01:05:08.940
And then also the operations that were relevant
link |
01:05:11.300
for data processing.
link |
01:05:12.620
What I noticed is just that each of these little things
link |
01:05:15.220
creates just another tick, another up.
link |
01:05:17.380
So NumPy ultimately took a little while,
link |
01:05:19.940
about six months in, people started to join me,
link |
01:05:22.700
Francesc Alted, Robert Kern, Charles Harris.
link |
01:05:27.300
And these people are many of the unsung heroes, I would say.
link |
01:05:30.300
People who are, you know,
link |
01:05:31.980
they sometimes don't get the credit they deserve
link |
01:05:34.100
because they were critical both to support,
link |
01:05:36.540
like, you know, it's hard and you want,
link |
01:05:38.260
you need some support, people need support.
link |
01:05:40.340
And I needed just encouragement.
link |
01:05:41.580
And they were helping and encouraged by contributing.
link |
01:05:43.860
And once, the big thing for me was when John Hunter,
link |
01:05:48.240
he had previously done kind of a simple thing
link |
01:05:50.180
called numerix to kind of, you know, between Numeric
link |
01:05:52.820
and Numarray, he had a little high level tool
link |
01:05:55.100
that would just select each one for matplotlib.
link |
01:05:57.900
In 2006, he finally said,
link |
01:06:00.420
we're gonna just make NumPy the dependency of matplotlib.
link |
01:06:03.220
As soon as he did that,
link |
01:06:04.420
and I remember specifically when he did that,
link |
01:06:06.100
I said, okay, we've done it.
link |
01:06:07.900
Like, that was when I knew we had success.
link |
01:06:11.260
Before then it was still unsure,
link |
01:06:13.620
but that kind of started a roller coaster.
link |
01:06:15.060
And then 2006 to 2009.
link |
01:06:17.900
And then I've been floored by what it's done.
link |
01:06:20.940
Like, I knew it would help.
link |
01:06:22.900
I had no idea how much it would help.
link |
01:06:25.380
Right, so.
link |
01:06:26.300
And it has to do with, again, the language thing.
link |
01:06:28.660
It just, people started to think in terms of NumPy.
link |
01:06:31.940
Yes.
link |
01:06:32.820
And that opened up a whole new way of thinking.
link |
01:06:36.460
And part of the story that you kind of mentioned,
link |
01:06:39.220
but maybe you can elaborate,
link |
01:06:42.980
is it seems like at some point in the story,
link |
01:06:46.320
Python took over science and data science.
link |
01:06:50.800
Yes.
link |
01:06:51.640
And bigger than that,
link |
01:06:54.800
the scientific community started to think like programmers
link |
01:07:00.160
or started to utilize the tools of computers to do,
link |
01:07:04.280
like at a scale that wasn't done with Fortran.
link |
01:07:06.640
Like at this gigantic scale,
link |
01:07:09.320
they started to open in their heart.
link |
01:07:10.760
And then Python was the thing.
link |
01:07:12.040
I mean, there's a few other competitors, I guess,
link |
01:07:14.280
but Python, I think, really, really took over.
link |
01:07:16.960
I agree.
link |
01:07:17.800
There's a lot of stories here
link |
01:07:18.620
that are kind of during this journey,
link |
01:07:19.720
because this is sort of the start of this journey in 2005, 2006.
link |
01:07:23.240
So my tenure committee, I applied for tenure in 2006, 2007.
link |
01:07:28.180
It came back, I split the department.
link |
01:07:29.780
I was very polarizing.
link |
01:07:31.300
I had some huge fans
link |
01:07:32.560
and then some people that said no way, right?
link |
01:07:34.380
So it was very, I was a polarizing figure in the department.
link |
01:07:36.840
It went all the way up to the university president.
link |
01:07:39.800
Ultimately, my department chair had the sway
link |
01:07:42.760
and they didn't say no.
link |
01:07:43.760
They said, come back in two years and do it again.
link |
01:07:46.360
And I went, eh, at that point, I was like,
link |
01:07:49.680
I mean, I had this interest in entrepreneurship,
link |
01:07:52.840
this interest in not the academic circles,
link |
01:07:56.400
not the, like, how do we make industry work?
link |
01:07:59.680
So I do have to give credit to that exploration of economics
link |
01:08:03.060
because that led me, oh, I had a lot of opinions.
link |
01:08:06.540
I was actually very libertarian at the time.
link |
01:08:09.520
And I still have some libertarian trends,
link |
01:08:11.840
but I'm more of a, I'm more of a collectivist libertarian.
link |
01:08:15.880
So you value broadly, philosophically freedom.
link |
01:08:18.720
I value broadly, philosophically freedom,
link |
01:08:20.360
but I also understand the power of communities,
link |
01:08:23.440
like the power of collective behavior.
link |
01:08:26.200
And so what's that balance, right?
link |
01:08:27.840
That makes sense.
link |
01:08:29.800
So by the time I was just,
link |
01:08:31.520
I gotta go out and explore this entrepreneur world.
link |
01:08:33.380
So I left academia.
link |
01:08:34.220
I said, no thanks, called my friend, Eric, here,
link |
01:08:37.820
who had, his company was going.
link |
01:08:39.560
I said, hey, could I join you and start this trend?
link |
01:08:43.120
And he, at that time they were using SciPy a lot.
link |
01:08:45.920
They were trying to get clients.
link |
01:08:47.120
And so I came down to Texas.
link |
01:08:48.760
And in Texas is where I sort of,
link |
01:08:51.160
it's my entrepreneur world, right?
link |
01:08:53.440
I left academia and went to entrepreneur world in 2007.
link |
01:08:57.360
So I moved here in 2007, kind of took a leap,
link |
01:08:59.920
knew nothing really about business,
link |
01:09:01.600
knew nothing about a lot of stuff there.
link |
01:09:05.100
There's, you know, for a long time,
link |
01:09:06.980
I've kept some connections to a lot of academics
link |
01:09:08.980
because I still value it.
link |
01:09:10.080
I still love the scientific tradition.
link |
01:09:12.520
I still value the essence and the soul and the heart
link |
01:09:15.240
of what is possible.
link |
01:09:17.320
Don't like a lot of the administration
link |
01:09:21.380
and the kind of, we can go into detail about why
link |
01:09:24.160
and where and how this happens,
link |
01:09:25.320
what are some of the challenges.
link |
01:09:26.520
I don't know, but I'm with you.
link |
01:09:28.480
So I'm still affiliated with MIT.
link |
01:09:31.840
I still love MIT because there's magic there.
link |
01:09:35.600
There's people I talk to, like researchers, faculty,
link |
01:09:40.320
in those conversations and the whiteboard
link |
01:09:43.120
and just the conversation, that's magic there.
link |
01:09:46.220
All the other stuff, the administration,
link |
01:09:48.120
all that kind of stuff seems to,
link |
01:09:52.020
you don't wanna say too harshly criticize
link |
01:09:54.920
sort of bureaucracies, but there's a lag
link |
01:09:57.680
that seems to get in the way of the magic.
link |
01:10:00.800
And I still have a lot of hope
link |
01:10:03.800
that that can change because I don't often see
link |
01:10:08.320
that particular type of magic elsewhere in the industry.
link |
01:10:12.840
So like we need that and we need that flame going.
link |
01:10:15.800
And it's the same thing as exactly as you said,
link |
01:10:19.120
it has the same kind of elements
link |
01:10:20.560
like the open source community does.
link |
01:10:23.240
And, but then if you, like the reason I stepped away,
link |
01:10:27.160
the reason I'm here, just like you did in Austin is like,
link |
01:10:30.260
if I wanna build one robot, I'll stay at MIT.
link |
01:10:33.240
But if I wanna build millions and make money enough
link |
01:10:37.460
to where I can explore the magic of that, then you can't.
link |
01:10:41.000
And I think that dance is...
link |
01:10:44.160
That translational dance has been lost a bit, right?
link |
01:10:47.480
And there's a lot of reasons for that.
link |
01:10:48.640
I'm not, I'm certainly not an expert on this stuff.
link |
01:10:50.160
I can opine like anybody else,
link |
01:10:51.660
but I realized that I wanted to explore entrepreneurship,
link |
01:10:55.820
which I, and really figure out,
link |
01:10:57.720
and it's been a driving passion for 20 years, 25 years.
link |
01:11:01.560
How do we connect capital markets and company?
link |
01:11:06.480
Cause again, I fell in love with the notion of,
link |
01:11:07.880
oh, profit seeking on its own is not a bad thing.
link |
01:11:11.160
It's actually a coordination mechanism
link |
01:11:13.520
for allocating resources that, you know,
link |
01:11:16.480
in an emergent way, right?
link |
01:11:18.000
That respects everybody's opinions, right?
link |
01:11:20.720
So this is actually powerful.
link |
01:11:21.880
So I say all the time, when I make a company
link |
01:11:25.320
and we do something that makes profit,
link |
01:11:27.260
what we're saying is, hey,
link |
01:11:28.100
we're collecting of the world's resources
link |
01:11:29.800
and voluntarily people are asking us
link |
01:11:31.480
to do something that they like.
link |
01:11:33.000
And that's a huge deal.
link |
01:11:34.000
And so I really liked that energy.
link |
01:11:36.120
So that's what I came to do and to learn
link |
01:11:37.560
and to try to figure out.
link |
01:11:38.480
And that's what I've been kind of stumbling through
link |
01:11:40.120
since for the past 14 years.
link |
01:11:40.960
And that's 2007.
link |
01:11:42.580
2007, yeah.
link |
01:11:43.420
And so you were still working on NumPy.
link |
01:11:44.960
So NumPy was just emerging.
link |
01:11:46.560
Just emerging.
link |
01:11:47.400
One of the things I've done,
link |
01:11:49.160
it's worth mentioning because it emphasizes
link |
01:11:51.480
the exploratory nature of my thinking at the time.
link |
01:11:53.840
I said, well, I don't know how to fund this thing.
link |
01:11:55.240
I've got a graduate student I'm paying for
link |
01:11:56.720
and I've got no funding for him.
link |
01:11:57.880
And I had done some fundraising from the public
link |
01:12:00.520
to try to get public fundraisers in my lab.
link |
01:12:02.800
I didn't really wanna go out
link |
01:12:03.880
and just do the fundraising circuit
link |
01:12:05.360
the way it's traditionally done.
link |
01:12:06.920
So I wrote a book and I said, I'm gonna write a book
link |
01:12:09.960
and I'm gonna charge for it.
link |
01:12:11.440
It was called Guide to NumPy.
link |
01:12:12.720
And so ultimately NumPy became
link |
01:12:14.040
documentation driven development
link |
01:12:15.960
because I basically wrote the book
link |
01:12:17.280
and made sure the stuff worked or the book would work.
link |
01:12:19.760
So it really helped actually make NumPy become a thing.
link |
01:12:23.040
So writing that book,
link |
01:12:25.800
and it's not a page turner.
link |
01:12:28.200
Guide to NumPy is not a book you pick up
link |
01:12:29.680
and go, oh, this is great, over the fire.
link |
01:12:31.520
But it's where you could find the details,
link |
01:12:33.640
like how'd all this work.
link |
01:12:34.720
And a lot of people love that book.
link |
01:12:36.520
And so a lot of people ended up,
link |
01:12:38.040
so I said, look, I need to, so I'm gonna charge for it.
link |
01:12:41.600
And I got some flack for that.
link |
01:12:42.760
Not that much, just probably five angry messages,
link |
01:12:45.920
people yelling at me saying I was a bad guy
link |
01:12:49.960
for charging for this book.
link |
01:12:51.360
Was one of them Richard Stallman?
link |
01:12:53.280
No. Just kidding.
link |
01:12:54.120
No, I haven't really had any interaction with him personally,
link |
01:12:56.920
like I said, but there were a few,
link |
01:12:59.840
but actually surprisingly not.
link |
01:13:01.280
There was actually a lot of people like,
link |
01:13:02.760
no, it's fine, you can charge for a book.
link |
01:13:04.240
That's no big deal.
link |
01:13:05.080
We know that's a way you can try to make money
link |
01:13:07.080
around open source.
link |
01:13:07.920
So what I did, I did it in an interesting way.
link |
01:13:10.160
I said, well, kind of my ideas around IP law and stuff.
link |
01:13:14.280
I love the idea you can share something, you can spread it.
link |
01:13:16.120
Like once it's, the fact that you have a thing
link |
01:13:18.280
and copying is free, but the creation is not free.
link |
01:13:21.640
So how do you fund the creation and allow the copying?
link |
01:13:25.600
And in software, it's a little more complicated than that
link |
01:13:27.040
because creation is actually a continuous thing.
link |
01:13:29.360
It's not like you build a widget and it's done.
link |
01:13:31.160
It's sort of a process of emerging
link |
01:13:32.640
and continuing to create.
link |
01:13:34.560
But I wrote the book
link |
01:13:35.520
and had this market determined price thing.
link |
01:13:37.520
I said, look, I need, I think I said 250,000.
link |
01:13:40.720
If I make 250,000 from this book, I'll make it free.
link |
01:13:44.280
So as soon as I get that much money,
link |
01:13:45.760
or I said five years, so there's a time limit.
link |
01:13:48.960
Like it's not forever.
link |
01:13:49.800
That's really cool.
link |
01:13:50.640
It's amazing.
link |
01:13:51.680
I released it on this.
link |
01:13:53.080
And it's actually interesting
link |
01:13:54.240
because one of the people
link |
01:13:55.800
who also thought that was interesting
link |
01:13:57.040
ended up being Chris White,
link |
01:13:58.600
who was the director of the DARPA project
link |
01:14:01.360
that we got funding through at Anaconda.
link |
01:14:02.920
And the reason he even called us back
link |
01:14:04.640
is because he remembered my name from this book
link |
01:14:06.720
and he thought that was interesting.
link |
01:14:08.080
And so even though we hadn't gone to the demo days,
link |
01:14:10.880
we applied and the people said, yeah,
link |
01:14:12.680
nobody ever gets this without coming to the demo day first.
link |
01:14:15.360
This is the first time I've seen it.
link |
01:14:16.320
But it's because I knew, you know,
link |
01:14:18.200
Chris had done this and had this interaction.
link |
01:14:19.640
So it did have impact.
link |
01:14:21.680
I was actually really, really pleased by the result.
link |
01:14:23.880
I mean, I ended up in three years, I made 90,000.
link |
01:14:27.360
So sold 30,000 copies by myself.
link |
01:14:29.480
I just put it up on, you know, use PayPal and sold it.
link |
01:14:33.000
And that was my first taste of kind of, okay,
link |
01:14:36.040
this can work to some degree.
link |
01:14:37.600
And I, you know, all over the world, right?
link |
01:14:40.320
From Germany to Japan to, it was actually, it did work.
link |
01:14:44.480
And so I appreciated the fact that PayPal existed
link |
01:14:47.040
and I had a way to get the money, the distribution was simple.
link |
01:14:51.200
This is pre Amazon book stuff.
link |
01:14:53.480
So it was just publishing a website.
link |
01:14:55.320
It was the popularity of SciPy emerging
link |
01:14:57.120
and getting company usage.
link |
01:14:58.960
I ended up not letting it go the five years
link |
01:15:00.600
and not trying to make the full amount
link |
01:15:01.960
because, you know, a year and a half later,
link |
01:15:04.560
I was at Enthought.
link |
01:15:05.400
I had left academia, I was at Enthought
link |
01:15:06.680
and I kind of had a full time job.
link |
01:15:07.880
And then actually what happened is the documentation people,
link |
01:15:10.000
there's a group that said, hey,
link |
01:15:10.840
we want to do documentation for SciPy as a collective.
link |
01:15:14.280
And they're essentially needing the stuff in the book, right?
link |
01:15:18.680
And so they kind of ask,
link |
01:15:20.360
hey, could we just use the stuff in your book?
link |
01:15:21.920
And at that point I said, yeah, I'll just open it up.
link |
01:15:24.160
So that's, but it has served its purpose.
link |
01:15:27.320
And the money that I made actually funded my grad student.
link |
01:15:31.040
Like it was actually, you know,
link |
01:15:32.160
I paid him 25,000 a year out of that money.
link |
01:15:35.440
So the funny thing is if you do a very similar
link |
01:15:37.440
kind of experiment now with NumPy or something like it,
link |
01:15:40.680
you could probably make a lot more.
link |
01:15:42.480
It's probably true.
link |
01:15:43.800
Because of the tooling and the community building.
link |
01:15:46.360
Yeah, I agree.
link |
01:15:47.200
Like the, and social media,
link |
01:15:48.680
that there's just a virality to that kind of idea.
link |
01:15:51.560
I agree.
link |
01:15:52.400
There'd be things to do.
link |
01:15:53.240
I've thought about that.
link |
01:15:54.080
And really I thought about a couple of books
link |
01:15:56.080
or a couple of things that could be done there.
link |
01:15:57.440
And I just haven't, right?
link |
01:15:58.960
Even, I tried to hire a ghostwriter this year too
link |
01:16:01.920
to see if that could help, but it didn't.
link |
01:16:04.160
But part of my problem is this,
link |
01:16:06.240
I've been so excited by a number of things
link |
01:16:08.080
that have stemmed from that.
link |
01:16:09.480
Like, so I came here, worked at Enthought for four years,
link |
01:16:13.040
graciously, Eric made me president.
link |
01:16:14.960
Then we started to work closely together.
link |
01:16:16.280
We actually helped him buy out his partner.
link |
01:16:19.440
It didn't end great.
link |
01:16:20.720
Like unfortunately Eric and I aren't real,
link |
01:16:22.880
aren't friends now.
link |
01:16:24.560
I still respect him.
link |
01:16:25.400
I have a lot, I wish we were,
link |
01:16:26.640
but he didn't like the fact that Peter and I
link |
01:16:30.240
started Anaconda, right?
link |
01:16:31.680
That was not, I mean, so there's two sides to that story.
link |
01:16:36.200
So I'm not gonna go into it, right?
link |
01:16:37.360
Sure.
link |
01:16:38.200
But you, as human beings
link |
01:16:40.600
and you wish you still could be friends.
link |
01:16:42.320
I do, I do.
link |
01:16:43.920
It saddens me.
link |
01:16:45.160
I mean, that's a story of great minds
link |
01:16:49.040
building great companies.
link |
01:16:51.480
Somehow it's sad that when there's that kind of.
link |
01:16:55.000
And I hold him in esteem.
link |
01:16:57.360
I'm grateful for him.
link |
01:16:58.200
I think Enthought still exists.
link |
01:17:00.320
They're doing great work helping scientists.
link |
01:17:02.520
They still run the SciPy conference.
link |
01:17:05.040
They have an R&D platform they're selling now
link |
01:17:07.320
that's a tool that you can go get today, right?
link |
01:17:10.080
So Enthought has played a role in the SciPy
link |
01:17:14.920
in supporting the community around SciPy, I would say.
link |
01:17:18.240
They ended up not being able to,
link |
01:17:20.560
they ended up building a tool suite
link |
01:17:22.040
to write GUI applications.
link |
01:17:24.040
Like that's where they could actually make
link |
01:17:25.440
that the business could work.
link |
01:17:26.680
And so supporting SciPy and NumPy itself
link |
01:17:29.480
wasn't as possible.
link |
01:17:30.560
Like they didn't, they tried.
link |
01:17:31.960
I mean, it was not just because,
link |
01:17:33.280
it was just because of the business aspect.
link |
01:17:34.480
So, and I wanted to build a company that could do,
link |
01:17:36.840
that could get venture funding, right?
link |
01:17:39.080
For better or worse.
link |
01:17:39.920
I mean, that's a longer story.
link |
01:17:41.040
We could talk a lot about that, but.
link |
01:17:42.400
And that's where Anaconda came to be.
link |
01:17:44.200
That's where Anaconda came to be.
link |
01:17:45.040
So let me ask you, it's a little bit for fun
link |
01:17:48.040
because you built this amazing thing.
link |
01:17:50.000
And so let's talk about like an old warrior
link |
01:17:54.640
looking over old battles.
link |
01:17:57.320
You've, you know, there's a sad letter in 2012
link |
01:18:01.480
that you wrote to the NumPy mailing list
link |
01:18:04.360
announcing that you're leaving NumPy.
link |
01:18:06.320
And some of the things you've listed
link |
01:18:08.560
as some of the things you regret
link |
01:18:10.720
or not regret necessarily, but some things to think about.
link |
01:18:14.440
If you could go back and you could fix stuff about NumPy
link |
01:18:17.640
or both sort of in a personal level,
link |
01:18:20.640
but also like looking forward,
link |
01:18:21.960
what kind of things would you like to see changed?
link |
01:18:24.560
Good question.
link |
01:18:25.400
So I think there's technical questions
link |
01:18:26.320
and social questions right there.
link |
01:18:29.680
First of all, you know, I wrote NumPy as a service
link |
01:18:33.400
and I spent a lot of time doing it.
link |
01:18:35.000
And then other people came to help make it happen.
link |
01:18:36.760
NumPy succeeded because of the work of a lot of people, right?
link |
01:18:39.840
So it's important to understand that.
link |
01:18:42.240
I'm grateful for the opportunity,
link |
01:18:43.880
the role I had, I could play
link |
01:18:45.080
and grateful that things I did had an impact,
link |
01:18:47.600
but they only had the impact they had
link |
01:18:49.200
because of the other people that came to the story.
link |
01:18:52.200
And so they were essential,
link |
01:18:53.440
but the way data types were handled,
link |
01:18:55.720
the way data types, we had array scalars, for example,
link |
01:18:59.280
that are really just a substitute for a type concept, right?
link |
01:19:04.080
So we had array scalars, or actual Python objects,
link |
01:19:06.960
so that there's for every, for a 32 bit float
link |
01:19:09.520
or a 16 bit float or a 16 bit integer,
link |
01:19:13.160
Python doesn't have a natural,
link |
01:19:14.720
it's just one integer, there's one float.
link |
01:19:17.040
Well, what about these lower precision types,
link |
01:19:19.960
these larger precision types?
link |
01:19:21.600
So we had them in NumPy
link |
01:19:23.680
so that you could have a collection of them,
link |
01:19:25.320
but then have an object in Python that was one of them.
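[Editor's illustration] An array scalar is what you get back when you pull a single element out of a NumPy array. The sketch below uses present-day NumPy; the values are made up for illustration.

```python
import numpy as np

x = np.array([1.5, 2.5], dtype=np.float32)
item = x[0]                       # an array scalar, not a plain Python float

print(type(item))                 # the NumPy float32 scalar class
print(item.dtype, item.nbytes)    # float32 4
print(isinstance(item, float))    # False: Python's built-in float is 64-bit
print(isinstance(np.float64(1.5), float))  # True: float64 subclasses Python's float
```

The scalar keeps its precision (four bytes here) and still answers to `.dtype`, which is the "substitute for a type concept" role described above.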
link |
01:19:28.760
And there's questions about like in retrospect,
link |
01:19:31.440
I wouldn't have created those
link |
01:19:32.920
if I had improved the type system.
link |
01:19:34.880
And like made the type system actually a Python type system
link |
01:19:38.000
as opposed to currently,
link |
01:19:39.480
it's a Python one level type system.
link |
01:19:41.400
I don't know if you know the difference
link |
01:19:42.240
between Python one, Python two,
link |
01:19:43.200
it's kind of technical, kind of in depth,
link |
01:19:44.880
but Python two, one of its big things that Guido did,
link |
01:19:47.320
it was really brilliant.
link |
01:19:48.160
Actually, in Python one,
link |
01:19:51.640
all classes, new objects were of one type.
link |
01:19:55.040
If you as a user wrote a class,
link |
01:19:56.880
it was an instance of a single Python type
link |
01:19:59.600
called the class type, right?
link |
01:20:02.000
In Python two, he used a meta typing hook
link |
01:20:06.240
to actually go, oh, we can extend this
link |
01:20:07.960
and have users write classes that are new types.
link |
01:20:10.960
So he was able to have your user classes be actual types
link |
01:20:13.320
and the Python type system got a lot more rich.
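The point about user classes being real types can be checked in any modern Python. This is a small sketch; the `Quaternion` class is a hypothetical example, not something from NumPy.

```python
class Quaternion:
    """Hypothetical user-defined class, used only for illustration."""

    def __init__(self, w, x, y, z):
        self.w, self.x, self.y, self.z = w, x, y, z


q = Quaternion(1.0, 0.0, 0.0, 0.0)

# The instance's type is the class itself, not one shared "instance" type.
print(type(q) is Quaternion)

# The class is an instance of `type`, exactly like built-ins such as int.
print(isinstance(Quaternion, type), type(int) is type)
```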
link |
01:20:16.480
I barely understood that at the time that NumPy was written.
link |
01:20:19.160
And so I essentially in NumPy created a type system
link |
01:20:22.480
that was Python one era.
link |
01:20:24.400
Every dtype is an instance of the same type
link |
01:20:29.240
as opposed to having new dtypes be really just Python types
link |
01:20:33.160
with additional metadata.
link |
01:20:34.280
What's the cost of that?
link |
01:20:35.440
Is it efficiency, is it usability?
link |
01:20:37.200
It's usability primarily.
link |
01:20:38.840
The cost isn't really efficiency.
link |
01:20:40.320
It's the fact that it's clumsy to create new types.
link |
01:20:45.080
It's hard.
link |
01:20:45.920
And then one of the challenges,
link |
01:20:47.560
you wanna create new types.
link |
01:20:48.400
You want a quaternion type or you wanna add a new posit type
link |
01:20:52.600
or you wanna, so it's hard.
link |
01:20:55.080
And now, if we had done that well,
link |
01:20:59.200
when Numba came on the scene
link |
01:21:00.440
where we could actually compile Python code,
link |
01:21:02.880
it would integrate with that type system much cleaner.
link |
01:21:05.160
And now all of a sudden you could do gradual typing
link |
01:21:08.080
more easily.
link |
01:21:08.920
You could actually have Python when you add Numba
link |
01:21:10.560
plus better typing, could actually be a,
link |
01:21:14.720
you'd smooth out a lot of rough edges.
link |
01:21:16.800
But there's already, there's like,
link |
01:21:18.840
but are you talking about from the perspective
link |
01:21:20.960
of developers within NumPy or users of NumPy?
link |
01:21:23.840
Developers of NumPy, not really users of NumPy so much.
link |
01:21:27.080
It's the development of NumPy.
link |
01:21:28.800
So you're thinking about like how to design NumPy
link |
01:21:32.160
so that its contributors.
link |
01:21:33.880
Yeah, the contributors, it's easier.
link |
01:21:35.880
It's easier.
link |
01:21:36.720
It's less work to make it better and to keep it maintained.
link |
01:21:39.320
And where that's impacted things, for example,
link |
01:21:41.480
is the GPU.
link |
01:21:43.400
Like all of a sudden GPUs start getting added
link |
01:21:45.520
and we don't have them in NumPy.
link |
01:21:48.360
Like NumPy should just work on GPUs.
link |
01:21:50.560
The fact that we'd have to download a whole other object
link |
01:21:52.680
called CuPy to have arrays on GPUs
link |
01:21:54.800
is just an artifact of history.
link |
01:21:57.440
Like there's no fundamental reason for it.
link |
01:21:59.160
Well, that's really interesting.
link |
01:22:00.200
If we could sort of go on that tangent briefly
link |
01:22:02.520
is you have PyTorch and other libraries like TensorFlow
link |
01:22:07.800
that basically tried to mimic NumPy.
link |
01:22:11.840
Like you've created a sort of platonic form
link |
01:22:15.720
of multi dimension. Basically, yeah.
link |
01:22:16.920
Yeah, exactly.
link |
01:22:17.760
Well, and the problem was I didn't realize that.
link |
01:22:19.800
Platonic form has a lot of edges.
link |
01:22:21.760
They're like, well, we should cut those out
link |
01:22:23.360
before we present it.
link |
01:22:24.200
So I wonder if you can comment,
link |
01:22:26.920
is there like a difference between their implementations?
link |
01:22:29.360
Do you wish that they were all using NumPy
link |
01:22:31.440
or like in this abstraction of GPU?
link |
01:22:34.040
And sorry to interrupt that there's GPUs, ASICs.
link |
01:22:38.240
There might be other neuromorphic computing.
link |
01:22:40.040
There might be other kind of,
link |
01:22:41.600
or the aliens will come with a new kind of computer.
link |
01:22:43.920
Like an abstraction that NumPy should just operate nicely
link |
01:22:47.880
over the things that are more and more
link |
01:22:50.280
and smarter and smarter with these multidimensional arrays.
link |
01:22:54.200
Yeah, yeah.
link |
01:22:55.520
There's several comments there.
link |
01:22:56.920
We are working on something now called data-apis.org.
link |
01:23:00.360
data-apis.org, you can go there today.
link |
01:23:02.560
And it's our answer.
link |
01:23:04.480
It's my answer.
link |
01:23:05.320
It's not just me.
link |
01:23:06.160
It's me and Ralf and Athan and Aaron
link |
01:23:09.120
and a lot of companies are helping us at Quansight Labs.
link |
01:23:13.120
It's not unifying all the arrays.
link |
01:23:14.560
It's creating an API that is unified.
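A rough sketch of what a unified API buys library authors, assuming the array object follows the array API standard's `__array_namespace__()` convention. The `standardize` helper and the fallback to the `numpy` module are assumptions made for this illustration.

```python
import numpy as np


def standardize(x):
    """Rescale x to zero mean and unit standard deviation."""
    # Array API objects expose __array_namespace__(); falling back to
    # NumPy is an assumption made here for older NumPy releases.
    if hasattr(x, "__array_namespace__"):
        xp = x.__array_namespace__()
    else:
        xp = np
    return (x - xp.mean(x)) / xp.std(x)


z = standardize(np.asarray([1.0, 2.0, 3.0]))
print(z)
```

The function never names a specific array library in its arithmetic, so the same code could in principle run on any array type that implements the shared API.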
link |
01:23:17.200
So we do care about this
link |
01:23:19.360
and we're trying to work through it.
link |
01:23:21.280
I actually had the chance to go and meet
link |
01:23:22.560
with the TensorFlow team and the PyTorch team
link |
01:23:25.360
and talk to them after exiting Anaconda.
link |
01:23:29.120
Just talking about,
link |
01:23:29.960
because the first year after leaving Anaconda in 2018,
link |
01:23:33.960
I became deeply aware of this and realized that,
link |
01:23:36.000
oh, this split in the array community that exists today
link |
01:23:38.960
makes what I was concerned about in 2005 pretty parochial.
link |
01:23:44.160
It's a lot worse, right?
link |
01:23:45.880
Now there's a lot more people.
link |
01:23:47.280
So perhaps the industry can sustain more stacks, right?
link |
01:23:51.400
There's a lot of money,
link |
01:23:52.560
but it makes it a lot less efficient.
link |
01:23:54.120
I mean, but I've also learned to appreciate,
link |
01:23:56.720
it's okay to have some competition.
link |
01:23:58.440
It's okay to have different implementations,
link |
01:24:00.760
but it's better if you can at least refactor some parts.
link |
01:24:03.560
I mean, you're gonna be more efficient
link |
01:24:04.960
if you can refactor parts.
link |
01:24:07.000
It's nice to have competition over things,
link |
01:24:09.560
over what is nice to have competition.
link |
01:24:11.760
They're innovative.
link |
01:24:12.600
Yeah, innovative.
link |
01:24:13.440
And then maybe on the infrastructure,
link |
01:24:15.920
whatever, however you define infrastructure,
link |
01:24:18.120
that maybe it's nice to have come together.
link |
01:24:21.400
Exactly, I agree.
link |
01:24:22.440
And I think, but it was interesting to hear the stories.
link |
01:24:24.600
I mean, TensorFlow came out of a C++ library,
link |
01:24:29.040
Jeff Dean wrote, I think,
link |
01:24:30.160
that was basically how they were doing inference, right?
link |
01:24:33.560
And then they realized, oh,
link |
01:24:34.400
we could do this TensorFlow thing.
link |
01:24:36.440
That C++ library, then what was interesting to me
link |
01:24:38.400
was the fact that both Google and Facebook did not,
link |
01:24:42.600
it's not like they supported Python or NumPy initially.
link |
01:24:44.960
They just realized they had to.
link |
01:24:47.200
They came to this world and then all the users were like,
link |
01:24:48.760
hey, where's the NumPy interface?
link |
01:24:50.680
Oh, and then they kind of came late to it
link |
01:24:52.560
and then they had these bolt ons.
link |
01:24:54.800
TensorFlow's bolt on, I don't mean to offend,
link |
01:24:57.280
but it was so bad.
link |
01:24:58.480
Yeah, it was bad.
link |
01:24:59.320
It's the first time that I'm usually,
link |
01:25:01.760
I mean, one of the challenges I have
link |
01:25:04.160
is I don't criticize enough in the sense
link |
01:25:07.000
that I don't give people input enough, you know, if.
link |
01:25:09.960
I think it's universally agreed upon
link |
01:25:11.680
that the bolt ons on TensorFlow were.
link |
01:25:13.640
But I went to, it was a talk given at Mallorca in Spain
link |
01:25:17.080
and a great guy came and gave a talk and I said,
link |
01:25:19.880
you should never show that API again
link |
01:25:21.400
at a PyData conference.
link |
01:25:23.040
Like that was, that's terrible.
link |
01:25:24.840
Like you're taking this beautiful system we've created
link |
01:25:27.080
and like you're corrupting all these poor Python people,
link |
01:25:29.440
forcing them to write code like that
link |
01:25:30.840
or thinking they should.
link |
01:25:32.640
Fortunately, you know, they adopted Keras as their,
link |
01:25:35.640
and Keras is better.
link |
01:25:36.760
And so Keras, TensorFlow is fine, is reasonable,
link |
01:25:40.360
but they bolted it on.
link |
01:25:42.680
Facebook did too.
link |
01:25:43.640
Like Facebook had their own C++ library for doing inference
link |
01:25:48.160
and they also had the same reaction, they had to do this.
link |
01:25:51.160
One big difference is Facebook,
link |
01:25:52.840
maybe because of the way it's situated as part of FAIR,
link |
01:25:55.240
part of the research lab,
link |
01:25:56.600
TensorFlow is definitely used and, you know,
link |
01:25:58.880
they have to make, they couldn't just open it up
link |
01:26:00.720
and let the community, you know, change what that is.
link |
01:26:03.160
Cause I guess they were worried
link |
01:26:04.640
about disrupting their operations.
link |
01:26:06.880
Facebook's been much more open to having community input
link |
01:26:10.720
on the structure itself.
link |
01:26:12.400
Whereas Google and TensorFlow,
link |
01:26:14.240
they're really eager to have community users,
link |
01:26:16.000
people use it and build the infrastructure,
link |
01:26:17.520
but it's much more walled.
link |
01:26:18.840
Like it's harder to become a contributor to TensorFlow.
link |
01:26:21.600
And it's also, this is very difficult question to answer
link |
01:26:24.760
and don't mean to be throwing shade at anybody,
link |
01:26:27.080
but you have to wonder, it's the Microsoft question
link |
01:26:30.320
of when you have a tool like PyTorch or TensorFlow,
link |
01:26:33.920
how much are you tending to the hackers
link |
01:26:36.320
and how much are you tending to the big corporate clients?
link |
01:26:39.240
Correct.
link |
01:26:40.080
So like the ones that,
link |
01:26:42.560
do you tend to the millions of people
link |
01:26:44.160
that are giving you almost no money,
link |
01:26:46.440
or do you tend to the few
link |
01:26:48.360
that are giving you a ton of money?
link |
01:26:50.320
I tend to stand with the people.
link |
01:26:54.000
Right.
link |
01:26:54.840
Cause I feel like if you nurture the hackers,
link |
01:26:57.760
you will make the right decisions in the long term
link |
01:27:00.200
that will make the companies happy.
link |
01:27:02.000
I lean that way too.
link |
01:27:03.280
I totally agree.
link |
01:27:04.120
But then you have to find the right dance.
link |
01:27:05.680
But it's a balance.
link |
01:27:07.080
Cause you can lean to the hackers and run out of money.
link |
01:27:08.960
Yeah, exactly.
link |
01:27:10.240
Exactly.
link |
01:27:11.440
Which has been some of the challenge I've faced
link |
01:27:13.760
in the sense that,
link |
01:27:14.680
like I would look at some of the experiments,
link |
01:27:17.040
like NumPy, the fact that we have this split
link |
01:27:19.040
is a factor of I wasn't able to collect more money
link |
01:27:21.720
towards NumPy development.
link |
01:27:22.800
Yeah.
link |
01:27:23.640
Right?
link |
01:27:24.480
I mean, I didn't succeed in the early days
link |
01:27:26.480
of getting enough financial contribution to NumPy
link |
01:27:29.560
so that they could work on it.
link |
01:27:31.080
Right?
link |
01:27:31.920
I couldn't work on it full time.
link |
01:27:32.760
I had to just catch an hour here, an hour there.
link |
01:27:35.640
And I basically didn't like that.
link |
01:27:37.880
Like I've wanted to be able to do something about that
link |
01:27:39.920
for a long time and try to figure out how,
link |
01:27:41.440
well, there's lots of ways.
link |
01:27:42.960
I mean, possibly one could say,
link |
01:27:44.640
we had an offer from Microsoft
link |
01:27:46.240
in the early days of Anaconda.
link |
01:27:48.240
2014, they offered to come buy us, right?
link |
01:27:51.160
The problem was the right people at Microsoft
link |
01:27:52.760
didn't offer to buy us.
link |
01:27:53.600
And they were still,
link |
01:27:54.880
they were, it was really a,
link |
01:27:56.440
we were like a second,
link |
01:27:58.040
they had really bought, they just bought R,
link |
01:27:59.680
the R company called,
link |
01:28:01.800
it was not R studio,
link |
01:28:02.800
but it was another R company that was emergent.
link |
01:28:05.680
And it was kind of a,
link |
01:28:07.160
well, we should also get a Python play,
link |
01:28:09.360
but they were really doubling down on R.
link |
01:28:11.520
Right?
link |
01:28:12.360
And so it was like,
link |
01:28:13.200
it was where you would go to die.
link |
01:28:14.400
So it's not, it wasn't,
link |
01:28:15.440
it was before Satya was there.
link |
01:28:17.160
Satya had just started.
link |
01:28:18.680
Just started.
link |
01:28:19.520
Right?
link |
01:28:20.360
And the offer was coming from someone
link |
01:28:21.800
two levels down from him.
link |
01:28:23.080
Got you.
link |
01:28:23.920
Right?
link |
01:28:24.760
And if it had come from Scott Guthrie,
link |
01:28:26.640
so I got a chance to meet Scott Guthrie,
link |
01:28:28.320
great guy, I like him.
link |
01:28:29.760
If an offer had come from him,
link |
01:28:31.560
probably would be at Microsoft right now.
link |
01:28:33.200
That'd be fascinating.
link |
01:28:34.520
That would be really nice actually,
link |
01:28:36.160
especially given what Microsoft has since done
link |
01:28:38.720
for the open source community and all those things.
link |
01:28:40.200
Yes, I think they're doing well.
link |
01:28:41.640
I really like some of the stuff they've been doing.
link |
01:28:43.720
They're still working,
link |
01:28:45.200
and they've, you know,
link |
01:28:46.040
they've hired Guido now,
link |
01:28:46.880
and they've hired a lot of Python developers.
link |
01:28:47.720
Wait, Guido's now at Microsoft?
link |
01:28:49.400
Yeah, he works at Microsoft.
link |
01:28:50.240
I need to.
link |
01:28:52.480
Which, he retired,
link |
01:28:53.600
then he came out of retirement,
link |
01:28:54.720
and he's working now.
link |
01:28:55.560
I was just talking to him,
link |
01:28:56.400
and he didn't mention this person.
link |
01:28:57.840
Well.
link |
01:28:58.680
I should investigate this further.
link |
01:29:01.280
Well.
link |
01:29:02.120
Because I know he left Dropbox,
link |
01:29:02.960
but I wasn't sure what he was doing,
link |
01:29:04.000
what he was up to.
link |
01:29:05.160
Well, he was kind of saying he'd retire,
link |
01:29:06.560
but, and it's literally been five years
link |
01:29:09.640
since I last sat down and really talked to Guido.
link |
01:29:12.280
Right?
link |
01:29:13.640
Guido's a technology expert, right?
link |
01:29:16.000
He's a, so I came,
link |
01:29:17.480
I was excited because I'd finally figured out
link |
01:29:18.880
the type system for NumPy.
link |
01:29:20.720
I wanted to kind of talk about that with him,
link |
01:29:22.240
and I kind of overwhelmed him.
link |
01:29:23.960
Could you stay in that,
link |
01:29:25.080
just for a brief moment,
link |
01:29:26.640
because you're a fascinating person
link |
01:29:28.200
in the history of programming.
link |
01:29:29.440
He is a fascinating person.
link |
01:29:31.240
What have you learned from Guido
link |
01:29:34.200
about programming, about life?
link |
01:29:37.560
Yeah, yeah.
link |
01:29:38.400
A lot, actually.
link |
01:29:39.240
I've been a fan of Guido's.
link |
01:29:40.840
You know, we have a chance to talk.
link |
01:29:42.520
Some, I wouldn't say, you know,
link |
01:29:43.760
we talk all the time.
link |
01:29:44.840
Not at all.
link |
01:29:45.680
He may, but we talk enough to,
link |
01:29:47.520
I respect his,
link |
01:29:48.840
in fact, when I first started NumPy,
link |
01:29:49.880
one of the first things I did was I had a,
link |
01:29:51.520
I asked Guido for a meeting
link |
01:29:53.320
with him and Paul Dubois in San Mateo.
link |
01:29:55.400
And I went and met him for lunch.
link |
01:29:56.920
And basically, to say,
link |
01:29:58.000
maybe we can actually,
link |
01:29:59.200
part of the strategy for NumPy
link |
01:30:00.720
was to get it into Python 3,
link |
01:30:02.440
and maybe be part of Python.
link |
01:30:04.120
And so we talked about that.
link |
01:30:05.160
That's a cool conversation.
link |
01:30:06.000
And about that approach, right?
link |
01:30:06.920
I would have loved to be a fly on the wall.
link |
01:30:09.200
That was good.
link |
01:30:10.040
And over the years from Guido,
link |
01:30:12.080
I learned,
link |
01:30:13.560
so he was open.
link |
01:30:14.840
Like, he was willing to listen to people's ideas.
link |
01:30:18.200
Right?
link |
01:30:19.040
And over the years,
link |
01:30:19.880
now generally, you know,
link |
01:30:20.920
I'm not saying universally that's been true,
link |
01:30:22.600
but generally that's been true.
link |
01:30:24.360
So he's willing to listen.
link |
01:30:25.680
He's willing to defer.
link |
01:30:27.240
Like on the scientific side,
link |
01:30:28.280
he would just kind of defer.
link |
01:30:29.120
He didn't really always understand
link |
01:30:30.160
what we were doing.
link |
01:30:31.000
Yeah.
link |
01:30:31.840
And he'd defer.
link |
01:30:32.800
One place where he didn't defer enough
link |
01:30:35.640
was we missed a matrix multiply operator.
link |
01:30:37.680
Like that finally got added to Python,
link |
01:30:39.600
but about 10 years later than it should have.
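For readers who have not used it, the operator in question is `@`, available since Python 3.5 and backed by the `__matmul__` special method. The sketch below uses a tiny hypothetical `Matrix` class rather than NumPy, purely to show how the hook works.

```python
class Matrix:
    """Hypothetical minimal matrix type, for illustration only."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __matmul__(self, other):
        # Transpose `other` so each column can be paired with a row.
        cols = list(zip(*other.rows))
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in cols]
             for row in self.rows]
        )


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[0, 1], [1, 0]])
c = a @ b  # Python dispatches this to a.__matmul__(b)
print(c.rows)
```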
link |
01:30:42.240
But the reason was because nobody,
link |
01:30:44.760
it takes a lot of effort.
link |
01:30:46.200
And I learned this while I was writing NumPy.
link |
01:30:48.160
I also wrote tools to Python.
link |
01:30:49.320
I engaged with python-dev,
link |
01:30:50.160
and I added some pieces to Python.
link |
01:30:52.320
Like the memoryview object.
link |
01:30:53.400
I wanted the structure of NumPy into Python.
link |
01:30:55.680
So we didn't get NumPy into Python,
link |
01:30:56.960
but we got the basic structure of it into Python.
link |
01:30:59.480
Like, so you could build on it.
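A small sketch of that "basic structure" as it appears in today's standard library, using only `memoryview` and `struct`. The buffer size and the values written are arbitrary choices for the example.

```python
import struct

buf = bytearray(8)                # eight zero bytes
view = memoryview(buf).cast("H")  # same memory, seen as unsigned 16-bit ints

view[1] = 513                     # writes through to buf, no copy is made
print(view.itemsize, view.shape, view.tolist())

# Reading the two raw bytes back from the bytearray gives the same value.
(value,) = struct.unpack("H", buf[2:4])
print(value)
```

The view carries an item size, a format, and a shape on top of raw bytes, which is the kind of typed, shaped description of memory that arrays need.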
link |
01:31:01.000
Nobody did for a while,
link |
01:31:01.880
but eventually database authors started to.
link |
01:31:04.720
And it's a lot better.
link |
01:31:05.760
They did.
link |
01:31:06.600
And also Antoine Pitrou and Stefan Krah
link |
01:31:08.960
actually fixed the memoryview object.
link |
01:31:10.760
Cause I wrote the underlying infrastructure in C,
link |
01:31:13.280
but the Python exposure was terrible
link |
01:31:15.520
until they came in and fixed it.
link |
01:31:16.640
Partly because I was writing NumPy,
link |
01:31:18.080
and NumPy was the Python exposure.
link |
01:31:19.960
I didn't really care about
link |
01:31:21.240
if you didn't have NumPy installed.
link |
01:31:22.800
Anyway, Guido: open to ideas,
link |
01:31:25.360
technologically brilliant.
link |
01:31:27.280
Like really, I really got a lot of respect for him
link |
01:31:29.440
when I saw what he did
link |
01:31:30.360
with this type class merger thing.
link |
01:31:33.320
It was actually tricky, right?
link |
01:31:35.200
And then willing to share, willing to share his ideas.
link |
01:31:38.400
So the other thing early on in 1998,
link |
01:31:40.200
I said, I wrote my first extension module.
link |
01:31:42.240
The reason I could is because he'd written this blog post
link |
01:31:44.800
on how to do reference counting, right?
link |
01:31:47.360
And without it, I would have been lost, right?
link |
01:31:50.040
But he was willing to at least try to write this post.
link |
01:31:53.240
And so he's been motivated early on with Python.
link |
01:31:56.080
There's a computer science for everybody.
link |
01:31:58.200
He kind of had this early on desire to,
link |
01:31:59.880
oh, maybe we should be pushing programming to more people.
link |
01:32:02.040
So he had this populist notion, I guess,
link |
01:32:04.560
or populist sense to learn that there's a certain skill,
link |
01:32:08.720
and I've seen it in other people too,
link |
01:32:10.560
of engaging with contributors sufficiently to,
link |
01:32:13.960
because when somebody engages with you
link |
01:32:15.640
and wants to contribute to you,
link |
01:32:16.480
if you ignore them, they go away.
link |
01:32:18.400
So building that early contributor base
link |
01:32:19.760
requires real engagement with other people.
link |
01:32:23.320
And he would do that.
link |
01:32:24.520
Can you also comment on this tragic stepping down
link |
01:32:29.080
from his position as the benevolent dictator for life
link |
01:32:32.880
over the walrus, you know?
link |
01:32:35.640
The Walrus operator?
link |
01:32:36.560
The Walrus operator was the last battle.
link |
01:32:39.200
I don't know if that's the cause of it,
link |
01:32:40.880
but there's this, for people who don't know,
link |
01:32:43.640
you can look up, there's the Walrus operator,
link |
01:32:45.640
which looks like a colon and equal sign.
link |
01:32:49.560
Yeah, colon, equal sign.
link |
01:32:50.800
And it actually does maybe the thing
link |
01:32:54.680
that an equal sign should be doing.
link |
01:32:57.560
Yeah, maybe, right, exactly.
link |
01:33:00.240
But it's just historically,
link |
01:33:02.080
equal sign means something else.
link |
01:33:03.560
It just means assignment.
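For listeners who have not seen it, here is a short sketch of the walrus operator (`:=`, added in Python 3.8). The sample string and pattern are made up for the illustration.

```python
import re

line = "version 3.8"

# := assigns to `m` and yields the value in the same expression,
# so the match can be tested and reused without a separate statement.
if (m := re.search(r"(\d+)\.(\d+)", line)) is not None:
    major, minor = int(m.group(1)), int(m.group(2))
    print(major, minor)
```

A plain `=` cannot appear inside the `if` condition, because in Python assignment with `=` is a statement, not an expression.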
link |
01:33:05.240
So he stepped down over this.
link |
01:33:07.280
What do you think about the pressure of leadership?
link |
01:33:10.360
It's something that, you mentioned the letter I wrote
link |
01:33:12.280
in NumPy at the time.
link |
01:33:13.640
That was a hard time, actually.
link |
01:33:15.240
I mean, there's been really hard times.
link |
01:33:17.080
It was hard.
link |
01:33:19.520
You get criticized, right?
link |
01:33:20.840
And you get pushed, and you get,
link |
01:33:22.800
not everybody loves what you do.
link |
01:33:23.800
Like anytime you do anything that has impact at all,
link |
01:33:26.880
you're not universally loved, right?
link |
01:33:28.560
You get some real critics.
link |
01:33:29.760
And that's an important energy,
link |
01:33:31.960
because it's impossible for you to do everything right.
link |
01:33:35.080
You need people to be pushing.
link |
01:33:37.160
But sometimes people can get mean, right?
link |
01:33:39.320
People can, I prefer to give people the benefit of the doubt.
link |
01:33:43.080
I don't immediately assume they have bad intentions.
link |
01:33:45.800
And maybe for other, maybe that doesn't happen for everybody.
link |
01:33:49.000
For whatever reason, their past,
link |
01:33:50.200
their experiences with people, they sometimes have bad,
link |
01:33:53.040
so they immediately attribute to you bad intentions.
link |
01:33:54.880
So you're like, where did this come from?
link |
01:33:56.080
I mean, I'm definitely open to criticism,
link |
01:33:57.760
but I think you're misinterpreting the whole point.
link |
01:34:00.520
Because I would get that, certainly when I started Anaconda.
link |
01:34:05.800
Sometimes I say to people,
link |
01:34:08.520
I care enough about entrepreneurship
link |
01:34:09.760
to make some open source people uncomfortable.
link |
01:34:12.240
And I care enough about open source
link |
01:34:13.520
to make investors uncomfortable.
link |
01:34:15.560
So I sort of, you create kind of doubters on both sides.
link |
01:34:19.880
So when you have, and this is just a plea
link |
01:34:23.840
to the listener and the public, I've noticed this too,
link |
01:34:27.680
that there's a tendency, and social media makes this worse,
link |
01:34:32.680
when you don't have perfect information about the situation,
link |
01:34:35.560
you tend to fill the gaps with the worst possible,
link |
01:34:39.280
or at least a bad story that fills those gaps.
link |
01:34:43.080
And I think it's good to live life,
link |
01:34:46.960
maybe not fully naively, but filling in the gaps
link |
01:34:49.760
with the good, with the best, with the positive,
link |
01:34:54.720
with the hopeful explanation of why you see this.
link |
01:34:57.280
So if you see somebody like you trying to make money
link |
01:35:00.280
on a book about an umpire,
link |
01:35:01.960
there's a million stories around that that are positive.
link |
01:35:04.880
And those are good to think about,
link |
01:35:07.840
to project positive intent on the people.
link |
01:35:10.600
Because for many reasons, usually because people are good
link |
01:35:13.960
and they do have good intent.
link |
01:35:15.560
And also when you project that positive intent,
link |
01:35:17.480
people will step up to that too.
link |
01:35:19.400
Yes.
link |
01:35:20.240
It's a great point.
link |
01:35:21.760
It has this kind of viral nature to it.
link |
01:35:24.320
And of course Twitter early on figured out,
link |
01:35:27.720
and Facebook too, that they can make a lot of money
link |
01:35:30.360
and engagement from the negative.
link |
01:35:32.280
Yes.
link |
01:35:33.120
So there's this, we're fighting this mechanism.
link |
01:35:35.440
I agree.
link |
01:35:36.280
Which is challenging.
link |
01:35:37.120
It's easier.
link |
01:35:37.940
It's just easier to be.
link |
01:35:38.780
To be negative.
link |
01:35:39.620
And then for some reason, something in our minds
link |
01:35:41.920
really enjoys sharing that and getting all excited
link |
01:35:45.280
about the negativity.
link |
01:35:46.280
We do, yeah.
link |
01:35:47.400
Some protective mechanism perhaps that we're gonna get eaten
link |
01:35:50.440
if we don't, yeah.
link |
01:35:51.280
Exactly.
link |
01:35:52.100
For us to be effective as a group of people
link |
01:35:53.200
in a software engineering project,
link |
01:35:54.600
you have to project positive intent, I think.
link |
01:35:56.860
I totally agree.
link |
01:35:57.820
Totally agree.
link |
01:35:58.660
And I think that's very,
link |
01:35:59.480
and so that happens in this space.
link |
01:36:01.640
But Python has done a reasonable job in the past,
link |
01:36:03.840
but here is a situation where I think it started
link |
01:36:05.920
to get this pressure where it didn't.
link |
01:36:07.840
I really didn't, I didn't know enough about what happened.
link |
01:36:10.440
I've talked to several people about it.
link |
01:36:12.160
And I know most of the steering committee members today,
link |
01:36:15.840
one person nominated me for that role,
link |
01:36:17.880
but it's the wrong role for me right now, right?
link |
01:36:20.880
I have a lot of respect for the Python developer space
link |
01:36:24.040
and the Python developers.
link |
01:36:25.440
I also understand the gap between computer science
link |
01:36:27.600
Python developers and array programming developers
link |
01:36:30.440
or science developers.
link |
01:36:31.440
And in fact, Python succeeds in the array space
link |
01:36:34.560
the more it has people in that boundary.
link |
01:36:36.520
And there's often very few.
link |
01:36:37.960
Like I was playing a role in that boundary
link |
01:36:39.440
and working like everything to try to keep up
link |
01:36:42.600
with even what Guido was saying, like I'm a C programmer,
link |
01:36:47.720
but not a computer scientist.
link |
01:36:49.080
Like I was an engineer and physicist and mathematician,
link |
01:36:52.600
and I didn't always understand
link |
01:36:54.840
what they were talking about
link |
01:36:56.360
and why they would have opinions the way they did.
link |
01:36:58.360
So, you know, you have to listen and try to understand.
link |
01:37:00.280
Then you also have to explain your point of view
link |
01:37:02.120
in a way they can understand.
link |
01:37:03.560
And that takes a lot of work.
link |
01:37:04.840
And that communication is always the challenge.
link |
01:37:07.920
And it's just what we're describing here
link |
01:37:09.200
about the negativity is just another form of that.
link |
01:37:11.520
Like how do we come together?
link |
01:37:12.560
And it does appear we're wired anyway
link |
01:37:14.520
to at least have a, there's a part of us
link |
01:37:16.560
that will enemy, you know, friend, enemy.
link |
01:37:18.880
And we see, yeah, it's like,
link |
01:37:21.360
why are we wiring on the enemy front?
link |
01:37:23.520
So why are we pushing that?
link |
01:37:24.760
Why are we promoting that so deeply?
link |
01:37:26.680
Assume friend until proven otherwise.
link |
01:37:28.440
Yes, yes.
link |
01:37:30.000
So, cause you have such a fascinating mind in all of this.
link |
01:37:32.160
Let me just ask you these questions.
link |
01:37:34.160
So one interesting side on the Python history
link |
01:37:38.000
is the move from Python two to Python three.
link |
01:37:41.000
You mentioned move from Python one to Python two,
link |
01:37:43.720
but the move from Python two to Python three
link |
01:37:46.800
is a little bit interesting
link |
01:37:47.920
because it took a very long time.
link |
01:37:50.040
It broke, you know, in quite a small way
link |
01:37:53.520
backward compatibility, but even that small way
link |
01:37:56.280
seemed to have been very painful for people.
link |
01:37:58.680
Are there lessons you draw?
link |
01:38:00.640
Oh man, tons of lessons.
link |
01:38:01.480
From how long it took and how painful it seemed to be?
link |
01:38:05.520
Yeah, tons of lessons.
link |
01:38:07.000
Well, I mentioned here earlier
link |
01:38:08.240
that NumPy was written in 2005.
link |
01:38:11.840
It was in 2005 that I actually went to Guido
link |
01:38:15.520
to talk about getting NumPy into Python three.
link |
01:38:17.240
Like my strategy was to,
link |
01:38:18.880
oh, we were moving to Python three.
link |
01:38:19.960
Let's have that be, and it seems funny in retrospect
link |
01:38:22.200
because like, wait, Python three,
link |
01:38:23.360
that was in 2020, right?
link |
01:38:25.480
When we finally ended the support for Python two
link |
01:38:27.760
or at least 2017.
link |
01:38:29.000
The reason it took a long time,
link |
01:38:30.880
a lot of time, I think it was because one of the things is
link |
01:38:33.320
there wasn't much to like about Python three.
link |
01:38:36.240
3.0, 3.1, it really wasn't until 3.3.
link |
01:38:40.280
Like I consider Python 3.3 to be Python 3.0.
link |
01:38:43.600
But it wasn't until Python 3.3
link |
01:38:44.880
that I felt there's enough stuff in it
link |
01:38:47.200
to make it worth anybody using it, right?
link |
01:38:49.800
And then 3.4 started to be, oh yeah, I want that.
link |
01:38:52.600
And then 3.5 has the matrix multiply operator,
link |
01:38:54.880
and now it's like, okay, we gotta use that.
link |
01:38:56.520
Plus the libraries that started leveraging
link |
01:38:58.400
some of the features of Python three.
link |
01:38:59.600
Exactly.
link |
01:39:00.760
So it really, the challenge was it was,
link |
01:39:03.800
but it also illustrated a truism that, you know,
link |
01:39:07.400
when you have inertia,
link |
01:39:08.240
when you have a group of people using something,
link |
01:39:10.480
it's really hard to move them away from it.
link |
01:39:11.960
You can't just change the world on them.
link |
01:39:13.920
And Python three, you know, made some,
link |
01:39:15.440
I think it fixed some things Guido had always hated.
link |
01:39:17.240
I think he didn't like the fact
link |
01:39:18.440
that print was a statement.
link |
01:39:19.440
He wanted to make it a function.
link |
01:39:20.760
But in some sense, that's a bit of a gratuitous change
link |
01:39:23.200
to the language.
link |
01:39:24.120
And you could argue, and people have,
link |
01:39:27.320
but one of the challenges was there weren't enough features
link |
01:39:31.520
and too many just changes without features.
link |
01:39:34.960
And so the empathy for the end user
link |
01:39:37.440
as to why they would switch wasn't there.
link |
01:39:40.480
I think also it illustrated just the funding realities.
link |
01:39:42.960
Like Python wasn't funded.
link |
01:39:45.040
Like it was also a project
link |
01:39:46.160
with a bunch of volunteer labor, right?
link |
01:39:48.280
It had more people, so more volunteer labor,
link |
01:39:50.240
but it was still, it was funded in the sense
link |
01:39:52.240
that at least Guido had a job.
link |
01:39:53.480
And I've learned some of the behind the scenes on that now
link |
01:39:55.880
since talking to people who have lived through it
link |
01:39:57.840
and maybe not on air, we can talk about some of that.
link |
01:40:00.560
But it's interesting to see, but Guido had a job,
link |
01:40:03.640
but his full time job wasn't just to work on Python.
link |
01:40:07.080
Like he had other things to do.
link |
01:40:08.880
Just wild.
link |
01:40:09.880
It is wild, isn't it?
link |
01:40:10.720
It's wild how few people are funded.
link |
01:40:13.320
Yes.
link |
01:40:14.160
And how much impact they have.
link |
01:40:15.200
Yes.
link |
01:40:16.160
Maybe that's a feature not a bug, I don't know.
link |
01:40:17.920
Maybe, yes, exactly.
link |
01:40:19.080
At least early on, like it's sort of, I know, yeah.
link |
01:40:21.840
It's like Olympic athletes are often severely underfunded,
link |
01:40:25.160
but maybe that's what brings out the greatness.
link |
01:40:27.360
Perhaps, yes, correct.
link |
01:40:28.520
No, exactly.
link |
01:40:29.680
Maybe this is the essential part of it.
link |
01:40:31.880
Because I do think about that in terms of,
link |
01:40:33.680
I currently have an incubator for open source startups.
link |
01:40:36.200
Like what I'm trying to do right now
link |
01:40:37.640
is create the environment I wished had existed
link |
01:40:40.480
when I was leaving academia with NumPy
link |
01:40:42.880
and trying to figure out what to do.
link |
01:40:44.120
I'm trying to create those opportunities and environments.
link |
01:40:46.120
So, and that's what drives me still,
link |
01:40:49.320
is how do I make the world easier
link |
01:40:50.760
for the open source entrepreneur?
link |
01:40:52.600
So let me stay, I mean, I could probably stay on NumPy
link |
01:40:55.960
for a long time, but this is a fun question.
link |
01:41:00.960
So Andrej Karpathy leads the Tesla Autopilot team,
link |
01:41:04.680
and he's also one of the most like legit programmers I know.
link |
01:41:10.720
It's like he builds stuff from scratch a lot,
link |
01:41:13.760
and that's how he builds intuition about how a problem works.
link |
01:41:16.200
He just builds it from scratch, and I always love that.
link |
01:41:18.320
And the primary language he uses is Python
link |
01:41:21.320
for the intuition building.
link |
01:41:23.080
But he posted something on Twitter saying
link |
01:41:27.600
that they got a significant improvement
link |
01:41:31.280
on some aspect of their like data loading, I think,
link |
01:41:35.640
by switching away from np.sqrt,
link |
01:41:39.840
so NumPy's implementation of square root,
link |
01:41:42.160
to math.sqrt, and then somebody else commented
link |
01:41:44.520
that you can get even a much greater improvement
link |
01:41:48.120
by using the vanilla Python square root, which is like.
link |
01:41:52.600
Power 0.5.
link |
01:41:53.640
Power 0.5.
link |
01:41:55.200
And it's fascinating to me, I just wanted to.
link |
01:41:58.640
So that was some shade throwing at some.
link |
01:42:02.080
No, no, and yes, we're talking about.
link |
01:42:04.640
It's a good way to ask the trade off
link |
01:42:08.080
between usability and efficiency broadly in NumPy,
link |
01:42:12.080
but also on these specific weird quirks
link |
01:42:14.920
of like a single function.
link |
01:42:16.680
Yep, so on that point, if you use a NumPy math function
link |
01:42:21.360
on a scalar, it's gonna be slower
link |
01:42:25.000
than using a Python function on that scalar.
link |
01:42:27.960
But because the math object in NumPy is more complicated,
link |
01:42:33.800
because you can also call that math object on an array.
link |
01:42:36.760
And so effectively, it goes through a similar machine.
link |
01:42:39.200
There aren't enough of the, which you would do
link |
01:42:41.840
and you could do like checks and fast paths.
link |
01:42:45.960
So yeah, if you're basically doing a list,
link |
01:42:48.800
if you run over a list, in fact,
link |
01:42:50.680
for problems that are less than 1,000,
link |
01:42:53.700
even maybe 10,000 is probably the,
link |
01:42:55.320
if you're going more than 10,000,
link |
01:42:56.900
that's where you definitely need to be using arrays.
link |
01:42:59.080
But if you're less than that, and for reading,
link |
01:43:01.200
if you're doing a reading process
link |
01:43:02.760
and essentially it's not compute bound, it's IO bound.
link |
01:43:05.600
And so you're really taking lists of 1,000 at a time
link |
01:43:08.480
and doing work on it.
link |
01:43:09.540
Yeah, you could be faster just using Python,
link |
01:43:11.680
straight up Python.
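The scalar-versus-array trade-off described here can be checked with a small timing sketch. The variable names and repetition count below are arbitrary choices for illustration, and absolute timings depend on the machine and library versions, so no expected numbers are shown.

```python
import math
import timeit

import numpy as np

x = 2.0

# Three ways to take the square root of a single Python float.
results = {
    "np.sqrt(x)": np.sqrt(x),
    "math.sqrt(x)": math.sqrt(x),
    "x ** 0.5": x ** 0.5,
}

# Time each expression on a scalar; the numbers vary by machine and version.
namespace = {"np": np, "math": math, "x": x}
for expr in results:
    seconds = timeit.timeit(expr, globals=namespace, number=20_000)
    print(f"{expr:>14}: {seconds:.4f} s for 20,000 calls")

# On a large array the comparison changes: one ufunc call replaces a Python loop.
big = np.arange(100_000, dtype=np.float64)
vectorized = np.sqrt(big)
looped = [math.sqrt(v) for v in big.tolist()]
```

All three scalar expressions compute the same value; the difference is only in how much machinery runs around the underlying square root.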
link |
01:43:12.740
See, but also, and this is the side to the top,
link |
01:43:16.640
there's the fundamental questions
link |
01:43:18.680
when you look at the long arc of history,
link |
01:43:21.240
it's very possible that np.sqrt is much faster.
link |
01:43:25.560
It could be.
link |
01:43:26.400
So like in terms of like, don't worry about it,
link |
01:43:29.480
it's the evils of over optimization or whatever,
link |
01:43:32.420
all the different quotes around that,
link |
01:43:34.040
is sometimes obsessing about this particular little quirk
link |
01:43:39.520
is not sufficient.
link |
01:43:41.720
For somebody like, if you're trying to optimize your path,
link |
01:43:45.220
I mean, I agree, premature optimization
link |
01:43:47.680
creates all kinds of challenges, right?
link |
01:43:49.320
Because now, but you may have to do it.
link |
01:43:51.840
I believe the quote is, it's the root of all evil.
link |
01:43:53.880
It's the root of all evil, right?
link |
01:43:55.560
Let's give Donald Knuth, I think,
link |
01:43:57.040
or is he more than somebody else?
link |
01:43:59.160
Well, Don Knuth is kind of like Mark Twain,
link |
01:44:00.800
people just attribute stuff to him, I don't know.
link |
01:44:02.880
And it's fine because he's brilliant.
link |
01:44:04.640
So, no, I was a LaTeX user myself,
link |
01:44:07.640
and so I have a lot of respect,
link |
01:44:09.280
and he did more than that, of course,
link |
01:44:10.820
but yeah, someone I really appreciate
link |
01:44:14.120
in the computer science space.
link |
01:44:15.640
Yeah, I don't, I think that's appropriate.
link |
01:44:17.080
There's a lot of little things like that,
link |
01:44:18.320
where people actually, if you understood it,
link |
01:44:20.120
you go, yeah, of course, that's the case.
link |
01:44:22.640
And the other part, the other part I didn't mention,
link |
01:44:25.040
and Numba was a thing we wrote early on,
link |
01:44:27.960
and I was really excited by Numba
link |
01:44:29.040
because it's something we wanted,
link |
01:44:30.040
it was a compiler for Python syntax,
link |
01:44:32.160
and I wanted it from the beginning of writing NumPy
link |
01:44:35.440
because of this function question,
link |
01:44:38.280
like taking, the power of arrays
link |
01:44:41.900
is really that you can write functions using all of it.
link |
01:44:45.120
It has implicit looping, right?
link |
01:44:47.000
So you don't worry about,
link |
01:44:47.840
I write this n dimensional for loop
link |
01:44:49.200
with four loops, four for statements.
link |
01:44:51.240
You just say, oh, big four dimensional array,
link |
01:44:53.600
I'm gonna do this operation, this plus, this minus,
link |
01:44:55.760
this reduction, and you get this,
link |
01:44:57.680
it's called vectorization in other areas,
link |
01:44:59.560
but you can basically think at a high level
link |
01:45:01.440
and get massive amounts of computation done
link |
01:45:03.640
with the added benefit of,
link |
01:45:06.200
oh, it can be parallelized easily.
link |
01:45:08.040
It can be put in parallel.
link |
01:45:09.040
You don't have to think about that.
link |
01:45:10.000
In fact, it's worse to go decompose your,
link |
01:45:12.720
you write the for loops
link |
01:45:14.160
and then try to infer parallelism from for loops.
link |
01:45:16.280
That's actually a harder problem
link |
01:45:17.600
than to take the array problem
link |
01:45:19.640
and just automatically parallelize that problem.
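The contrast between four explicit for statements and a single array expression can be sketched as follows. The array shapes and the seeded random data are arbitrary assumptions made for the example; the point is only that both versions compute the same result.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 4, 5, 6))
b = rng.random((3, 4, 5, 6))

# Explicit version: four nested for statements, one per dimension.
looped = np.empty_like(a)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        for k in range(a.shape[2]):
            for m in range(a.shape[3]):
                looped[i, j, k, m] = a[i, j, k, m] + b[i, j, k, m]
looped_total = looped.sum()

# Array version: the looping is implicit in each operation.
vectorized = a + b
vectorized_total = vectorized.sum()  # a reduction over every axis
```

The array version states what to compute without fixing an iteration order, which is what leaves a library free to split the work up.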
link |
01:45:22.040
That's what, and so functions in NumPy
link |
01:45:25.320
are called universal functions, ufuncs.
link |
01:45:27.080
So square root is an example of a ufunc.
link |
01:45:29.000
There are others, sine, cosine, add, subtract.
link |
01:45:32.400
In fact, one of the first libraries to SciPy
link |
01:45:34.520
was something called Special
link |
01:45:35.520
where I added Bessel functions
link |
01:45:36.920
and all these special functions that come up in physics
link |
01:45:40.240
and I added them as ufuncs so they could work on arrays.
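A short sketch of what being a ufunc means in practice, using only functions that ship with NumPy itself (the sample values are arbitrary):

```python
import numpy as np

# np.sqrt, np.sin, and np.add are all instances of the same ufunc type.
kinds = [isinstance(f, np.ufunc) for f in (np.sqrt, np.sin, np.cos, np.add, np.subtract)]

values = np.array([1.0, 4.0, 9.0])

roots = np.sqrt(values)              # applied element by element
total = np.add.reduce(values)        # ufunc method: fold the array with add
running = np.add.accumulate(values)  # ufunc method: running totals

# A plain Python float is accepted too, and comes back as a NumPy scalar.
scalar_root = np.sqrt(16.0)
```

Because every ufunc shares this interface, a function added as a ufunc picks up array support, `reduce`, and `accumulate` without extra work.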
link |
01:45:43.040
So I understood ufuncs very, very well
link |
01:45:44.720
from day one inside of Numeric.
link |
01:45:45.960
That was one of the things we tried to make better
link |
01:45:47.320
in NumPy was how do they work?
link |
01:45:49.120
Can they do broadcasting?
link |
01:45:50.360
What does broadcasting mean?
link |
01:45:51.960
But one of the problems is, okay,
link |
01:45:54.600
what do I do with a Python scalar?
link |
01:45:57.320
So what happens, the Python scalar gets broadcast
link |
01:45:59.800
to a zero dimensional array
link |
01:46:01.320
and then it goes through the whole same machinery
link |
01:46:02.800
as if it were a 10,000 element array.
link |
01:46:05.080
And then it kind of unpacks the element
link |
01:46:07.640
and then does the addition.
link |
01:46:09.880
That's not to mention the function it calls
link |
01:46:12.600
in the case of square root
link |
01:46:13.640
is just the C library square root, right?
link |
01:46:15.960
In some cases, like Python's power,
link |
01:46:18.160
there's some optimizations they're doing
link |
01:46:20.360
that could be faster
link |
01:46:21.520
than just calling the C library square root.
link |
01:46:23.760
In the interpreter or in the?
link |
01:46:25.320
No, in the C code, in the Python runtime.
link |
01:46:27.640
In the Python runtime, so they really optimize it
link |
01:46:30.960
and they have the freedom to do that
link |
01:46:32.120
because they don't have to worry about.
link |
01:46:32.960
It's just a scalar.
link |
01:46:34.080
It's just a scalar.
link |
01:46:34.920
Right, they don't have to worry about the fact
link |
01:46:36.200
that, oh, this could be an object with many pieces.
link |
01:46:39.360
The ufunc machine is also generic
link |
01:46:41.080
in the sense that typecasting and broadcasting,
link |
01:46:44.600
broadcasting's idea of I'm gonna go,
link |
01:46:46.160
I have a zero dimensional array,
link |
01:46:47.360
I have a scalar with a four dimensional array
link |
01:46:49.240
and I add them.
link |
01:46:50.480
Oh, I have to kind of coerce the shape of this guy
link |
01:46:54.640
to make it work against the whole four dimensional array.
link |
01:46:56.880
So it's the idea of I can do a one dimensional array
link |
01:46:59.680
against a two dimensional array and have it make sense.
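Broadcasting as described here can be sketched with two small cases: a one dimensional array against a two dimensional array, and a scalar against a four dimensional array. The shapes and values are arbitrary assumptions chosen to keep the example readable.

```python
import numpy as np

grid = np.zeros((2, 3))              # two dimensional, shape (2, 3)
row = np.array([10.0, 20.0, 30.0])   # one dimensional, shape (3,)

# The 1-D array is treated as shape (1, 3) and stretched along the first axis.
shifted = grid + row
combined_shape = np.broadcast(grid, row).shape

# A scalar behaves like a zero dimensional array and stretches along every axis.
cube = np.ones((2, 3, 4, 5))
scaled = cube + 1.5
scalar_ndim = np.ndim(1.5)
```

No data is copied to do the stretching; the shapes are reconciled first and the element-wise loop then runs over the combined shape.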
link |
01:47:02.200
Well, that's what NumPy does is it challenges you
link |
01:47:04.040
to reformulate, rethink your problem
link |
01:47:07.040
as a multi dimensional array problem
link |
01:47:09.080
versus move away from scalars completely.
link |
01:47:12.640
Right, exactly, exactly.
link |
01:47:14.240
In fact, that's where some of the edge cases boundaries are
link |
01:47:16.680
is that, well, they're still there
link |
01:47:18.960
and this is where array scalars are particular.
link |
01:47:21.080
So array scalars are particularly bad
link |
01:47:23.120
in the sense that they were written
link |
01:47:24.360
so that you could optimize the math on them,
link |
01:47:26.840
but that hasn't happened.
link |
01:47:29.040
And so their default is to coerce the array scalar
link |
01:47:32.800
to a zero dimensional array
link |
01:47:33.760
and then use the NumPy machinery.
link |
01:47:36.000
That's what, and you could specialize,
link |
01:47:38.200
but it doesn't happen all the time.
link |
01:47:39.960
So in fact, when we first wrote Numba,
link |
01:47:41.760
we do comparisons and say, look, it's 1000X speed up.
link |
01:47:45.720
We were lying a little bit in the sense that,
link |
01:47:47.160
well, first do the 40X slowdown
link |
01:47:50.240
of using the array scalars inside of a loop.
link |
01:47:52.280
Cause if you used Python scalars,
link |
01:47:53.560
you'd already be 10 times faster.
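The difference between array scalars and Python scalars inside a loop can be seen in a small sketch. The helper name `sum_of_squares` and the array size are assumptions made for illustration, and Numba itself is not used here; this only shows the three baselines being compared.

```python
import numpy as np

arr = np.linspace(0.0, 1.0, 1_000)

# Indexing an array yields a NumPy array scalar, not a Python float.
array_scalar = arr[0]
python_scalar = arr.tolist()[0]


def sum_of_squares(values):
    """Accumulate v * v in a plain Python loop."""
    total = 0.0
    for v in values:
        total += v * v
    return total


from_array_scalars = sum_of_squares(arr)           # iterates np.float64 objects
from_python_floats = sum_of_squares(arr.tolist())  # iterates plain Python floats
from_vectorized = float(np.dot(arr, arr))          # no Python-level loop at all
```

All three give the same answer up to rounding; timing them (for example with `timeit`) shows how much of the cost is the per-element object handling rather than the arithmetic.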
link |
01:47:56.200
But then we would get a hundred times faster
link |
01:47:58.080
over that using just compilation.
link |
01:48:00.320
But what we do is compile the loop
link |
01:48:01.600
from out of the interpreter to machine code.
link |
01:48:04.000
And then that's always been the power of Python
link |
01:48:06.280
is this extensibility so that you can,
link |
01:48:08.280
cause people say, oh, Python's so slow.
link |
01:48:09.680
Well, sure, if you do all your logic
link |
01:48:11.520
in the runtime of the Python interpreter, yeah.
link |
01:48:13.920
But the power is that you don't have to.
link |
01:48:15.800
You write all the logic,
link |
01:48:17.260
what you do in the high level is just high level logic.
link |
01:48:19.860
And the actual calls you're making
link |
01:48:21.920
could be on gigabyte arrays of data.
link |
01:48:24.400
And that's all done at compiled speeds.
link |
01:48:26.880
And the fact that integration, one, can happen,
link |
01:48:30.320
but two is separable.
link |
01:48:32.420
That's one of the, the language like Julia says,
link |
01:48:35.240
we're going to be all in one.
link |
01:48:36.380
You can do all of it together.
link |
01:48:37.400
And then there's, the jury's out, is that possible?
link |
01:48:39.880
I tend to think that you're going to,
link |
01:48:41.760
there's separate concerns there.
link |
01:48:43.280
You want to precompile.
link |
01:48:44.320
In fact, generally you will want to precompile your,
link |
01:48:47.560
some of your loops.
link |
01:48:48.400
Like SciPy has a compilation step.
link |
01:48:50.160
To install SciPy, it takes about two hours.
link |
01:48:53.240
If you have many machines,
link |
01:48:54.080
maybe you can get it down to one hour.
link |
01:48:55.440
But to compile those libraries takes about, takes a while.
link |
01:48:57.920
You don't want to do that at runtime.
link |
01:48:59.920
You don't want to do that all the time.
link |
01:49:00.800
You want to have this precompiled binary available
link |
01:49:02.720
that you're then just linking into.
link |
01:49:04.400
So there's real questions about the whole source code.
link |
01:49:09.040
Code is, running binary code is more than source code.
link |
01:49:11.840
It's creating object code, it's the linker, it's the loader,
link |
01:49:14.480
it's the how does that interpret it
link |
01:49:15.600
inside of virtual memory space.
link |
01:49:17.640
There's a lot of details there that actually
link |
01:49:19.160
I didn't understand for a long time
link |
01:49:20.520
until I read books on the topic.
link |
01:49:23.000
And it led to, the more you know, the better off you are
link |
01:49:27.060
and you can do more details,
link |
01:49:28.440
but sometimes it helps with abstractions too.
link |
01:49:31.280
Well, the problem, as we mentioned earlier
link |
01:49:33.480
with abstractions is you kind of sometimes assume
link |
01:49:37.700
that whoever implemented this thing
link |
01:49:41.520
had your case in mind and found the optimal solution.
link |
01:49:45.000
Yes.
link |
01:49:45.840
Or like you assume certain things.
link |
01:49:47.320
I mean, there's a lot of,
link |
01:49:48.160
Correct.
link |
01:49:49.000
One of the really powerful things to me early on,
link |
01:49:52.800
I mean, it sounds silly to say, but with Python,
link |
01:49:55.480
probably one of the reasons I fell in love with it
link |
01:49:58.440
is dictionaries.
link |
01:49:59.800
Yes.
link |
01:50:00.920
So obviously probably most languages
link |
01:50:03.680
have some mapping concept,
link |
01:50:06.440
but it felt like it was a first class citizen
link |
01:50:09.040
and it was just my brain was able to think in dictionaries.
link |
01:50:12.200
But then there's the thing that I guess I still use
link |
01:50:14.640
to this day is ordered dictionaries
link |
01:50:16.920
because that seems like a more natural way
link |
01:50:20.120
to construct dictionaries.
link |
01:50:21.680
Yeah.
link |
01:50:22.520
And from a computer science perspective,
link |
01:50:23.720
the running time cost is not that significant,
link |
01:50:26.000
but there's a lot of things to understand about dictionaries
link |
01:50:30.400
that the abstraction kind of
link |
01:50:33.800
doesn't necessarily incentivize you to understand.
link |
01:50:37.400
Right, do you really understand the notion of a hash map
link |
01:50:39.400
and how the dictionary is implemented?
link |
01:50:41.080
But you're right.
link |
01:50:42.080
Dictionaries are a good example
link |
01:50:43.440
of an abstraction that's powerful.
link |
01:50:44.920
And I agree with you.
link |
01:50:46.000
I agree, I love dictionaries too.
link |
01:50:47.800
Took me a while to understand that, but once you do,
link |
01:50:49.160
you realize, oh, they're everywhere.
link |
01:50:50.280
And Python uses them everywhere too.
link |
01:50:52.760
Like it's actually constructed,
link |
01:50:54.240
one of the foundational things is dictionaries
link |
01:50:55.760
and it does everything with dictionaries.
link |
01:50:57.560
So it is, it's powerful.
link |
01:50:58.600
Ordered dictionaries came later,
link |
01:51:00.160
but it is very, very powerful.
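Two of the points made here, that Python itself is built on dictionaries and that ordered dictionaries arrived later, can be sketched with the standard library alone. The `Point` class and the sample keys are hypothetical names used only for illustration.

```python
from collections import OrderedDict


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


# Instance attributes live in an ordinary dictionary.
p = Point(1, 2)
attrs = vars(p)  # {'x': 1, 'y': 2}

# Since Python 3.7, plain dicts are guaranteed to keep insertion order.
d = {"b": 1, "a": 2}
d["c"] = 3
keys = list(d)  # ['b', 'a', 'c']

# OrderedDict adds order-aware operations such as move_to_end.
od = OrderedDict(d)
od.move_to_end("b")
od_keys = list(od)  # ['a', 'c', 'b']

# Equality is order-sensitive for OrderedDict but not for dict.
same_as_dicts = {"a": 1, "b": 2} == {"b": 2, "a": 1}
same_as_ordered = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
```

With a plain dict the ordering is a convenience; with `OrderedDict` it is part of the value's identity.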
link |
01:51:02.200
It took me a little while coming
link |
01:51:03.400
from just the array programming entirely
link |
01:51:05.960
to understand these other objects,
link |
01:51:07.360
like dictionaries and lists and tuples and binary trees.
link |
01:51:11.600
Like I said, I wasn't a computer scientist,
link |
01:51:13.360
I studied arrays first.
link |
01:51:15.120
And so I was very array centric.
link |
01:51:16.800
And you realize, oh, these others
link |
01:51:17.960
do have purposes and value, actually.
link |
01:51:21.200
I agree.
link |
01:51:22.040
There's a friendliness about,
link |
01:51:24.320
like one way to think about arrays
link |
01:51:26.760
is arrays are just like full of numbers,
link |
01:51:31.920
but to make them accessible to humans
link |
01:51:35.000
and make them less error prone to human users,
link |
01:51:38.700
sometimes you want to attach names,
link |
01:51:41.480
human interpretable names
link |
01:51:43.120
that are sticky to those arrays.
link |
01:51:44.720
So that's how you start to think about dictionaries
link |
01:51:47.160
is you start to convert numbers
link |
01:51:50.520
into something that's human interpretable.
link |
01:51:52.120
And that's actually the tension I've had with NumPy
link |
01:51:55.320
because I've built so much tooling
link |
01:51:58.160
around human interpretability
link |
01:52:02.320
and also protecting me from a year later
link |
01:52:05.680
not making the mistakes by being,
link |
01:52:07.960
I wanted to force myself to use English versus numbers.
link |
01:52:12.880
Yes, so there's a project called Labeled Arrays.
link |
01:52:15.680
Like very early it was recognized that,
link |
01:52:18.040
oh, we're indexing NumPy with just numbers,
link |
01:52:21.320
all the columns and particularly the dimensions.
link |
01:52:23.640
I mean, if you have an image,
link |
01:52:25.520
you don't necessarily need to label each column or row,
link |
01:52:27.680
but if you have a lot of images
link |
01:52:29.160
or you have another dimension,
link |
01:52:30.440
you'd at least like to label the dimension
link |
01:52:31.640
as this is X, this is Y, this is Z,
link |
01:52:33.120
or to give it some human meaning
link |
01:52:34.640
or some domain specific meaning.
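The idea of labeling dimensions rather than individual rows or columns can be sketched with a thin wrapper around a NumPy array. The `LabeledArray` class below is hypothetical and written only for this example; it is not the API of the Labeled Arrays project or of Pandas.

```python
import numpy as np


class LabeledArray:
    """Hypothetical minimal wrapper that attaches a name to each dimension."""

    def __init__(self, data, dims):
        data = np.asarray(data)
        if data.ndim != len(dims):
            raise ValueError("need exactly one name per dimension")
        self.data = data
        self.dims = tuple(dims)

    def axis(self, name):
        # Translate a human-readable name into a positional axis number.
        return self.dims.index(name)

    def mean(self, dim):
        # Reduce over the named dimension and drop its label.
        axis = self.axis(dim)
        remaining = self.dims[:axis] + self.dims[axis + 1:]
        return LabeledArray(self.data.mean(axis=axis), remaining)


# Two 3-by-4 "images" stacked along a dimension named "image".
images = LabeledArray(np.arange(24.0).reshape(2, 3, 4), dims=("image", "y", "x"))
average_image = images.mean("image")
print(average_image.dims)  # ('y', 'x')
```

Writing `images.mean("image")` instead of `mean(axis=0)` is the kind of protection against later mistakes described in the conversation: the name still means the same thing if the array layout changes.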
link |
01:52:36.760
That was one of the impetuses for Pandas actually
link |
01:52:39.680
was just, oh, we do need to label these things.
link |
01:52:43.040
And Label Array was an attempt to add
link |
01:52:45.240
that like a lighter weight version of that.
link |
01:52:47.680
And there's been, like, that's an example of something
link |
01:52:49.360
I think NumPy could add, could be added to NumPy,
link |
01:52:53.080
but one of the challenges again, how do you fund this?
link |
01:52:55.000
Like I said, one of the tragedies I think is that,
link |
01:52:58.280
so I never had the chance to,
link |
01:53:00.240
I was never paid to work on NumPy, right?
link |
01:53:02.360
So I've always just done it in my spare time,
link |
01:53:04.400
always taken from one thing,
link |
01:53:05.880
taken from another thing to do it.
link |
01:53:07.920
And at the time, I mean, today,
link |
01:53:09.800
it would be the wrong day and today,
link |
01:53:11.000
like paying me to work on NumPy now
link |
01:53:12.160
would not be a good use of effort,
link |
01:53:13.480
but we are finally at Quansight Labs,
link |
01:53:16.640
I'm actually paying people to work on NumPy and SciPy,
link |
01:53:19.440
which is I'm thrilled with, I'm excited by.
link |
01:53:22.000
I've wanted to do that.
link |
01:53:22.840
That's what I always wanted to do from day one.
link |
01:53:24.280
It just took me a while to figure out a mechanism to do that.
link |
01:53:27.640
Even like in the university setting,
link |
01:53:29.680
respecting that, like pushing students,
link |
01:53:33.840
young minds and young graduate students to contribute
link |
01:53:38.000
and then figuring out financial mechanisms
link |
01:53:41.160
that enable them to contribute
link |
01:53:43.280
and then sort of reward them
link |
01:53:45.280
for their innovative scientific journey,
link |
01:53:48.000
that would be nice.
link |
01:53:49.160
But then also just a better allocation of resources.
link |
01:53:53.360
It's the 20 year anniversary since 9/11
link |
01:53:55.760
and I was just looking, we spent over $6 trillion
link |
01:53:59.240
in the Middle East after 9/11 in the various efforts there.
link |
01:54:04.560
And sort of to put politics and all that aside,
link |
01:54:08.040
it's just, you think about the education system,
link |
01:54:10.120
all the other ways we could have
link |
01:54:11.320
possibly allocated that money.
link |
01:54:14.280
To me, to take it back,
link |
01:54:16.560
the amount of impact you would have
link |
01:54:21.200
by allocating a little bit of money to the programmers
link |
01:54:26.360
that build the tools that run the world is fascinating.
link |
01:54:30.600
It is.
link |
01:54:32.600
I don't know, I think, again,
link |
01:54:34.920
there is some aspect to being broke
link |
01:54:38.040
as somewhat of a feature, not a bug,
link |
01:54:40.240
that you make sure that you're valued.
link |
01:54:42.320
But you can still manage that.
link |
01:54:43.440
Right, no, I know.
link |
01:54:45.320
But I don't think that's a big part.
link |
01:54:47.040
So it's like, I think you can have enough money
link |
01:54:50.720
and actually be wealthy while maintaining your values.
link |
01:54:53.880
Agreed, agreed.
link |
01:54:55.520
There's an old adage that nations that trade together
link |
01:54:57.800
don't go to war together.
link |
01:54:59.440
I've often thought about nations that code together.
link |
01:55:01.680
Yeah, code together.
link |
01:55:02.520
Right?
link |
01:55:03.360
I love that.
link |
01:55:04.200
Because one of the things I love about open source
link |
01:55:05.360
is it's global, it's multinational.
link |
01:55:07.880
Like there aren't national boundaries.
link |
01:55:09.160
One of the challenges with business and open source
link |
01:55:10.760
is the fact that, well, business is national.
link |
01:55:12.800
Like businesses are entities
link |
01:55:13.960
that are recognized in legal jurisdictions, right?
link |
01:55:16.240
And have laws that are respected in those jurisdictions
link |
01:55:18.280
and hiring, and yet the open source ecosystem
link |
01:55:21.320
is not, it's not there.
link |
01:55:23.040
Like currently, one of the problems we're solving
link |
01:55:25.080
is hiring people all over the world, right?
link |
01:55:27.200
Because we, it's a global effort.
link |
01:55:29.600
And I've had the chance to work, and I've loved the chance.
link |
01:55:31.920
I've never been to like Iran,
link |
01:55:35.280
but I once had a conference
link |
01:55:36.800
where I was able to talk to people there, right?
link |
01:55:38.640
And talk to folks in Pakistan.
link |
01:55:40.920
I've never been there, but we had a call
link |
01:55:44.080
where there were people there,
link |
01:55:45.320
like just scientists and normal people.
link |
01:55:47.600
And there's a certain amount of humanizing, right?
link |
01:55:52.640
That gets away from the,
link |
01:55:54.360
like we often get the memes of society
link |
01:55:56.200
that bubble up and get discussed,
link |
01:55:58.560
but the memes are not even an accurate reflection
link |
01:56:00.760
of the reality of what people are.
link |
01:56:02.400
Well, if you look at the major power centers
link |
01:56:05.440
that are leading to something like cyber war
link |
01:56:08.240
in the next few decades,
link |
01:56:10.000
it's the United States, it's Russia, and China.
link |
01:56:13.320
And those three countries in particular
link |
01:56:16.080
have incredible developers.
link |
01:56:18.240
So if they work together, I think that's one way,
link |
01:56:21.360
the politicians can do their stupid bickering,
link |
01:56:23.360
but like there's a layer of infrastructure, of humanity.
link |
01:56:27.360
If they collaborate together,
link |
01:56:29.400
that I think can prevent major military conflict,
link |
01:56:34.080
which would, I think most likely happen at the cyber level
link |
01:56:37.840
versus the actual hot war level.
link |
01:56:39.800
You're right.
link |
01:56:40.640
You know, I think that's a good prediction.
link |
01:56:43.320
Nations that code together don't go to war together.
link |
01:56:46.560
Don't go to war together.
link |
0