Travis Oliphant: NumPy, SciPy, Anaconda, Python & Scientific Programming | Lex Fridman Podcast #224
link |
The following is a conversation with Travis Oliphant,
link |
one of the most impactful programmers
link |
and data scientists ever.
link |
He created NumPy, SciPy, and Anaconda.
link |
NumPy formed the foundation
link |
of tensor based machine learning in Python,
link |
SciPy formed the foundation
link |
of scientific programming in Python,
link |
and Anaconda, specifically with Conda,
link |
made Python more accessible to a much larger audience.
link |
Travis's life work across a large number of programming
link |
and entrepreneurial efforts has and will continue
link |
to have immeasurable impact on millions of lives
link |
by empowering scientists and engineers
link |
in big companies, small companies,
link |
and open source communities to take on difficult problems
link |
and solve them with the power of programming.
link |
Plus, he's a truly kind human being,
link |
which is something that when combined with vision
link |
and ambition makes for a great leader
link |
and a great person to chat with.
link |
To support this podcast,
link |
please check out our sponsors in the description.
link |
This is the Lex Fridman Podcast,
link |
and here is my conversation with Travis Oliphant.
link |
What was the first computer program you've ever written?
link |
Whoa, that's a good question.
link |
I think it was in fourth grade.
link |
Just a simple loop in BASIC.
link |
BASIC. BASIC, yeah, on an Atari 800,
link |
Atari 400, I think, or maybe it was an Atari 800.
link |
It was a part of a class,
link |
and we were just writing BASIC loops to print things out.
link |
Did you use GOTO statements?
link |
Yes, yes, we used GOTO statements.
link |
I remember in the early days,
link |
that's when I first realized
link |
there's like principles to programming,
link |
when I was told, don't use GOTO statements.
link |
Those are bad software engineering principles,
link |
like it goes against what great, beautiful code is.
link |
I was like, oh, okay, there's rules to this game.
link |
I didn't see that until high school
link |
when I took an AP computer science course.
link |
I did a lot of other kinds of just programming in TI,
link |
but finally, when I took an AP computer science course
link |
That's, yeah, it was Pascal.
link |
That's when I, oh, there are these principles.
link |
No, I didn't take C until the next year in college.
link |
I had a course in C, but I haven't done much in Pascal,
link |
just that AP computer science course.
link |
Now, sorry for the romanticized question,
link |
but when did you first fall in love with programming?
link |
Oh, man, good question.
link |
I think actually when I was 10,
link |
my dad got us a TI Timex Sinclair,
link |
and he was excited about the spreadsheet capability,
link |
and then, but I made him get the BASIC,
link |
the add-ons so we could actually program in BASIC,
link |
and just being able to write instructions
link |
and have the computer do something.
link |
Then we got a TI-99/4A when I was about 12,
link |
and I would just, it had sprites and graphics and music.
link |
You could actually program it to do music.
link |
That's when I really sort of fell in love with programming.
link |
So this is a full, like a real computer
link |
with like, with memory and storage,
link |
processors and whatnot,
link |
because you say TI. Yeah, the Timex Sinclair
link |
was one of the very first, it was a cheap, cheap,
link |
like, I think it was, well, it was still expensive,
link |
but it was 2K of memory.
link |
We got the 16K add on pack,
link |
but yeah, it had memory, and you could program it.
link |
You had the, in order to store your programs,
link |
you had to attach a tape drive.
link |
Remember that old sound that would play
when modems converted digital bits
to audio, with files saved on a tape drive.
link |
Still remember that sound, but that was the storage.
link |
And what was the programming language, do you remember?
link |
It was BASIC. It was BASIC.
link |
And then they had a VisiCalc,
link |
and so a little bit of spreadsheet programming
link |
in VisiCalc, but mostly just some BASIC.
link |
Do you remember what kind of things drew you to programming?
link |
Was it working with data, was it video games?
link |
Games, math, mathy stuff?
link |
Yeah, I've always loved math,
link |
and a lot of people think they don't like math
link |
because I think when they're exposed to it early,
link |
it's about memory.
link |
When you're exposed to math early,
link |
if you have a good short-term memory,
you can remember your times tables.
link |
And I do have a reasonably, I mean, not perfect,
link |
but a reasonably long little short term memory buffer.
link |
And so I did great at times tables.
link |
I said, oh, I'm good at math.
link |
But I started to really like math,
link |
just the problem solving aspect.
link |
And so computing was problem solving applied.
link |
And so that's always kind of been the draw,
link |
kind of coupled with the mathematics.
link |
Did you ever see the computer as like an extension
link |
of your mind, like something able to achieve?
link |
It's just like a little set of puzzles
link |
that you can play with and you can play with math puzzles.
link |
Yeah, it was too rudimentary early on.
link |
Like it was sort of, yeah, it was a lot of work
link |
to actually take a thought you'd have
link |
and actually get it implemented.
link |
And that's still work, but it's getting easier.
link |
And so yeah, I would say that's definitely
link |
what attracted me to Python
link |
is that that was more real, right?
link |
I could think in Python.
link |
Speaking of foreign language,
link |
I only speak one other language fluently besides English: Spanish.
link |
And I remember the day when I would dream in Spanish
link |
and you start to think in that language.
link |
And then you actually, I do definitely believe
link |
that language limits or expands your thinking.
link |
There's some languages that actually lead you
link |
to certain thought processes.
link |
Yeah, like, so I speak Russian fluently
link |
and that's certainly a language that leads you
link |
down certain thought processes.
link |
Well, yeah, I mean, there's a history
link |
of the two world wars of millions of people starving
link |
to death or near to death throughout its history
link |
of suffering, of injustice, like this promise sold
link |
to the people and then the carpet
link |
or whatever is swept from under them.
link |
And it's like broken promises.
link |
And all of that pain and melancholy is in the language,
link |
the sad songs, the sad hopeful songs,
link |
the over romanticized, like, I love you, I hate you,
link |
the sort of the swings between all the various spectrums
link |
of emotion, so that's all within the language.
link |
The way it's twisted, there's a strong culture
link |
of rhyming poetry, so like the bards,
link |
like the sync, there's a musicality to the language too.
link |
Did Dostoevsky write in Russian?
link |
Yeah, so like Dostoevsky, Tolstoy, all the,
link |
The ones that I know about, which are translated
link |
and I'm curious how the translations.
link |
So Dostoevsky did not use the musicality
link |
of the language too much.
link |
So it actually translates pretty well
link |
because it's so philosophically dense
link |
that the story does a lot of the work,
link |
but there's a bunch of things that are untranslatable.
link |
Certainly the poetry is not translatable.
link |
I actually have a few conversations coming up offline
link |
and also in this podcast with people
link |
who've translated Dostoevsky.
link |
And that's for people who worked, who work in this field,
link |
know how difficult that is.
link |
Sometimes you can spend months thinking
link |
about a single sentence, right?
link |
In context, like, cause there's just the magic
link |
captured by that sentence and how do you translate
link |
just in the right way?
link |
Because those words can be really powerful.
link |
There's a famous line,
link |
beauty will save the world from Dostoevsky.
link |
You know, there's so many ways to translate that.
link |
And you're right, the language gives you the tools
link |
with which to tell the story,
link |
but it also leads your mind down certain trajectories
link |
and paths to where over time,
link |
as you think in that language,
link |
you become a different human being.
link |
Yeah, that's a fascinating reality, I think.
link |
I know people have explored that,
link |
but it's just rediscovered.
link |
Well, we don't, we live in our own like little pockets.
link |
Like this is the sad thing is I feel like unfortunately,
link |
given time and given getting older,
link |
I'll never know China, the Chinese world,
link |
because I don't truly know the language.
link |
Same with Japanese, I don't truly know Japanese
link |
and Portuguese and Brazil,
link |
that whole South American continent.
link |
Like, yeah, I'll go to Brazil and Argentina,
link |
but will I truly understand the people
link |
if I don't understand the language?
link |
It's sad because I wonder how much,
link |
how many geniuses we're missing
link |
because so much of the scientific world,
link |
so much of the technical world is in English,
link |
and so much of it might be lost
link |
because it's just we don't have the common language.
link |
I completely agree.
link |
I'm very much in that vein of there's a lot of genius
link |
out there that we miss,
link |
and it's sort of fortunate when it bubbles up
link |
into something that we can understand or process,
link |
there's a lot we miss.
link |
So I tend to lean towards really loving democratization
link |
or things that empower people
link |
and I'm very resistant to sort of authoritarian structures.
link |
Fundamentally for that reason,
link |
well, several reasons, but it just hurts us.
link |
So speaking of languages that empower you,
link |
so Python was the first language for me
link |
that I really enjoyed thinking in, as you said.
link |
Sounds like you shared my experience too.
link |
So when did you first,
link |
do you remember when you first kind of connected with Python,
link |
maybe even fell in love with Python?
link |
It's a good question.
link |
It took about a year.
link |
I first encountered Python in 1997.
link |
I was a graduate student studying biomedical engineering
link |
at the Mayo Clinic.
link |
And I had previously,
link |
I'd been involved in taking information from satellites.
link |
I was an electrical engineering student
link |
used to taking information
link |
and trying to get something out of it,
link |
doing some data processing, getting information out of it.
link |
And I'd done that in MATLAB.
link |
I'd done that in Perl.
link |
I'd done that in scripting on a VMS.
link |
There's actually a VAX VMS system,
link |
they had their own little scripting tools around Fortran.
link |
Done a lot of that.
link |
And then as a graduate student,
link |
I was looking for something and encountered Python.
link |
And because Python had an array,
link |
had two things that made me not filter it away.
link |
Because I was filtering a bunch of stuff,
link |
there was Yorick, I looked at Yorick,
link |
I looked at a few other languages that are out there
link |
at the time in 1997, but it had arrays.
link |
There's a library called Numeric
link |
that had just been written in 95,
link |
like not very, not too much earlier.
link |
By an MIT alum, Jim Hugunin.
link |
You know, and I went back and read the mailing list
link |
to see the history of how it grew.
link |
And there was a very interesting,
link |
it's fascinating to do that actually,
link |
to see how this emergent cooperation,
link |
unstructured cooperation happens in the open source world
link |
that led to a lot of this collective programming,
link |
which is something maybe we might get into a little later,
link |
but what that looks like.
link |
What gap did Numeric fill?
link |
Numeric filled the gap of having an array object.
link |
There was no array object.
link |
There was no array.
link |
There was a one dimensional byte concept,
link |
but there was no n dimensional,
link |
two, three, four dimensional tensor they call it now.
link |
I'm still in the category that a tensor is another thing
link |
and it's just an ndarray we should call it,
link |
but kind of lost that battle.
link |
There's many battles in this world,
link |
some of which we win, some we lose.
link |
That's exactly right.
link |
So, but it had no math to it.
link |
So Numeric had math and a basic way to think in arrays.
link |
So I was looking for that,
link |
and it had complex numbers,
link |
which a lot of programming languages lack.
link |
And you can see it because,
link |
if you're just a computer scientist,
link |
you think, ah, complex numbers are just two floats.
link |
So you can, people can build that on.
link |
But in practice, a complex number
link |
is one of the significant algebras
link |
that helps connect a lot of physical
link |
and mathematical ideas,
link |
particularly FFT for an electrical engineer.
link |
And it's a really important concept
link |
and not having it means you have to develop it
link |
several times and those times may not share an approach.
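The point about complex numbers and the FFT can be illustrated with a small sketch. It uses only Python's built-in complex type and the standard `cmath` module, and it implements the naive O(n^2) discrete Fourier transform rather than a true FFT. The function name `dft` and the sample signals are illustrative assumptions, not anything from the conversation.

```python
import cmath


def dft(samples):
    """Naive O(n^2) discrete Fourier transform (illustration only, not an FFT)."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


# A unit impulse has a flat spectrum: every frequency bin is 1.
impulse_spectrum = dft([1, 0, 0, 0])

# A constant signal puts all of its energy in bin 0.
constant_spectrum = dft([1, 1, 1, 1])
```

Because the complex type is shared by the whole language, the transform needs no custom pair-of-floats class that every library would otherwise have to reinvent.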
link |
One of the common things in programming,
link |
one of the things programming enables is abstractions.
link |
But when you have shared abstractions, it's even better.
link |
It sort of gets to the level of language
link |
of actually we all think of this the same way,
link |
which is both powerful and dangerous, right?
link |
Because powerful in that we now can quickly
link |
make bigger and higher level things
link |
on top of those abstractions, and dangerous
link |
because it also limits us as to the things
link |
we maybe left behind in producing that abstraction,
link |
which is at the heart of programming today
link |
and actually building around the programming world.
link |
I think it's a fascinating philosophical topic.
link |
Yeah, they will continue for many years, I think.
link |
They'll continue for many years.
link |
As we build more and more and more abstractions.
link |
Yes, I often think about, you know,
link |
we have a world that's built on these abstractions
link |
and were they the only ones possible?
link |
Certainly not, but they led to,
link |
you know, it's very hard to do it differently.
link |
Like there's an inertia that's very hard to,
link |
you know, push out, push away from.
link |
That has implications for things like,
link |
you know, the Julia language,
link |
which you have heard of, I'm sure.
link |
And I've met the creators and I liked Julia.
link |
It's a really cool language,
link |
but they struggle kind of against the,
link |
just the tide of like this inertia of people using Python.
link |
And, you know, there's strategies to approach that,
link |
but nonetheless, it's a phenomenon.
link |
And sometimes, so I love complex numbers
link |
and I loved arrays, so I looked at Python.
link |
And then I had the experience, I did some stuff in Python
link |
and I was just doing my PhD.
link |
So I was out, my focus was on,
link |
I was actually doing a combination of MRI and ultrasound
link |
and looking at a phenomenon called elastography,
link |
which is you push waves into the body
link |
and observe those waves, like you can actually measure them.
link |
And then you do mathematical inversion
link |
to see what the elasticity is.
link |
And so that's the problem I was solving
link |
is how to do that with both ultrasound and MRI.
link |
I needed some tool to do that with.
link |
So I was starting to use Python in 97.
link |
In 98, I went back, looked at what I'd written
link |
and realized I could still understand it,
link |
which is not the experience I'd had
link |
when doing Perl in 95, right?
link |
I'd done the same thing and then I looked back
link |
and I'd forgotten what I was even saying.
link |
Now, you know, I'm not saying, so that made me think,
link |
hey, this may work, I like this.
link |
This is something I can retain
link |
without becoming an expert per se.
link |
And so that led me to go, I'm gonna push more into this.
link |
And then that 98 was kind of when I started
link |
to fall in love with Python, I would say.
link |
A few peculiar things about Python.
link |
So maybe compare it to Perl,
link |
compare it to some of the other languages.
link |
So there's no braces.
link |
So space is used, indentation, I should say,
link |
is used as part of the language.
link |
So did you, I mean, that's quite a leap.
link |
Were you comfortable with that leap
link |
or were you just very open minded?
link |
It's a good question.
link |
I was open minded, so I was cognizant of the concern.
link |
And it definitely has, it has specific challenges.
link |
You know, cutting and pasting.
link |
For example, when you're cutting and pasting code,
link |
and if your editors aren't supportive of that,
link |
if you're putting it into a terminal,
link |
and particularly in the past when terminals
link |
didn't necessarily have the intelligence to manage it.
link |
Now, IPython and Jupyter Notebooks
link |
handle that just fine, so there's really no problem.
link |
But in the past, it created some challenges,
link |
formatting challenges, also mixed tabs and spaces.
link |
If editors weren't, you weren't clear
link |
on what was happening, you would have these issues.
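The mixed tabs and spaces problem can be reproduced in a few lines. This is a minimal sketch assuming Python 3, which rejects indentation whose meaning depends on the tab width. The source string and the `outcome` variable are made up for illustration.

```python
# Two lines that look aligned in many editors: one indented with a tab,
# the other with eight spaces.
source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(source, "<example>", "exec")
    outcome = "compiled"
except TabError as exc:
    # Python 3 refuses to guess how wide a tab is supposed to be.
    outcome = f"TabError: {exc.msg}"
```

An editor that shows whitespace characters, or that always converts tabs to spaces, avoids the issue entirely.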
link |
So there were really concrete reasons about it
link |
that I heard and understood.
link |
I never really encountered a problem with it personally.
link |
Like, it was occasional annoyances,
link |
but I really liked the fact
link |
that it didn't have all these extra characters, right?
link |
That these extra characters didn't show up
link |
in my visual field when I was just trying
link |
to process understanding a snippet of code.
link |
Yeah, there's a cleanness to it.
link |
But, I mean, the idea is supposed to be
link |
that Perl also has a cleanness to it
link |
because of the minimalism of how many characters
link |
it takes to express a certain thing.
link |
So it's very compact.
link |
But what you realize with that compactness comes,
link |
there's a culture that prizes compactness,
link |
and so the code gets more and more compact
link |
and less and less readable to a point where it's like,
link |
like, to be a good programmer in Perl,
link |
you write code that's basically unreadable.
link |
There's a culture, like.
link |
Correct, and you're proud of it.
link |
Yeah, you're proud of it.
link |
Right, exactly, and it's like, feels good.
link |
And it's really selective.
link |
It means you have to be an expert in Perl to understand it.
link |
Whereas Python allowed you not to have to be an expert.
link |
You didn't have to take all this brain energy.
link |
You could leverage, what I say,
link |
you could leverage your English language center,
link |
which you're using all the time.
link |
I've wondered about other languages,
link |
particularly non Latin based languages.
link |
With Latin based languages, the characters are at least similar.
link |
I think people have an easier time,
link |
but I don't know what it's like to be a Japanese
link |
or a Chinese person trying to learn different syntax.
link |
Like, what would computer programming look like in that?
link |
I haven't looked at that at all,
link |
but it certainly doesn't,
link |
you know, leveraging your Chinese language center,
link |
I'm not sure Python or any programming does that.
link |
But that was a big deal.
link |
The fact that it was accessible, I could be a scientist.
link |
What I really liked is many programming languages
link |
really demand a lot of you, and you can get a lot,
link |
you know, you can do a lot if you learn it.
link |
But Python enables you to do a lot
link |
without demanding a lot of you.
link |
There's nuance to that statement,
link |
but it certainly was, it's more accessible.
link |
So more people could actually, as a scientist,
link |
as somebody who, or an engineer,
link |
who was trying to solve another problem
link |
besides programming,
link |
I could still use this language and get things done
link |
and be happy about it.
link |
And I was also comfortable in C at that time.
link |
And MATLAB, you did a little bit of that.
link |
And MATLAB, I did a lot before that, exactly.
link |
So I was comfortable in,
link |
those three languages were really the tools I used
link |
during my studies and schooling.
link |
But to your point about language helping you think,
link |
one of the big things about MATLAB was it was,
link |
and APL before it, I don't know if you remember APL.
link |
APL is actually the predecessor of array based programming,
link |
which I think is really an underappreciated,
link |
if I talk to people who are just steeped
link |
in computer programming, computer science,
link |
like most of the people that Microsoft has hired
link |
in the past, for example,
link |
Microsoft as a company generally did not understand
link |
array based programming.
link |
Like culturally, they didn't understand it.
link |
So they kept missing the boat,
link |
kept missing the understanding of what this was.
link |
They've gotten better,
link |
but there's still a whole culture of folks
link |
that think programming is systems programming
link |
or web programming or lists and maps.
link |
And what about an n dimensional array?
link |
Oh yeah, that's just an implementation detail.
link |
Well, you can think that,
link |
but then actually if you have that as a construct,
link |
you actually think differently.
link |
APL was the first language to understand that.
link |
And it was in the sixties, right?
link |
The challenge of APL is APL had very dense,
link |
not only glyphs, like new characters, new glyphs,
link |
but they even had a new keyboard
link |
because to produce those glyphs,
link |
this is back in the early days in computing
link |
when the QWERTY keyboard maybe wasn't as established,
link |
like, well, we can have a new keyboard, no big deal.
link |
But it was a big deal and it didn't catch on.
link |
And the language APL, very much like Perl,
link |
people would pride themselves on how much,
link |
could they write the game of life
link |
in 30 characters of APL.
link |
APL has characters that mean summation
link |
and they have adverbs,
link |
they would have adjectives and these things called adverbs,
link |
which are like methods, like reduction,
link |
reduction would be an adverb on an add operator, right?
link |
So, but doing, using these tools you could construct
link |
and then you start to think at that level,
link |
you think in n dimensions is something I like to say,
link |
and you start to think differently about data at that point.
link |
Now you're, it really helps.
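The idea of reduction as an adverb that modifies an add operator has a rough analogue in Python's standard library. This is a loose sketch of the pattern, not APL semantics. The variable names are illustrative.

```python
from functools import reduce
import operator

values = [1, 2, 3, 4]

# "Reduction applied to add": fold the binary add operator across the data.
total = reduce(operator.add, values)  # same result as sum(values)

# Swapping the operator changes the result without changing the pattern.
product = reduce(operator.mul, values)
largest = reduce(max, values)
```

The reduction is written once and the operator is the part that varies, which is the separation the adverb idea expresses.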
link |
Yeah, I mean, outside of programming,
link |
if you really internalize linear algebra as a course,
link |
I mean, it's philosophically allows you
link |
to think of the world differently.
link |
It's almost like liberating, you don't have to,
link |
you don't have to think about the individual numbers
link |
in the n dimensional array.
link |
You could think of it as an object in itself
link |
and all of a sudden this world can open up.
link |
You're saying MATLAB and APL were like the early C,
link |
I don't know if many languages got that right ever.
link |
No, no, no they didn't.
link |
Even still, I would say.
link |
I mean, NumPy is an inheritor of the traditions
link |
that I would say came from APL. J was another version that was,
link |
what it did is not have the glyphs,
link |
just have short characters,
link |
but still a Latin keyboard could type them.
link |
And then Numeric inherited from that
link |
in terms of let's add arrays plus broadcasting
link |
plus methods, reduction,
link |
even some of the language like rank is a concept
link |
that was in Python and is still in Python
link |
for the number of dimensions, right?
link |
That's different than say the rank of a matrix
link |
which people think of as well.
link |
So it came from that tradition,
link |
but NumPy is a very pragmatic, practical tool.
link |
NumPy inherited from Numeric
link |
and we can get to where NumPy came from
link |
which is the current array,
link |
at least current as of 2015, 2017.
link |
Now there's a ton of them over the past two or three years.
link |
We can get into that too.
link |
So if we just linger on the early days
link |
of what was your favorite feature of Python?
link |
Do you remember like what?
link |
So it's so interesting to linger on like the,
link |
what really makes you connect with a language?
link |
I'm not sure it's obvious to introspect that.
link |
And I've thought about that at some length.
link |
I think definitely the fact that I could read it later,
link |
that I could use it productively
link |
without becoming an expert.
link |
Other languages I had to put more effort into.
link |
That's like an empirical observation.
link |
Like you're not analyzing any one aspect of the language.
link |
It just seems time after time when you look back,
link |
it's somehow readable.
link |
It's somehow readable.
link |
Then it was sort of, I could take executable English
link |
and translate it to Python more easily.
link |
Like I didn't have to go, there was no translation layer.
link |
As an engineer or as a scientist,
link |
I could think about what I wanted to do.
link |
And then the syntax wasn't that far behind it, right?
link |
Now there are some warts there still.
link |
It wasn't perfect.
link |
Like there's some areas where I'm like,
link |
ah, it'd be better if this were different
link |
or if this were different.
link |
Some of those things got added to the language too.
link |
I was really grateful for some of the early pioneers
link |
in the Python ecosystem back,
link |
because Python got written in 91.
link |
That's when the first version came out.
link |
But Guido was very open to users.
link |
And one of the sets of users were people like Jim Hugunin
link |
and David Ascher and Paul Dubois and Konrad Hinsen.
link |
These were people that were on the mailing list.
link |
And they were just asking for things like,
link |
hey, we really should have complex numbers in this language.
link |
So let's, you know, there's a j, there's a 1j, right?
link |
And the fact that they went the engineering route of j
link |
I don't think that's entirely favoring engineers.
link |
I think it's because i is so often used
link |
as the index of a for loop.
link |
So I think that's actually why.
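For readers who have not used them, here is a minimal sketch of Python's built-in complex numbers and the `1j` literal. It relies only on the language and the standard `cmath` module. The variable names are illustrative.

```python
import cmath

z = 3 + 4j                 # complex literal; the imaginary unit is written j

magnitude = abs(z)         # distance from the origin
conjugate = z.conjugate()  # flips the sign of the imaginary part
squared_unit = 1j * 1j     # the defining property: j squared is -1

# Euler's identity holds up to floating-point rounding.
euler = cmath.exp(1j * cmath.pi) + 1
```

Since `1j` is a literal rather than a name, a variable called `i` remains free for use as a loop index.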
link |
Probably, I mean, there's a pragmatic aspect.
link |
But the fact that complex numbers were there, I love that.
link |
The fact that I could write in the array constructs
link |
and that reduction was there,
link |
very simple to write summations and broadcasting was there.
link |
I could do addition of whole arrays.
link |
Those are some things I loved about it.
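Whole-array addition, broadcasting, and reduction look roughly like this in present-day NumPy. This sketch assumes NumPy is installed and shows the modern API, not the original Numeric library discussed here. The array values are made up for illustration.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
b = np.array([10, 20, 30])

whole_array_sum = a + a         # element-wise addition, no explicit loop
broadcast_sum = a + b           # b is stretched across both rows of a

column_totals = np.add.reduce(a, axis=0)  # reduction along the first axis
grand_total = a.sum()                     # reduction over every element

dimensions = a.ndim             # number of dimensions, distinct from matrix rank
```

The same expressions keep working as dimensions are added, which is much of the appeal of thinking in whole arrays.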
link |
I don't know what to start talking to you about
link |
because you've created so many incredible projects
link |
that basically changed the whole landscape of programming.
link |
But okay, let's start with,
link |
let's go chronologically with SciPy.
link |
You created SciPy over two decades ago now?
link |
Yes, yes, I love to talk about SciPy.
link |
SciPy was really my baby.
link |
What was its goal?
link |
So SciPy was effectively, here I am using Python
link |
to do stuff that I previously used MATLAB to do.
link |
And I was using Numeric, which is an array library
link |
that made a lot of it possible.
link |
But there's things that were missing.
link |
Like I didn't have an ordinary differential equation solver
link |
I could just call, right?
link |
I didn't have integration.
link |
Hey, I wanted to integrate this function.
link |
Okay, well, I don't have just a function
link |
I can call to do that.
link |
These are things I remember being critical things
link |
that I was missing.
link |
I just wanna pass a function to an optimizer
link |
and have it tell me what the optimal value is.
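The three missing pieces named here (integrating a function, solving an ordinary differential equation, and handing a function to an optimizer) are each a single call in present-day SciPy. This sketch assumes SciPy is installed and uses today's API, which differs from the late-1990s code being described. The test functions are illustrative.

```python
import math

from scipy.integrate import quad, solve_ivp
from scipy.optimize import minimize_scalar

# Integrate a function: the integral of sin(x) from 0 to pi is 2.
area, error_estimate = quad(math.sin, 0.0, math.pi)

# Solve an ODE: dy/dt = -y with y(0) = 1, whose exact solution is exp(-t).
solution = solve_ivp(lambda t, y: -y, (0.0, 1.0), [1.0], rtol=1e-8, atol=1e-10)
y_at_one = solution.y[0, -1]

# Hand a function to an optimizer: the minimum of (x - 3)^2 is at x = 3.
result = minimize_scalar(lambda x: (x - 3.0) ** 2)
best_x = result.x
```

In each case the caller supplies an ordinary Python function and the library supplies the numerical method.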
link |
Those are things I'm like, well,
link |
why don't we just write a library that adds these tools?
link |
And I started to post on the mailing list
link |
and there'd previously been, people have discussed,
link |
I remember Konrad Hinsen saying,
link |
wouldn't it be great if we had this optimizer library
link |
or David Ascher would say this stuff.
link |
And I'm an ambitious, ambitious is the wrong word,
link |
an eager person with probably more time than sense.
link |
I was a poor graduate student.
link |
My wife thinks I'm working on my PhD and I am,
link |
but part of the PhD that I loved
link |
was the fact that it's exploratory.
link |
You're not just taking orders,
link |
fulfilling a list of things to do,
link |
you're trying to figure out what to do.
link |
And so I thought, well, I'm writing tools
for my own use in a PhD,
link |
so I'll just start this project.
link |
And so in 99, 98 was when I first started
link |
to write libraries for Python.
link |
Definitely when I fell in love with Python in 98,
link |
I thought, oh, well, there's just a few things missing.
link |
Like, oh, I need a reader to read DICOM files.
link |
I was in medical imaging and DICOM was a format
link |
that I wanted to be able to load into Python.
link |
Okay, how do I write a reader for that?
link |
So I wrote something called, it was an IO package, right?
link |
And that was my very first extension module, which is C.
link |
So I wrote C code to extend Python
link |
so that in Python I could write things more easily.
link |
That combination kind of hooked me.
link |
It was the idea that I could,
link |
here's this powerful tool I can use as a scripting language
link |
and a high level language to think about,
link |
but that I can extend easily, easily in C,
link |
easily for me because I knew enough C.
link |
And then Guido had written a link.
link |
I mean, the only, the hard part of extending Python
link |
was the way memory management works,
link |
and you have to do reference counting.
link |
And so there's a tracking of reference counting
link |
you have to do manually.
link |
And if you don't, you have memory leaks.
link |
And so that's hard.
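Reference counts are bookkept by hand in a C extension, but they can be observed from Python itself. This is a small sketch assuming the CPython interpreter. It only compares counts before and after, since the absolute numbers vary between versions. The variable names are illustrative.

```python
import sys

payload = object()
baseline = sys.getrefcount(payload)

holder = [payload]                      # the list now owns one more reference
with_holder = sys.getrefcount(payload)

holder.clear()                          # dropping it gives the reference back
after_clear = sys.getrefcount(payload)
```

In C, each of those steps is an explicit increment or decrement, and forgetting a decrement means the object is never freed.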
link |
Plus then C, you know, it's just much more,
link |
you have to put more effort into it.
link |
It's not just, I have to now think about pointers
link |
and I have to think about stuff that is different.
link |
I have to kind of,
link |
you're like putting a new cartridge in your brain.
link |
Like, okay, I'm thinking about MRI.
link |
Now I'm thinking about programming.
link |
And there are distinct modules
link |
you end up having to think about.
link |
And when I was just in Python,
link |
I could just think about MRI and high level writing,
link |
but I could do that.
link |
And that kind of, I liked it.
link |
I found that to be enjoyable and fun.
link |
And so I ended up, oh,
link |
well, let me just add a bunch of stuff to Python
link |
to do integration.
link |
Well, and the cool thing is,
link |
is that the power of the internet,
link |
just looking around and I found,
link |
oh, there's this Netlib,
link |
which has hundreds of Fortran routines
link |
that people have written in the 60s and the 70s and the 80s
link |
in Fortran 77, fortunately, it wasn't Fortran 66.
link |
So it had been ported to Fortran 77.
link |
And Fortran 77 is actually a really great language.
link |
Fortran 90 probably is my favorite Fortran
link |
because it's also, it's got complex numbers,
link |
got arrays and it's pretty high level.
link |
Now, the problem with it
link |
is you'd never want to write a program in Fortran 90
link |
but it's totally fine to write a subroutine in, right?
link |
And so, and then Fortran kind of got a little off course
link |
when they tried to compete with C++.
link |
I just want libraries to do something like,
link |
oh, here's an ordinary differential equation.
link |
Here's integration.
link |
Here's Runge-Kutta integration.
link |
I don't have to think about that algorithm.
link |
I mean, you could,
link |
but it's nice to have somebody who's already done one
link |
And so I sort of started this journey in 98, really.
link |
If you look back at the mailing list,
link |
there's sort of this productive era of me
link |
writing an extension module
link |
to connect Runge-Kutta integration to Python
link |
and making an ordinary differential equation solver.
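For readers who have not met Runge-Kutta integration, a minimal sketch of the classical fourth-order method in plain Python follows. It is a textbook scheme shown for illustration, not the Fortran routines that were wrapped, and `rk4_step` and `solve_ode` are hypothetical names.

```python
import math


def rk4_step(f, t, y, h):
    """Advance y' = f(t, y) by one step of size h with classical RK4."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6


def solve_ode(f, y0, t0, t1, steps):
    """Integrate from t0 to t1 in a fixed number of equal steps."""
    h = (t1 - t0) / steps
    y = y0
    for i in range(steps):
        y = rk4_step(f, t0 + i * h, y, h)
    return y


if __name__ == "__main__":
    # Exponential decay y' = -y with y(0) = 1 has the exact solution exp(-t).
    approx = solve_ode(lambda t, y: -y, 1.0, 0.0, 1.0, 100)
    print(approx, math.exp(-1.0))
```

Having a library routine like this means the caller only supplies the right-hand side `f` and never has to think about the stepping algorithm.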
link |
And then releasing that as a package.
link |
So we could call ODEPACK, I think I called it then.
link |
And then I just made these packages.
link |
Eventually that became multipack
link |
because they're originally modular.
link |
You can install them separately.
link |
But a massive problem in Python
link |
was actually just getting your stuff installed.
link |
At the time, releasing software for me,
link |
like today it's people think, what does that mean?
link |
Well, then it meant some poorly written webpage.
link |
I had some bad webpage up and I put a tarball,
link |
just a GZIP tarball of source code.
link |
That was the release.
link |
But okay, can we just stay on that?
link |
Because the community aspect
link |
of creating the package and sharing that, that's rare.
link |
That, to have, to both have the, at that time,
link |
Yeah, it was pretty early, yeah.
link |
Oh, well, not rare.
link |
Maybe you can correct me on this,
link |
but it seems like in the scientific community,
link |
so many people, you were basically solving the problems
link |
you needed to solve to process the particular application,
link |
the data that you need.
link |
And to also have the mind
link |
that I'm going to make this usable for others, that's.
link |
I would say I was inspired.
link |
I'd been inspired by Linux,
link |
been inspired by Linus and him making his code available.
link |
And I was starting to use Linux at the time.
link |
And I went, this is cool.
link |
So I'd kind of been previously primed that way.
link |
And generally I was into science
link |
because I liked the sharing notion.
link |
I liked the idea of, hey, let's,
link |
if collectively we build knowledge and share it,
link |
we can all be better off.
link |
Okay, so you were energized by that idea.
link |
So I was energized by that idea already, right?
link |
And I can't deny that I was.
link |
I sort of had this very,
link |
I liked that part of science, that part of sharing.
link |
And then all of a sudden, oh, wait, here's something.
link |
And here's something I could do.
link |
And then I slowly over years learned how to share better
link |
so that you could actually engage more people faster.
link |
One of the key things was actually giving people a binary
link |
they could install, right?
link |
So that it wasn't just your source code, good luck.
link |
Compile this and then.
link |
It's compiled, ready to install, just, you know.
link |
So in fact, a lot of the journey from 98,
link |
even through 2012 when I started Anaconda was about that.
link |
Like it's why, you know, it's really the key
link |
as to why a scientist with dreams of doing MRI research
link |
ended up starting a software company
link |
that installs software.
link |
I work with a few folks now that don't program
link |
like on the creative side and the video side,
link |
And because my whole life is running on scripts,
link |
I have to try to get them,
link |
I'm having all the task of teaching them
link |
how to do Python enough to run the scripts.
link |
And so I've been actually facing this,
link |
whether it's Anaconda or some with the task of
link |
how do I minimally explain basically to my mom
link |
how to write a Python script.
link |
And it's an interesting challenge.
link |
I have to, it's a to do item for me to figure out like,
link |
what is the minimal amount of information I have to teach?
link |
What are the tools you use that one, you enjoy it,
link |
two, you're effective at it.
link |
And they're related, those are two related questions.
link |
And then the debugging, like the iterative process
link |
of running the script to figure out what the error is,
link |
maybe even for some people to do the fix yourself.
link |
So do you compile it?
link |
Do you, like how do you distribute that code to them?
link |
And it's interesting because I think
link |
it's exactly what you're talking about.
link |
If you increase the circle of empathy,
link |
the circle of people that are able to use your programs,
link |
you increase its, like, effectiveness and its power.
link |
And so you have to think, can I write scripts?
link |
Can I write programs that can be used by medical engineers,
link |
by all kinds of people that don't know programming
link |
and actually maybe plant a seed,
link |
have them catch the bug of programming
link |
so that they start on a journey.
link |
That's a huge responsibility.
link |
And ultimately it has to do with the Amazon one click buy.
link |
Like how frictionless can you make the early steps?
link |
Frictionless is actually really key.
link |
To grow any community is, any friction point,
link |
you're just gonna lose some people, right?
link |
Now sometimes you may wanna intentionally do that.
link |
If you're early enough on, you need a lot of help.
link |
You need people who have the skills.
link |
You might actually, it's helpful.
link |
You don't necessarily have too many users
link |
as opposed to contributors if you're early on.
link |
Anyway, there's, SciPy started in 98,
link |
but it really emerged as this collection of modules
link |
that I was just putting on the net.
link |
People were downloading and I think I got 100 users, right?
link |
By the end of that year.
link |
But the fact that I got 100 users and more than that,
link |
people started to email me with fixes.
link |
And that was actually intoxicating, right?
link |
That was the, here I'm writing papers
link |
and I'm giving conferences and I get people to say hello,
link |
but yeah, good job.
link |
But mostly it was, you're viewed with,
link |
it's competitive, right?
link |
You publish a paper and people are like,
link |
oh, it wasn't my paper.
link |
I was starting to see that sense of academic life
link |
where it was so much,
link |
I thought there was this cooperative effort,
link |
but it sounds like we're here just to one up each other.
link |
And it's not true across the board,
link |
but a lot of that's there.
link |
But here in this world,
link |
I was getting responses from people all over the world.
link |
I remember Pearu Peterson in Estonia, right?
link |
Was one of the first people.
link |
And he sent me back this makefile,
link |
cause the first thing it is, yeah, your build thing stinks
link |
and here's a better makefile.
link |
Now it was a complex makefile.
link |
I don't think I ever understood that makefile actually,
link |
but it worked and it did a lot more.
link |
And so I said, thanks, this is cool.
link |
And that was my first kind of engagement
link |
with community development.
link |
But the process was, he sent me a patch file.
link |
I had to upload a new tarball.
link |
And I just found, I really love that.
link |
And the style back then was here's a mailing list.
link |
It's very, it wasn't as,
link |
there certainly weren't the tools that are available today.
link |
It was very early on, but I really started to,
link |
that's the whole year.
link |
I think I did about seven packages that year, right?
link |
And then by the end of the year,
link |
I collected them into a thing called multipack.
link |
So in 99, there was this thing called multipack.
link |
And that's when a high school student,
link |
no, he was a high school student at the time,
link |
guy named Robert Kern,
link |
took that package and made a Windows installer, right?
link |
And then of course, a massive increase of usage.
link |
So by the way, most of this development was under Linux.
link |
Yes, yes, it was on Linux.
link |
I was a Linux developer doing it on a Unix box.
link |
I mean, at the time I was actually getting into,
link |
I had a new hard drive,
link |
did some kernel programming to make the hard drive work.
link |
I mean, not programming, but modification to the kernel
link |
so I could actually get a hard drive working.
link |
I love that aspect of it.
link |
I was also in, at school, I was building a cluster.
link |
I took Mac computers and you put Yellow Dog Linux on them.
link |
At the Mayo Clinic, they were just,
link |
they had all these Macs that were older,
link |
they were just getting rid of.
link |
And so I kind of got permission to go grab them together.
link |
I put about 24 of them together in a cluster, in a cabinet,
link |
and put Yellow Dog Linux on them all.
link |
And I wrote a C++ program to do MRI simulation.
link |
That was what I was doing at the same time
link |
for my day job, so to speak.
link |
So I was loving the whole process.
link |
And the same time I was,
link |
oh, I need an ordinary differential equation.
link |
That's why ordinary differential equations were key
link |
was because that's the heart of the Bloch equation
link |
for simulating MRI, is an ODE solver.
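To make the link between MRI simulation and ODE solving concrete, here is a hedged sketch of the Bloch equations, dM/dt = gamma (M x B) minus T1/T2 relaxation terms, integrated with a fixed-step RK4 loop in plain Python. The units, parameter values, and the names `bloch_rhs` and `simulate` are illustrative assumptions, not the C++ simulator described in the conversation.

```python
import math


def bloch_rhs(m, field, gamma, t1, t2, m0):
    """Right-hand side of the Bloch equations for magnetization m in a field."""
    mx, my, mz = m
    bx, by, bz = field
    return (
        gamma * (my * bz - mz * by) - mx / t2,         # transverse decay (T2)
        gamma * (mz * bx - mx * bz) - my / t2,
        gamma * (mx * by - my * bx) - (mz - m0) / t1,  # longitudinal recovery (T1)
    )


def simulate(m, field, gamma, t1, t2, m0, duration, steps):
    """Integrate the magnetization vector with classical fourth-order Runge-Kutta."""
    h = duration / steps

    def f(v):
        return bloch_rhs(v, field, gamma, t1, t2, m0)

    def shifted(v, k, scale):
        return tuple(vi + scale * ki for vi, ki in zip(v, k))

    for _ in range(steps):
        k1 = f(m)
        k2 = f(shifted(m, k1, h / 2))
        k3 = f(shifted(m, k2, h / 2))
        k4 = f(shifted(m, k3, h))
        m = tuple(
            mi + h * (q1 + 2 * q2 + 2 * q3 + q4) / 6
            for mi, q1, q2, q3, q4 in zip(m, k1, k2, k3, k4)
        )
    return m


if __name__ == "__main__":
    # Magnetization tipped into the x axis, static field along z, arbitrary units.
    print(simulate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1.0, 1.0, 0.5, 1.0, 30.0, 3000))
```

Without relaxation the vector simply precesses around the field; with finite T1 and T2 it spirals back toward equilibrium along z.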
link |
And so that's, but I actually did that,
link |
it just happened at the same time.
link |
That's why it was kind of what you're working on
link |
and what you're interested in, they're coinciding.
link |
I was definitely scratching my own itch
link |
in terms of building stuff.
link |
And which helped in the sense that I was using it for me,
link |
so at least I had one user.
link |
I had one person who was like, well, no, this is better.
link |
I like this interface better.
link |
And I had the experience of MATLAB
link |
to guide some of what those APIs might look like.
link |
But you're just doing yourself,
link |
you're building all this stuff.
link |
But with the Windows installer,
link |
it was the first time I realized, oh yeah,
link |
the binary installer really helps people.
link |
And so that led to spending more time
link |
on that side of things.
link |
So around 2000, so I graduated my PhD in 2000,
link |
end of year, end of 2000.
link |
So 99 doing a lot of work there,
link |
98 doing a lot of work there,
link |
99 kind of spending more time on my PhD,
link |
helping people use the tools,
link |
thinking about what do I want to go from here.
link |
There was a company, there was a guy actually,
link |
Eric Jones and Travis Vaught.
link |
They were two friends who founded a company called Enthought.
link |
It's here in Austin, still here.
link |
And they, Eric contacted me at the time
link |
when I was a graduate student still.
link |
And he said, hey, why don't you come down?
link |
We want to build a company.
link |
We're thinking of a scientific company
link |
and we want to take what you're doing
link |
and kind of add it to some stuff that he'd done.
link |
He'd written some tools.
link |
And then Pearu Peterson had done f2py.
link |
Let's come together and build,
link |
pull this all together and call it SciPy.
link |
So that's the origin of the SciPy brand.
link |
It came from multipack
link |
and a whole bunch of modules I'd written,
link |
plus a few things from some other folks
link |
and then pulled together in a single installer.
link |
SciPy was really a distribution of Python
link |
masquerading as a library.
link |
How did you think about SciPy in context of Python,
link |
in context of Numeric, like what?
link |
So we saw SciPy as a way to make an R&D environment
link |
for Python, like use Python, depended on Numeric.
link |
So Numeric was the array library we depended on.
link |
And then from there, extend it with a bunch of modules
link |
that allowed for, and at the time,
link |
the original vision of SciPy was to have plotting,
link |
was to have the REPL environment
link |
and kind of really a whole data environment
link |
that you could then install and get going with.
link |
And that was kind of the thinking.
link |
It didn't really evolve that way, right?
link |
It sort of had a, for one,
link |
it's really hard to do massive scale projects
link |
with open source collectives.
link |
Actually, there's sort of an intrinsic cooperation limit
link |
as to which, too many cooks in the kitchen,
link |
you can do amazing infrastructure work.
link |
When it comes down to bringing it all together
link |
into a single deliverable,
link |
that actually requires a little more product management
link |
that is not, that doesn't really emerge
link |
from the same dynamic.
link |
So it struggled, struggled to get almost too many voices.
link |
It's hard to have everybody agree.
link |
Consensus doesn't really work at that scale.
link |
You end up with politics,
link |
with the same kind of things that's happened
link |
in large organizations trying to decide
link |
what to do together.
link |
So consensus building was challenging at scale
link |
as more people came in, right?
link |
Early on, it's fine, because there's nobody there.
link |
So it works, but then as you get more successful
link |
and more people use it, all of a sudden,
link |
oh, there's this scale at which this doesn't work anymore
link |
and we have to come up with different approaches.
link |
So SciPy came out officially in 2001,
link |
was the first release, most of the time.
link |
I remember the days of getting that release ready.
link |
It was a Windows installer and there were bugs
link |
on how the Windows compiler handled complex numbers
link |
and you were chasing segmentation faults.
link |
And it was, it's a lot of work.
link |
There was a lot of effort had nothing to do
link |
with my area of study.
link |
And at the same time, I had just gotten an offer.
link |
So he wondered if I wanted to come down
link |
and help him start that company with his friend.
link |
And at the time I was like, I was intrigued,
link |
but I was pursuing a path, an academic path.
link |
And I had just got an offer to go and teach at my alma mater.
link |
So I took that tenure track position.
link |
And SciPy, and kind of, then I started to work on SciPy
link |
as a professor too.
link |
So that's, I left, I left the Mayo Clinic,
link |
graduated, wrote my thesis using SciPy,
link |
wrote, you know, there's images that were created.
link |
Now the plotting tool I used was something
link |
from Yorick actually.
link |
It was a plotting, a PLT kind of a plotting language
link |
Yorick is a programming language?
link |
It was a programming language, had a plotting tool,
link |
DISLIN, it had integration to DISLIN.
link |
I ended up using DISLIN plus some of the plotting
link |
from Yorick linked to from Python.
link |
Anyway, it was, people don't plot that way now,
link |
but this is before, and SciPy was trying to add plotting.
link |
It didn't have much success.
link |
Really the success of plotting came from John Hunter,
link |
who had a similar experience to my experience,
link |
my kind of maverick experience as a person
link |
just trying to get stuff done and kind of having more time
link |
than money maybe, right?
link |
And John Hunter created what?
link |
He's the creator of Matplotlib.
link |
Yeah, so John Hunter was, you know,
link |
he wasn't a student at the time, but he was an,
link |
he was working in the quant field and he said,
link |
we need better plotting.
link |
So he just went out and said, cool, I'll make a new project
link |
and we'll call it Matplotlib.
link |
And he released in 2001,
link |
about the same time that SciPy came out
link |
and it was a separate library, separate install,
link |
used Numeric, SciPy used Numeric.
link |
And so SciPy, you know, in 2001, we released SciPy
link |
and then Enthought created a conference called SciPy,
link |
which brought people together to talk about the space.
link |
And that conference is still ongoing.
link |
It's one of the favorite conferences of a lot of people
link |
because it's, you know, it's changed over the years,
link |
but early on it was, you know, a collection of 50 people
link |
who care about, scientists mostly, you know,
link |
practicing scientists who want, who care about coding
link |
and doing it well and not using MATLAB.
link |
And I remember being driven by, you know, I liked MATLAB,
link |
but I didn't like the fact that,
link |
so I'm not opposed to proprietary software.
link |
I'm actually not an open source zealot.
link |
I love open source for the, what it brings,
link |
but I also see the role for proprietary software.
link |
But what I didn't like was the fact that I would develop
link |
code and publish it and then effectively telling somebody
link |
here to run my code, you have to have
link |
this proprietary software.
link |
Right, and there's also culture around MATLAB as much,
link |
because I've talked to a few folks in,
link |
MathWorks creates MATLAB?
link |
I mean, there's just a culture, they try really hard,
link |
but it just, there's this corporate IBM style culture
link |
that's like, or whatever.
link |
I don't want to say negative things about IBM or whatever,
link |
No, it's really that connection.
link |
It's something I'm in the middle of right now
link |
is the business of open source.
link |
And how do you connect the ethos of cooperative development
link |
with the necessity of creating profits, right?
link |
And like right now today, I'm still in the middle of that.
link |
That's actually the early days of me exploring this question.
link |
Cause I was writing SciPy, I mean, as an aside,
link |
I also had, so I had three kids at the time.
link |
I have six kids now.
link |
I got married early, wanted a family.
link |
I had three kids and I remember reading,
link |
I read Richard Stallman's post and I was a fan of Stallman.
link |
I would read his work, I liked this collective ideas
link |
Certainly the ideas on IP law, I read a lot of his stuff.
link |
But then he said, okay, well,
link |
how do I make money with this?
link |
How do I make a living?
link |
How do I pay for my kids?
link |
All this stuff was in my mind,
link |
young graduate student making no money,
link |
thinking I got to get a job.
link |
And he said, well, I think just be like me
link |
and don't have kids, right?
link |
That's just, don't, don't.
link |
That's his take on it.
link |
That was what he said in that moment, right?
link |
That's the thing I read and I went,
link |
okay, this is a train I can't get on.
link |
There has to be a way to preserve the culture
link |
of open source and still be able to make sufficient money
link |
to feed your kids.
link |
Yes, exactly, there's gotta be.
link |
Well, so that actually led me to a study of economics.
link |
Because at the time I was ignorant and I really was.
link |
I'm actually, I'm embarrassed for educational system
link |
that they could let me and I was valedictorian
link |
in my high school class and I did super well in college.
link |
And like academically I did great, right?
link |
But the fact that I could do that and then be clueless
link |
about this key part of life,
link |
it led me to go, there's a problem.
link |
Like I should have learned this in fifth grade.
link |
I should have learned this in eighth grade.
link |
Like everybody should come out
link |
with a basic knowledge of economics.
link |
You're an interesting example because you've created tools
link |
that change the lives of probably millions of people
link |
and the fact that you don't understand at the time
link |
of the creation of those tools, the basics economics
link |
of how like to build up a giant system is the problem.
link |
Yeah, it's a problem.
link |
And so during my PhD at the same time,
link |
this is back in 98, 99 at the same time,
link |
I was in a library, I was reading books on capitalism,
link |
I was reading books on Marxism,
link |
I was reading books on what is this thing?
link |
What does it mean?
link |
And I encountered, basically I encountered a set of writings
link |
from people that said they were the inheritors of Adam Smith.
link |
Read Adam Smith for the first time, right?
link |
Which is The Wealth of Nations
link |
and kind of this notion of emergent societies
link |
and realized, oh, there's this whole world out here
link |
of people and the challenge of economics is also political.
link |
Like, cause economics, people, different parties
link |
running for office, they want their economic friends.
link |
They want their economists to back them up, right?
link |
Or to be their magicians, like the magicians
link |
in Pharaoh's court, right?
link |
The people that are kind of say, hey, this is,
link |
you should listen to me because I've got the expert
link |
And so it gets really muddled, right?
link |
But I was looking at it from as a scientist going,
link |
what is this space?
link |
What does this mean?
link |
How does Paris get fed?
link |
How does, what is money?
link |
And I found a lot of writings that I really loved.
link |
I found some things that I really loved
link |
and I learned from that.
link |
It was writings from people like von Mises.
link |
He wrote a paper in 1920 that still should be read
link |
It was Economic Calculation
link |
in the Socialist Commonwealth.
link |
It was basically in response
link |
to the Bolshevik revolution in 1917.
link |
And his basic argument was it's not gonna work
link |
to not have private property.
link |
You're not gonna be able to come up with prices.
link |
The bureaucrats aren't gonna be able to determine
link |
how to allocate resources without a price system.
link |
And a price system emerges from people making trades.
link |
And they can only make trades if they have authority
link |
over the thing they're trading.
link |
And that creates information flow
link |
that you just don't have if you try to top down it.
link |
And it's like, huh, that's a really good point.
link |
Yeah, the prices have a signal that's used.
link |
And it's important to have that signal
link |
when you're trying to build a community
link |
of productive people like you would
link |
in the software engineering space.
link |
Yeah, the prices are actually
link |
an important signaling mechanism.
link |
Right, and that money is just a bartering tool.
link |
Right, so this is the first time I've encountered
link |
any of this concept, right, and the fact that,
link |
oh, this is actually really critical.
link |
Like it's so critical to our prosperity
link |
and that we're dangerously not learning about this,
link |
not teaching our children about this.
link |
So you had the three kids,
link |
you had to make some hard decisions.
link |
I had to make some money, right, had to figure it out.
link |
But I didn't really care.
link |
I mean, I've never been driven by money, just need it.
link |
Yeah, right, need to eat.
link |
So how did that resolve itself in terms of SciPy?
link |
So I would say it didn't really resolve itself.
link |
It sort of started a journey that I'm continuing on.
link |
I'm still on, I would say.
link |
I don't think it resolved itself.
link |
But I will say I went in eyes wide open.
link |
Like I knew that there were problems
link |
with giving stuff away and creating the market externalities
link |
that the fact that, yeah, people might use it
link |
and I might not get paid for it
link |
and I'll have to figure something else out to get paid.
link |
Like at least I can say I'm not bitter
link |
that a lot of people have used stuff that I've written
link |
and I haven't necessarily benefited economically from it.
link |
I've heard other people be bitter about that
link |
when they write or they talk.
link |
Like, oh, I should've got more value out of this.
link |
And I'm also, I want to create systems
link |
that let people like me who might have these desires
link |
to do things, let them benefit.
link |
So it actually creates more of the same.
link |
Not to turn on your bitterness module,
link |
but there's some aspect, I wish there was mechanisms for me
link |
to reward whoever created SciPy and NumPy
link |
because it brought so much joy to my life.
link |
I appreciate that.
link |
You know what I mean?
link |
The tip jar notion was there.
link |
I appreciate that.
link |
But there should be a very frictionless mechanism.
link |
There should be a frictionless mechanism.
link |
I would love to talk about some of the ideas I have
link |
because I actually came across,
link |
I think I've come up with some interesting notions
link |
that could work, but they'll require anything that will work
link |
takes time to emerge, right?
link |
Like things don't just turn overnight.
link |
That's definitely one thing I've also understood
link |
and learned is any fixes, that's why it's kind of funny.
link |
We often give credit to, oh, this president gets elected
link |
and oh, look how great things have done.
link |
And I saw that when I had a transition at Anaconda
link |
when a new CEO came in, right?
link |
And it's like the success that's happening,
link |
there's an inertia there.
link |
Yeah, and sometimes the decision you made
link |
like 10 years before is the reason why the success is the.
link |
So we're sort of just running around taking credit
link |
The credit assignment has like a delay to it
link |
that makes the credit assignment basically wrong
link |
Wrong more than right, exactly.
link |
And so I'm like, oh, this is, you know,
link |
that's the stuff I would read a ton about, you know,
link |
So I don't, I feel like I'm with you.
link |
Like I want the same thing.
link |
I want to be able to, and honestly, not for personally,
link |
I feel like I don't have any, I mean,
link |
we've done reasonably okay, but I've had to pursue it.
link |
Like that's really what started my trajectory from academia
link |
is reading that stuff led me to say,
link |
oh, entrepreneurship matters.
link |
So I love software, but we need more entrepreneurs
link |
and I wanna understand that better.
link |
So once I kind of had that virus infect my brain,
link |
even though I was on a trajectory
link |
to go to a tenure track position at a university
link |
and I was there for six years,
link |
I was kind of already out the door when I started.
link |
And we can get into that, but.
link |
Well, can I just ask you a quick question on,
link |
is there some design principles
link |
that were in your mind around SciPy?
link |
Like, is there some key ideas
link |
that were just like sticking to you
link |
that this is the fundamental ideas?
link |
Yeah, I would say so.
link |
I would think it's basically accessibility to scientists,
link |
like give them, give scientists and engineers tools
link |
that they don't have to think a lot about programming.
link |
So give them really good building blocks,
link |
give them functions that they wanna call
link |
and sort of just the right length of spelling.
link |
There's one tradition in programming where it's like,
link |
make very, very long names, right?
link |
And you can see it in some programming languages
link |
where the names get, take half the screen.
link |
And in the Fortran world, characters had to be six letters
link |
And that's way too much, too little.
link |
But I was like, I liked to have names
link |
that were informative but short.
link |
So even though Python, well this is a different conversation,
link |
but documentation is doing some work there.
link |
So when you look at great scientific libraries
link |
and functions, there's a richness of documentation
link |
that helps you get into the details.
link |
The first glance at a function gives you the intuition
link |
of all it needs to do by looking at the headers and so on.
link |
But to get the depths of all the complexities involved,
link |
all the options involved,
link |
documentation does some of the work.
link |
Documentation is essential, yeah.
link |
So that was actually a, so we thought about several things.
link |
One is we wanted plotting.
link |
We wanted interactive environment.
link |
We wanted good documentation.
link |
These are things we knew, we wanted.
link |
The reality is those took about 10 years to evolve, right?
link |
Given the fact that we didn't have a big budget,
link |
it was all volunteer labor.
link |
It was sort of, when Enthought got created
link |
and they started to try to find projects,
link |
people would pay for pieces
link |
and they were able to fund some of it.
link |
Not nearly enough to keep up with what was necessary.
link |
And no criticism, just simply the reality.
link |
I mean, it's hard to start a business
link |
and then do consulting and then also
link |
promote an open source project that's still fairly new.
link |
SciPy is fairly niche.
link |
We stayed connected all while I was a student,
link |
sorry, a professor.
link |
I went to BYU and started to teach.
link |
Electrical engineering, all the applied math courses.
link |
I loved teaching signal processing,
link |
probability theory, electromagnetism.
link |
I was, if you look at Rate My Professors,
link |
which my kids loved to do,
link |
I wasn't, I got some bad reviews because people.
link |
What was the criticism?
link |
I would speak too high of a level.
link |
Like I definitely had a calibration problem
link |
coming out of graduate work
link |
where I hate to be condescending to people.
link |
Like I really have a ton of respect for people fundamentally.
link |
Like my fundamental thing is I respect people.
link |
Sometimes that can lead to a,
link |
I was thinking they had more knowledge than they did.
link |
And so I would just speak at a very high level,
link |
assume they got it.
link |
But they need to rise to the standard that you set.
link |
I mean, that's one of the,
link |
some of the greatest teachers do that.
link |
And that was kind of what was inspiring me.
link |
But you also have to,
link |
I cannot say I was articulate
link |
with some of the greatest teachers, right?
link |
I was, like one classic example,
link |
when I first taught at BYU,
link |
my very first class, it was overheads,
link |
transparencies, overheads.
link |
Before projectors were really that common,
link |
I taught transparencies.
link |
I'm writing my notes out.
link |
I go in, room's half dark.
link |
I just blaring through these transparencies.
link |
Here it is, here it is, here it is.
link |
And I did give a quiz after two weeks.
link |
No one knew anything.
link |
Nothing I had taught had gotten anywhere.
link |
And I realized, okay, I'm not, this is not working.
link |
So I put away the transparencies
link |
and I turned around and just started using the chalkboard.
link |
And what it did is it slowed me down, right?
link |
The chalkboard just slowed me down
link |
and gave people time to process and to think.
link |
And then that made me focus.
link |
My writing wasn't great on the chalkboard,
link |
but I really love that part of like the teaching.
link |
So that entered SciPy's world in terms of,
link |
we always understood that there's a didactic aspect
link |
of SciPy, kind of how do you take the knowledge
link |
and then produce it?
link |
The challenge we had was the scope.
link |
Like ultimately SciPy was everything, right?
link |
And so 2001, when it first came out,
link |
people were starting to use it.
link |
No, this is cool, this is a tool we actually use.
link |
At the same time, 2001 timeframe,
link |
there was a little bit of like the Hubble Space Telescope,
link |
the folks at Hubble that started to say,
link |
hey, Python, we're gonna use Python
link |
for processing images from Hubble.
link |
And so Perry Greenfield was a good friend
link |
in running that program.
link |
And he had called me before I left for BYU and said,
link |
you know, we wanna do this,
link |
but Numeric actually has some challenges in terms of,
link |
you know, it's not, the array doesn't have enough types.
link |
We need more operations.
link |
You know, broadcasting needs to be a little more settled.
link |
They wanted record arrays.
link |
They wanted, you know, record arrays are like a data frame,
link |
but a little bit different,
link |
but they wanted more structured data.
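The record arrays mentioned above survive in today's NumPy as structured dtypes. A minimal sketch, assuming current NumPy; the field names and values are made up for illustration and are not taken from any Hubble pipeline:

```python
import numpy as np

# Hypothetical fields, chosen only for illustration.
star_dtype = np.dtype([("name", "U10"), ("ra", "f8"), ("dec", "f8"), ("flux", "f4")])

# Each element of the array is one record with four named fields.
stars = np.array(
    [("Vega", 279.23, 38.78, 25.1), ("Sirius", 101.29, -16.72, 98.7)],
    dtype=star_dtype,
)

# A field name selects one "column" across all records.
bright = stars[stars["flux"] > 50]

# Viewing the same memory as a recarray adds attribute-style field access.
rec = stars.view(np.recarray)
first_ra = rec.ra[0]
```

All fields of one record sit next to each other in memory, which is the "a little bit different" from a data frame that comes up later in the conversation.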
link |
So he had called me even early on then,
link |
and he said, you know, what,
link |
would you wanna work on something to make this work?
link |
And I said, yeah, I'm interested, but I'm going here,
link |
and I, you know, we'll see if I have time.
link |
So in the meantime, while I was teaching
link |
and SciPy was emerging, and I had a student,
link |
I was constantly, while I was teaching,
link |
trying to figure a way to fund this stuff.
link |
So I had a graduate student, my only graduate student,
link |
a Chinese fellow, Liu Hongze is his name, great guy.
link |
He wrote a bunch of stuff for iterative linear algebra,
link |
like got into writing some of the iterative
link |
linear algebra tools that are currently there in SciPy,
link |
and they've gotten better since,
link |
but this is in 2005, kept working on SciPy,
link |
but Perry had started working on a replacement
link |
to Numeric called NumArray.
link |
And in 2004, a package called nd_image,
link |
it was an image processing library
link |
that was written for NumArray,
link |
and it had in it a morphology tool.
link |
I don't know if you know what morphology is.
link |
It's opening, dilation, closing, you know,
link |
there was sort of this, as a medical imaging student,
link |
I knew what it was,
link |
because it was used in segmentation a lot.
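For readers who have not met morphology before, here is a rough sketch of binary erosion, dilation, opening, and closing written with plain NumPy. It assumes a fixed 3x3 square structuring element and treats everything outside the image as background; the function names are hypothetical, and this is not how nd_image or SciPy implement it:

```python
import numpy as np

def dilate(mask):
    """Binary dilation: a pixel turns on if any pixel in its 3x3 neighborhood is on."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    rows, cols = mask.shape
    out = np.zeros_like(mask)
    for dr in range(3):
        for dc in range(3):
            out |= padded[dr:dr + rows, dc:dc + cols]
    return out

def erode(mask):
    """Binary erosion: a pixel stays on only if its whole 3x3 neighborhood is on."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    rows, cols = mask.shape
    out = np.ones_like(mask)
    for dr in range(3):
        for dc in range(3):
            out &= padded[dr:dr + rows, dc:dc + cols]
    return out

def opening(mask):
    """Erosion followed by dilation: removes specks smaller than the element."""
    return dilate(erode(mask))

def closing(mask):
    """Dilation followed by erosion: fills holes smaller than the element."""
    return erode(dilate(mask))

# A 3x3 blob plus one isolated noisy pixel.
mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True
mask[0, 6] = True

opened = opening(mask)  # the blob survives, the isolated pixel does not
```

This is why the operations are useful for segmentation: opening cleans up noise around a segmented region without changing the region's overall shape much.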
link |
And in fact, I'd wanted to do something like that
link |
in Python, in SciPy, but just had never gotten around to it.
link |
So when it came out, but it worked only on NumArray,
link |
and SciPy needed Numeric,
link |
and so we effectively had the beginning of this split.
link |
And Numeric and NumArray didn't share data,
link |
they were just two, so you could have a gigabyte
link |
of NumArray data and a gigabyte of Numeric data,
link |
and they wouldn't share it.
link |
And so you had these,
link |
then you had these scientific libraries written on top.
link |
I got really bugged by that.
link |
I got really like, oh man, this is not good,
link |
we're not cooperating now,
link |
we're sort of redoing each other's work,
link |
and we're just this young community.
link |
So that's what led me, even though I knew it was risky,
link |
because my, you know, I was on a tenure track position,
link |
2004 I got reviewed.
link |
They said, hey, things are going okay,
link |
you're doing well, papers coming out,
link |
but you're kind of spending a lot of time
link |
doing this open source stuff, maybe do a little less of that,
link |
and a little more of the paper writing and grant writing,
link |
which was naive, but it was definitely the thinking.
link |
You're basically creating a thing
link |
which enables science in the 21st century.
link |
Maybe don't emphasize that so much in your three-year tenure review.
link |
It illustrates some of the challenges.
link |
It does, and it's, people mean well.
link |
Like, but we've gotten broken in a bunch of ways.
link |
Certain things, programming,
link |
understanding the role of software engineering,
link |
programming in society is a little bit lacking.
link |
Now, I was in an electrical engineering position.
link |
That's even worse there.
link |
Yeah, it was very, they were very focused,
link |
and so, you know, good people, and I had a great time,
link |
I loved my time, I loved my teaching,
link |
I loved all the things I did there.
link |
The problem was, the split was happening
link |
in this community that I loved, right?
link |
I saw people, and I went, oh my gosh,
link |
this is gonna be, this is not great,
link |
and so I happened, you know, fate,
link |
I had a class I had signed up for,
link |
it's a, I was trying to build an MRI system,
link |
so I had a kind of a radio, instead of a radio,
link |
a digital radio class, it was a digital MRI class.
link |
And I had people sign up, two people signed up,
link |
then they dropped, and so I had nobody in this class.
link |
So, and I didn't have any other courses to teach,
link |
and I thought, oh, I've got some time,
link |
and I'll just write, I'll just write a replacement,
link |
a merger of Numeric and NumArray.
link |
Like, I'll basically take the Numeric code base,
link |
add the features NumArray was adding,
link |
and then kind of come up with a single array library
link |
that everybody can use.
link |
So that's where NumPy came from,
link |
was my thinking, hey, I can do this,
link |
and who else is going to?
link |
Because at that point, I'd been around the community
link |
long enough, and I'd written enough C code,
link |
I knew, I knew the structures, and I,
link |
in fact, my first contribution to Numeric
link |
had been writing the C API documentation
link |
that went in the first documentation for NumPy,
link |
for Numeric, sorry, this is Paul Dubois,
link |
David Ascher, Konrad Hinsen, and myself.
link |
I got credit because I wrote this chapter,
link |
which is all the C API of Numeric, all the C stuff.
link |
So I said, I'm probably the one to do it,
link |
and nobody else is gonna do this.
link |
So it was sort of, out of a sense of duty and passion,
link |
knowing that, eh, I don't think my academic,
link |
I don't think the department here is gonna appreciate this,
link |
but it's the right thing to do.
link |
Can we just linger on that moment?
link |
Because the importance of the way you thought
link |
and the action you took, I feel is understated
link |
and is rare and I would love to see so much more of it
link |
because what happens as the tools become more popular,
link |
there's a split that happens.
link |
And it's a truly heroic and impactful action
link |
to in those early, in that early split,
link |
to step up and it's like great leaders throughout history,
link |
like get, what is it, Braveheart,
link |
like get on a horse and rally the troops
link |
because I think that can have, make a big difference.
link |
We have TensorFlow versus PyTorch
link |
in the machine learning community.
link |
We have the same problem today.
link |
It's actually bigger.
link |
I wonder if it's possible in the early days
link |
to rally the troops.
link |
It is possible, especially in the early days.
link |
The longer it goes, the harder, right?
link |
The more energy in the factions, the harder.
link |
But in the early days, it is possible
link |
and it's extremely helpful
link |
and there's a willingness there,
link |
but the challenge is there's just not a willingness
link |
There's not a willingness to, you know,
link |
like I was literally walking into a field
link |
saying I'm going to do this
link |
and here I am, like, you know,
link |
I have five kids at home now.
link |
Sometimes my wife hears these stories
link |
and she's like, you did what?
link |
I thought we were going to,
link |
I thought you were actually on a path
link |
to make sure we had resources and money, but,
link |
but again, there's a, there's an aspect,
link |
I'm a very hopeful person.
link |
I'm an optimistic person by nature.
link |
I learned that about myself later on.
link |
And part of my, my religious beliefs
link |
actually lead to that.
link |
And it's why I hold them dear
link |
because it's actually how I feel about,
link |
that's what leads me to these attitudes,
link |
sort of this hopefulness and this sense of,
link |
yeah, it may not work out for me financially
link |
or maybe, but that's not the ultimate gain.
link |
Like that's a thing, but it's not,
link |
that's not the scorecard for me.
link |
And so I just wanted to be helpful
link |
and I knew, and partly because of these SciPy conferences,
link |
because of the mailing list conversations,
link |
I knew there was a lot of need for this, right?
link |
And so I had this, it wasn't like I was alone
link |
in terms of no feedback.
link |
I had these people who knew, but it was crazy.
link |
Like people who at the time said,
link |
yeah, we didn't think you'd be able to do it.
link |
We thought it was crazy.
link |
And also instructive, like practically speaking,
link |
that you had a cool feature
link |
that you were chasing the morphology, like the.
link |
Like it's not just like.
link |
There's an end result.
link |
It's not some visionary thing.
link |
I'm going to unite the community.
link |
You were like. Correct.
link |
You were actually practically,
link |
this is what one person actually could do
link |
and actually build.
link |
Cause that is important.
link |
Cause you can get over your skis.
link |
You can definitely get over your skis.
link |
And I had, in fact, this almost got me over my skis, right?
link |
I would say, well, in retrospect, I hate looking back.
link |
I can tell you all the flaws with NumPy, right?
link |
When I go into it, there's lots of stuff that I'm like,
link |
oh man, that's embarrassing.
link |
I wish I had somebody stop me with a wet fish there.
link |
Like I needed, like what I'd wished I'd had
link |
was somebody with more experience in, certainly, library
link |
writing and array libraries.
link |
There's like, I wish I had me.
link |
I could go back in time and go do this, do that.
link |
There's a more important thing.
link |
Cause there's things we did that are still there
link |
that are problematic, that created challenges for later.
link |
And I didn't know it at the time.
link |
Didn't understand how important that was.
link |
And in many cases, didn't know what to do.
link |
Like there was pieces of the design of NumPy.
link |
I didn't know what to do until five years ago.
link |
Now I know what they should have been.
link |
But I didn't know at the time and nobody,
link |
and I couldn't get the help.
link |
Anyway, so I wrote it.
link |
It took about, it took four months to write
link |
the first version, then about 14 months to make it usable.
link |
But it was, it wasn't, it was that first four months
link |
of intense writing, coding, getting something out the door
link |
that worked that was, it was, it was definitely challenging.
link |
And then the big thing I did was create a new type object.
link |
That was probably the contribution.
link |
And then the fact that I added broad, not just broadcasting,
link |
but advanced indexing so that you could do masked indexing
link |
and indirect indexing instead of just slicing.
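A short sketch of the three kinds of indexing contrasted here, assuming current NumPy behavior: slicing returns a view onto the same memory, while masked (boolean) and indirect (integer array) indexing return copies of the selected elements.

```python
import numpy as np

a = np.arange(10) * 10          # 0, 10, 20, ..., 90

# Masked indexing: a boolean array picks the elements where it is True.
masked = a[a > 50]

# Indirect indexing: an integer list picks elements in any order, with repeats allowed.
picked = a[[7, 0, 3]]

# Plain slicing: a contiguous range, returned as a view that shares memory with `a`.
sliced = a[2:5]
shares = np.shares_memory(a, sliced)

# Writing through the view changes the original array.
sliced[0] = -1
```

After the last line, `a[2]` is `-1`, whereas `masked` and `picked` are independent copies and are unaffected.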
link |
So for people who don't know, and maybe you can elaborate,
link |
NumPy, I guess the vision in the narrowest sense
link |
is to have this object that represents
link |
n dimensional arrays.
link |
And like at any level of abstraction you want,
link |
but basically it could be a black box
link |
that you can investigate in ways that you would naturally
link |
want to investigate such objects.
link |
So you could do math on it easily.
link |
Math on it easily, yeah.
link |
So it had an associated library of math operations
link |
and effectively SciPy became an even larger set
link |
of math operations.
link |
So the key for me was I was going to write NumPy
link |
and then move SciPy to depend on NumPy.
link |
In fact, early on, one of the initial proposals
link |
was that we would just write SciPy
link |
and it would have the Numeric object inside of it.
link |
And it'd be SciPy.array or something.
link |
That turned out to be problematic because Numeric
link |
already had a little mini library of linear algebra
link |
and some functions, and it had enough momentum,
link |
enough users that nobody wanted to,
link |
they wanted backward compatibility.
link |
One of the big challenges of NumPy
link |
was I had to be backward compatible
link |
with both Numeric and NumArray
link |
in order to allow both of those communities to come together.
link |
There was a ton of work in creating
link |
that backward compatibility
link |
that also created echoes in today's object.
link |
Like some of the complexity in today's object
link |
is actually from that goal of backward compatibility
link |
to these other communities,
link |
which if you didn't have that, you'd do something different,
link |
which is instructive because a lot of things are there.
link |
You think, what is that there for?
link |
It's like, well, it's a remnant.
link |
It's an artifact of its historical existence.
link |
By the way, I love the empathy
link |
and the lack of ego behind that
link |
because I feel, you see that in the split
link |
in the JavaScript framework, for example,
link |
the arbitrary branching.
link |
I think in order to unite people,
link |
you have to kind of put your ego aside
link |
and truly listen to others.
link |
What do you love about NumArray?
link |
What do you love about Numeric?
link |
Like actually get a sense,
link |
we were talking about languages earlier,
link |
sort of empathize to the culture,
link |
the people that love something about this particular API,
link |
some of the naming style
link |
or the actual usage patterns
link |
and truly understand them
link |
and so that you can create that same draw
link |
in the united thing. I completely agree.
link |
I completely agree.
link |
And you have to also have enough passion
link |
that you'll do it.
link |
It can't be just like a perfunctory,
link |
oh yes, I'll listen to you
link |
and then I'm not really that excited about it.
link |
So it really is an aspect,
link |
it's a philosophical, like there's a philia,
link |
there's a love of esteeming of others.
link |
It's actually at the heart of what,
link |
it's sort of a life philosophy for me, right?
link |
That I'm constantly pursuing and that helped,
link |
absolutely helped.
link |
Makes me wonder in a philosophical,
link |
like looking at human civilization as one object,
link |
it makes me wonder how we can copy and paste Travis's
link |
Well, some aspects, maybe.
link |
Some aspects, right, right, exactly.
link |
Well, it's a good question.
link |
How do we teach this?
link |
How do we encourage it?
link |
How do we lift it?
link |
Because so much of the software world,
link |
it's giant communities, right?
link |
But it seems like so much is moved by,
link |
like little individuals.
link |
You talk about like Linus Torvalds.
link |
It's like, could you have not,
link |
could you have had Linux without him?
link |
Yeah, Guido and Python.
link |
Well, the SciPy community particularly,
link |
it's like I said, we wanted to build this big thing,
link |
but ultimately we didn't.
link |
What happened is we had mavericks and champions
link |
like John Hunter who created Matplotlib.
link |
We had Fernando Perez who created IPython.
link |
And so we sort of inspired each other,
link |
but then it kind of, there's sort of a culture
link |
of this selfless giving, the stewardship mentality,
link |
as opposed to ownership mentality,
link |
but stewardship and community focused,
link |
community focused, but intentional work.
link |
Like not waiting for everybody else to do the work,
link |
but you're doing it for the benefit of others
link |
and not worried about what you're gonna get.
link |
You're not worried about the credit.
link |
You're not worried about what you're gonna get.
link |
You're worried about, I later realized
link |
that I have to worry a little about credit,
link |
not because I want the credit,
link |
because I want people to understand
link |
what led to the results.
link |
Like, I don't, it's not about me.
link |
It's I want to understand this is what led to the result.
link |
So let's like, I think doing,
link |
and this is what had no impact on the result.
link |
Like let's promote, just like you said,
link |
I want to promote the attributes
link |
that help make us better off.
link |
How do we make more of Wes McKinney?
link |
Like Wes McKinney was critical to the success of Python
link |
because of his creation of pandas,
link |
and the roots of that were all the way back
link |
in Numeric and NumArray and NumPy,
link |
where NumPy created an array of records.
link |
Wes started to use that almost like a data frame,
link |
except it's an array of records.
link |
And data frame, the challenge is,
link |
okay, if you want to augment it, add another column,
link |
you have to insert, you have to do all this memory movement
link |
to insert a column.
link |
Whereas data frames became,
link |
oh, I'm going to have a loose collection of arrays.
link |
So it's a record of arrays that is a part of a data frame.
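The difference between an array of records and a record of arrays can be sketched as follows. This assumes current NumPy and uses made-up field names; the dictionary of columns is only a simplified stand-in for a data frame, not a description of pandas internals.

```python
import numpy as np

# Array of records: every row stores its fields side by side in one memory block.
records = np.array([(1, 2.0), (2, 3.5)], dtype=[("id", "i4"), ("price", "f8")])

# Adding a column means building a wider dtype and copying every existing field over.
wider_dtype = np.dtype(records.dtype.descr + [("qty", "i4")])
wider = np.empty(records.shape, dtype=wider_dtype)
for name in records.dtype.names:
    wider[name] = records[name]
wider["qty"] = [10, 20]

# Record of arrays: one independent array per column, held in a loose collection.
columns = {
    "id": np.array([1, 2], dtype="i4"),
    "price": np.array([2.0, 3.5]),
}

# Adding a column only adds one more array; the existing columns are not touched.
columns["qty"] = np.array([10, 20], dtype="i4")
```

The first layout keeps whole rows together, which suits record-at-a-time access; the second makes adding or dropping columns cheap, which is the trade-off described above.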
link |
And we thought about that back in the NumArray days,
link |
but Wes ended up doing the work to build it.
link |
And then also the operations that were relevant
link |
for data processing.
link |
What I noticed is just that each of these little things
link |
creates just another tick, another up.
link |
So NumPy ultimately took a little while,
link |
about six months in, people started to join me,
link |
Francesc Alted, Robert Kern, Charles Harris.
link |
And these people are many of the unsung heroes, I would say.
link |
People who are, you know,
link |
they sometimes don't get the credit they deserve
link |
because they were critical both to support,
link |
like, you know, it's hard and you want,
link |
you need some support, people need support.
link |
And I needed just encouragement.
link |
And they were helping and encouraging by contributing.
link |
And once, the big thing for me was when John Hunter,
link |
he had previously done kind of a simple thing
link |
called numerix to kind of, you know, between Numeric
link |
and NumArray, he had a little high level tool
link |
that would just select each one for Matplotlib.
link |
In 2006, he finally said,
link |
we're gonna just make NumPy the dependency of Matplotlib.
link |
As soon as he did that,
link |
and I remember specifically when he did that,
link |
I said, okay, we've done it.
link |
Like, that was when I knew we would see success.
link |
Before then it was still unsure,
link |
but that kind of started a roller coaster.
link |
And then 2006 to 2009.
link |
And then I've been floored by what it's done.
link |
Like, I knew it would help.
link |
I had no idea how much it would help.
link |
And it has to do with, again, the language thing.
link |
It just, people started to think in terms of NumPy.
link |
And that opened up a whole new way of thinking.
link |
And part of the story that you kind of mentioned,
link |
but maybe you can elaborate,
link |
is it seems like at some point in the story,
link |
Python took over science and data science.
link |
And bigger than that,
link |
the scientific community started to think like programmers
link |
or started to utilize the tools of computers to do,
link |
like at a scale that wasn't done with Fortran.
link |
Like at this gigantic scale,
link |
they started to open in their heart.
link |
And then Python was the thing.
link |
I mean, there's a few other competitors, I guess,
link |
but Python, I think, really, really took over.
link |
There's a lot of stories here
link |
that are kind of during this journey,
link |
because this is sort of the start of this journey in 2005, 2006.
link |
So my tenure committee, I applied for tenure in 2006, 2007.
link |
It came back, I split the department.
link |
I was very polarizing.
link |
I had some huge fans
link |
and then some people that said no way, right?
link |
So it was very, I was a polarizing figure in the department.
link |
It went all the way up to the university president.
link |
Ultimately, my department chair had the sway
link |
and they didn't say no.
link |
They said, come back in two years and do it again.
link |
And I went, eh, at that point, I was like,
link |
I mean, I had this interest in entrepreneurship,
link |
this interest in not the academic circles,
link |
not the, like, how do we make industry work?
link |
So I do have to give credit to that exploration of economics
link |
because that led me, oh, I had a lot of opinions.
link |
I was actually very libertarian at the time.
link |
And I still have some libertarian tendencies,
link |
but I'm more of a, I'm more of a collectivist libertarian.
link |
So you value broadly, philosophically freedom.
link |
I value broadly, philosophically freedom,
link |
but I also understand the power of communities,
link |
like the power of collective behavior.
link |
And so what's that balance, right?
link |
So by the time I was just,
link |
I gotta go out and explore this entrepreneur world.
link |
So I left academia.
link |
I said, no thanks, called my friend Eric here,
link |
who had, his company was going.
link |
I said, hey, could I join you and start this trend?
link |
And he, at that time they were using SciPy a lot.
link |
They were trying to get clients.
link |
And so I came down to Texas.
link |
And in Texas is where I sort of,
link |
it's my entrepreneur world, right?
link |
I left academia and went to entrepreneur world in 2007.
link |
So I moved here in 2007, kind of took a leap,
link |
knew nothing really about business,
link |
knew nothing about a lot of stuff there.
link |
There's, you know, for a long time,
link |
I've kept some connections to a lot of academics
link |
because I still value it.
link |
I still love the scientific tradition.
link |
I still value the essence and the soul and the heart
link |
of what is possible.
link |
Don't like a lot of the administration
link |
and the kind of, we can go into detail about why
link |
and where and how this happens,
link |
what are some of the challenges.
link |
I don't know, but I'm with you.
link |
So I'm still affiliated with MIT.
link |
I still love MIT because there's magic there.
link |
There's people I talk to, like researchers, faculty,
link |
in those conversations and the whiteboard
link |
and just the conversation, that's magic there.
link |
All the other stuff, the administration,
link |
all that kind of stuff seems to,
link |
you don't wanna say too harshly criticize
link |
sort of bureaucracies, but there's a lag
link |
that seems to get in the way of the magic.
link |
And I still have a lot of hope
link |
that that can change because I don't often see
link |
that particular type of magic elsewhere in the industry.
link |
So like we need that and we need that flame going.
link |
And it's the same thing as exactly as you said,
link |
it has the same kind of elements
link |
like the open source community does.
link |
And, but then if you, like the reason I stepped away,
link |
the reason I'm here, just like you did in Austin is like,
link |
if I wanna build one robot, I'll stay at MIT.
link |
But if I wanna build millions and make money enough
link |
to where I can explore the magic of that, then you can't.
link |
And I think that dance is...
link |
That translational dance has been lost a bit, right?
link |
And there's a lot of reasons for that.
link |
I'm not, I'm certainly not an expert on this stuff.
link |
I can opine like anybody else,
link |
but I realized that I wanted to explore entrepreneurship,
link |
which I, and really figure out,
link |
and it's been a driving passion for 20 years, 25 years.
link |
How do we connect capital markets and company?
link |
Cause again, I fell in love with the notion of,
link |
oh, profit seeking on its own is not a bad thing.
link |
It's actually a coordination mechanism
link |
for allocating resources that, you know,
link |
in an emergent way, right?
link |
That respects everybody's opinions, right?
link |
So this is actually powerful.
link |
So I say all the time, when I make a company
link |
and we do something that makes profit,
link |
what we're saying is, hey,
link |
we're collecting some of the world's resources
link |
and voluntarily people are asking us
link |
to do something that they like.
link |
And that's a huge deal.
link |
And so I really liked that energy.
link |
So that's what I came to do and to learn
link |
and to try to figure out.
link |
And that's what I've been kind of stumbling through
link |
since for the past 14 years.
link |
And so you were still working on NumPy.
link |
So NumPy was just emerging.
link |
One of the things I've done,
link |
it's worth mentioning because it emphasizes
link |
the exploratory nature of my thinking at the time.
link |
I said, well, I don't know how to fund this thing.
link |
I've got a graduate student I'm paying for
link |
and I've got no funding for him.
link |
And I had done some fundraising from the public
link |
to try to get public fundraisers in my lab.
link |
I didn't really wanna go out
link |
and just do the fundraising circuit
link |
the way it's traditionally done.
link |
So I wrote a book and I said, I'm gonna write a book
link |
and I'm gonna charge for it.
link |
It was called Guide to NumPy.
link |
And so ultimately NumPy became
link |
documentation driven development
link |
because I basically wrote the book
link |
and made sure the stuff worked or the book wouldn't work.
link |
So it really helped actually make NumPy become a thing.
link |
So writing that book,
link |
and it's not a page turner.
link |
Guide to NumPy is not a book you pick up
link |
and go, oh, this is great, over the fire.
link |
But it's where you could find the details,
link |
like how'd all this work.
link |
And a lot of people love that book.
link |
And so a lot of people ended up,
link |
so I said, look, I need to, so I'm gonna charge for it.
link |
And I got some flak for that.
link |
Not that much, just probably five angry messages,
link |
people yelling at me saying I was a bad guy
link |
for charging for this book.
link |
Was one of them Richard Stallman?
link |
No, I haven't really had any interaction with him personally,
link |
like I said, but there were a few,
link |
but actually surprisingly not.
link |
There was actually a lot of people like,
link |
no, it's fine, you can charge for a book.
link |
That's no big deal.
link |
We know that's a way you can try to make money
link |
around open source.
link |
So what I did, I did it in an interesting way.
link |
I said, well, kind of my ideas around IP law and stuff.
link |
I love the idea you can share something, you can spread it.
link |
Like once it's, the fact that you have a thing
link |
and copying is free, but the creation is not free.
link |
So how do you fund the creation and allow the copying?
link |
And in software, it's a little more complicated than that
link |
because creation is actually a continuous thing.
link |
It's not like you build a widget and it's done.
link |
It's sort of a process of emerging
link |
and continuing to create.
link |
But I wrote the book
link |
and had this market determined price thing.
link |
I said, look, I need, I think I said 250,000.
link |
If I make 250,000 from this book, I'll make it free.
link |
So as soon as I get that much money,
link |
or I said five years, so there's a time limit.
link |
Like it's not forever.
link |
That's really cool.
link |
I released it on this.
link |
And it's actually interesting
link |
because one of the people
link |
who also thought that was interesting
link |
ended up being Chris White,
link |
who was the director of the DARPA project
link |
that we got funding through at Anaconda.
link |
And the reason he even called us back
link |
is because he remembered my name from this book
link |
and he thought that was interesting.
link |
And so even though we hadn't gone to the demo days,
link |
we applied and the people said, yeah,
link |
nobody ever gets this without coming to the demo day first.
link |
This is the first time I've seen it.
link |
But it's because I knew, you know,
link |
Chris had done this and had this interaction.
link |
So it did have impact.
link |
I was actually really, really pleased by the result.
link |
I mean, I ended up in three years, I made 90,000.
link |
So sold 30,000 copies by myself.
link |
I just put it up on, you know, use PayPal and sold it.
link |
And that was my first taste of kind of, okay,
link |
this can work to some degree.
link |
And I, you know, all over the world, right?
link |
From Germany to Japan to, it was actually, it did work.
link |
And so I appreciated the fact that PayPal existed
link |
and I had a way to get the money, the distribution was simple.
link |
This is pre Amazon book stuff.
link |
So it was just publishing a website.
link |
It was the popularity of SciPy emerging
link |
and getting company usage.
link |
I ended up not letting it go the five years
link |
and not trying to make the full amount
link |
because, you know, a year and a half later,
link |
I was at Enthought.
link |
I had left academia, was at Enthought,
link |
and I kind of had a full time job.
link |
And then actually what happened is the documentation people,
link |
there's a group that said, hey,
link |
we want to do documentation for SciPy as a collective.
link |
And they're essentially needing the stuff in the book, right?
link |
And so they kind of ask,
link |
hey, could we just use the stuff in your book?
link |
And at that point I said, yeah, I'll just open it up.
link |
So that's, but it has served its purpose.
link |
And the money that I made actually funded my grad student.
link |
Like it was actually, you know,
link |
I paid him 25,000 a year out of that money.
link |
So the funny thing is if you do a very similar
link |
kind of experiment now with NumPy or something like it,
link |
you could probably make a lot more.
link |
It's probably true.
link |
Because of the tooling and the community building.
link |
Like the, and social media,
link |
that there's just a virality to that kind of idea.
link |
There'd be things to do.
link |
I've thought about that.
link |
And really I thought about a couple of books
link |
or a couple of things that could be done there.
link |
And I just haven't, right?
link |
Even, I tried to hire a ghostwriter this year too
link |
to see if that could help, but it didn't.
link |
But part of my problem is this,
link |
I've been so excited by a number of things
link |
that have stemmed from that.
link |
Like, so I came here, worked at Enthought for four years,
link |
graciously, Eric made me president.
link |
Then we started to work closely together.
link |
We actually helped him buy out his partner.
link |
It didn't end great.
link |
Like unfortunately Eric and I aren't real,
link |
aren't friends now.
link |
I still respect him.
link |
I have a lot, I wish we were,
link |
but he didn't like the fact that Peter and I
link |
started Anaconda, right?
link |
That was not, I mean, so there's two sides to that story.
link |
So I'm not gonna go into it, right?
link |
But you, as human beings
link |
and you wish you still could be friends.
link |
I mean, that's a story of great minds
link |
building great companies.
link |
Somehow it's sad that when there's that kind of.
link |
And I hold him in esteem.
link |
I'm grateful for him.
link |
I think Enthought still exists.
link |
They're doing great work helping scientists.
link |
They still run the SciPy conference.
link |
They have an R&D platform they're selling now
link |
that's a tool that you can go get today, right?
link |
So Enthought has played a role in the SciPy
link |
in supporting the community around SciPy, I would say.
link |
They ended up not being able to,
link |
they ended up building a tool suite
link |
to write GUI applications.
link |
Like that's where they could actually make
link |
that the business could work.
link |
And so supporting SciPy and NumPy itself
link |
wasn't as possible.
link |
Like they didn't, they tried.
link |
I mean, it was not just because,
link |
it was just because of the business aspect.
link |
So, and I wanted to build a company that could do,
link |
that could get venture funding, right?
link |
I mean, that's a longer story.
link |
We could talk a lot about that, but.
link |
And that's where Anaconda came to be.
link |
That's where Anaconda came to be.
link |
So let me ask you, it's a little bit for fun
link |
because you built this amazing thing.
link |
And so let's talk about like an old warrior
link |
looking over old battles.
link |
You've, you know, there's a sad letter in 2012
link |
that you wrote to the NumPy mailing list
link |
announcing that you're leaving NumPy.
link |
And some of the things you've listed
link |
as some of the things you regret
link |
or not regret necessarily, but some things to think about.
link |
If you could go back and you could fix stuff about NumPy
link |
or both sort of in a personal level,
link |
but also like looking forward,
link |
what kind of things would you like to see changed?
link |
So I think there's technical questions
link |
and social questions right there.
link |
First of all, you know, I wrote NumPy as a service
link |
and I spent a lot of time doing it.
link |
And then other people came to help make it happen.
link |
NumPy succeeded because of the work of a lot of people, right?
link |
So it's important to understand that.
link |
I'm grateful for the opportunity,
link |
the role I had, I could play
link |
and grateful that things I did had an impact,
link |
but they only had the impact they had
link |
because of the other people that came to the story.
link |
And so they were essential,
link |
but the way data types were handled,
link |
the way data types, we had array scalars, for example,
link |
that are really just a substitute for a type concept, right?
link |
So we had array scalars or actual Python objects
link |
so that there's for every, for a 32 bit float
link |
or a 16 bit float or a 16 bit integer,
link |
Python doesn't have a natural,
link |
it's just one integer, there's one float.
link |
Well, what about these lower precision types,
link |
these larger precision types?
link |
So we had them in NumPy
link |
so that you could have a collection of them,
link |
but then have an object in Python that was one of them.
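[Editor's illustration, not part of the conversation: a minimal sketch of what array scalars look like in practice, assuming a reasonably recent NumPy is installed. The variable names are chosen only for this example.]

```python
import numpy as np

# An array scalar: a single Python object holding one 32-bit float.
x = np.float32(1.5)

# A collection of the same kind of value.
arr = np.array([1.5, 2.5], dtype=np.float32)

# Indexing the array hands back an array scalar, not a built-in float.
element = arr[0]
print(type(element) is np.float32)

# np.float32 is not a subclass of Python's float; np.float64 is.
print(isinstance(x, float))
print(isinstance(np.float64(1.5), float))

# Lower-precision integers exist too; this one occupies 2 bytes.
small = np.int16(300)
print(small.dtype, small.itemsize)
```

Each sized type (`np.float32`, `np.int16`, and so on) is its own Python class, while the dtype describing an array's elements is a separate object; that duplication is the design being discussed here.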
link |
And there's questions about like in retrospect,
link |
I wouldn't have created those
link |
if I'd improved the type system.
link |
And like made the type system actually a Python type system
link |
as opposed to currently,
link |
it's a Python one level type system.
link |
I don't know if you know the difference
link |
between Python one, Python two,
link |
it's kind of technical, kind of depth,
link |
but Python two, one of its big things that Guido did,
link |
it was really brilliant.
link |
It was the actually Python one,
link |
all classes, new objects were one.
link |
If you as a user wrote a class,
link |
it was an instance of a single Python type
link |
called the class type, right?
link |
In Python two, he used a meta typing hook
link |
to actually go, oh, we can extend this
link |
and have users write classes that are new types.
link |
So he was able to have your user classes be actual types
link |
and the Python type system got a lot more rich.
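[Editor's illustration, not part of the conversation: a small sketch of the unified type system described above, as it appears in current Python. The `Quaternion` class is a hypothetical example, not something from NumPy.]

```python
class Quaternion:
    """A user-written class; in modern Python it is a real type."""

    def __init__(self, w, x, y, z):
        self.w, self.x, self.y, self.z = w, x, y, z


q = Quaternion(1.0, 0.0, 0.0, 0.0)

# The instance's type is the class itself, not a generic "instance" type.
print(type(q) is Quaternion)

# User classes and built-in types are both instances of the metatype `type`.
print(type(Quaternion) is type)
print(type(int) is type)
```

All three checks print `True`: a user class sits at the same level as `int` or `float` in the type system.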
link |
I barely understood that at the time that NumPy was written.
link |
And so I essentially in NumPy created a type system
link |
that was Python one era.
link |
It was every dtype is an instance of the same type
link |
as opposed to having new dtypes be really just Python types
link |
with additional metadata.
link |
What's the cost of that?
link |
Is it efficiency, is it usability?
link |
It's usability primarily.
link |
The cost isn't really efficiency.
link |
It's the fact that it's clumsy to create new types.
link |
And then one of the challenges,
link |
you wanna create new types.
link |
You want a quaternion type or you wanna add a new posit type
link |
or you wanna, so it's hard.
link |
And now, if we had done that well,
link |
when Numba came on the scene
link |
where we could actually compile Python code,
link |
it would integrate with that type system much cleaner.
link |
And now all of a sudden you could do gradual typing
link |
You could actually have Python when you add Numba
link |
plus better typing, could actually be a,
link |
you'd smooth out a lot of rough edges.
link |
But there's already, there's like,
link |
but are you talking about from the perspective
link |
of developers within NumPy or users of NumPy?
link |
Developers of new, not really users of NumPy so much.
link |
It's the development of NumPy.
link |
So you're thinking about like how to design NumPy
link |
so that it's contributors.
link |
Yeah, the contributors, it's easier.
link |
It's less work to make it better and to keep it maintained.
link |
And where that's impacted things, for example,
link |
Like all of a sudden GPUs start getting added
link |
and we don't have them in NumPy.
link |
Like NumPy should just work on GPUs.
link |
The fact that we'd have to download a whole other object
link |
called CuPy to have arrays on GPUs
link |
is just an artifact of history.
link |
Like there's no fundamental reason for it.
link |
Well, that's really interesting.
link |
If we could sort of go on that tangent briefly
link |
is you have PyTorch and other libraries like TensorFlow
link |
that basically tried to mimic NumPy.
link |
Like you've created a sort of platonic form
link |
of multi dimension. Basically, yeah.
link |
Well, and the problem was I didn't realize that.
link |
Platonic form has a lot of edges.
link |
They're like, well, we should cut those out
link |
before we present it.
link |
So I wonder if you can comment,
link |
is there like a difference between their implementations?
link |
Do you wish that they were all using NumPy
link |
or like in this abstraction of GPU?
link |
And sorry to interrupt that there's GPUs, ASICs.
link |
There might be other neuromorphic computing.
link |
There might be other kind of,
link |
or the aliens will come with a new kind of computer.
link |
Like an abstraction that NumPy should just operate nicely
link |
over the things that are more and more
link |
and smarter and smarter with this multi dimensional arrays.
link |
There's several comments there.
link |
We are working on something now called data-apis.org.
link |
data-apis.org, you can go there today.
link |
And it's our answer.
link |
It's me and Ralf and Athan and Aaron
link |
and a lot of companies are helping us at Quansight Labs.
link |
It's not unifying all the arrays.
link |
It's creating an API that is unified.
link |
So we do care about this
link |
and we're trying to work through it.
link |
I actually had the chance to go and meet
link |
with the TensorFlow team and the PyTorch team
link |
and talk to them after exiting Anaconda.
link |
Just talking about,
link |
because the first year after leaving Anaconda in 2018,
link |
I became deeply aware of this and realized that,
link |
oh, this split in the array community that exists today
link |
makes what I was concerned about in 2005 pretty parochial.
link |
It's a lot worse, right?
link |
Now there's a lot more people.
link |
So perhaps the industry can sustain more stacks, right?
link |
There's a lot of money,
link |
but it makes it a lot less efficient.
link |
I mean, but I've also learned to appreciate,
link |
it's okay to have some competition.
link |
It's okay to have different implementations,
link |
but it's better if you can at least refactor some parts.
link |
I mean, you're gonna be more efficient
link |
if you can refactor parts.
link |
It's nice to have competition over things,
link |
over what is nice to have competition.
link |
They're innovative.
link |
And then maybe on the infrastructure,
link |
whatever, however you define infrastructure,
link |
that maybe it's nice to have come together.
link |
And I think, but it was interesting to hear the stories.
link |
I mean, TensorFlow came out of a C++ library,
link |
Jeff Dean wrote, I think,
link |
that was basically how they were doing inference, right?
link |
And then they realized, oh,
link |
we could do this TensorFlow thing.
link |
That C++ library, then what was interesting to me
link |
was the fact that both Google and Facebook did not,
link |
it's not like they supported Python or NumPy initially.
link |
They just realized they had to.
link |
They came to this world and then all the users were like,
link |
hey, where's the NumPy interface?
link |
Oh, and then they kind of came late to it
link |
and then they had these bolt ons.
link |
TensorFlow's bolt on, I don't mean to offend,
link |
but it was so bad.
link |
It's the first time that I'm usually,
link |
I mean, one of the challenges I have
link |
is I don't criticize enough in the sense
link |
that I don't give people input enough, you know, if.
link |
I think it's universally agreed upon
link |
that the bolt ons on TensorFlow were.
link |
But I went to, it was a talk given at Mallorca in Spain
link |
and a great guy came and gave a talk and I said,
link |
you should never show that API again
link |
at a PyData conference.
link |
Like that was, that's terrible.
link |
Like you're taking this beautiful system we've created
link |
and like you're corrupting all these poor Python people,
link |
forcing them to write code like that
link |
or thinking they should.
link |
Fortunately, you know, they adopted Keras as their,
link |
and Keras is better.
link |
And so Keras, TensorFlow is fine, is reasonable,
link |
but they bolted it on.
link |
Like Facebook had their own C++ library for doing inference
link |
and they also had the same reaction, they had to do this.
link |
One big difference is Facebook,
link |
maybe because of the way it's situated in part of FAIR,
link |
part of the research library,
link |
TensorFlow is definitely used and, you know,
link |
they have to make, they couldn't just open it up
link |
and let the community, you know, change what that is.
link |
Cause I guess they were worried
link |
about disrupting their operations.
link |
Facebook's been much more open to having community input
link |
on the structure itself.
link |
Whereas Google and TensorFlow,
link |
they're really eager to have community users,
link |
people use it and build the infrastructure,
link |
but it's much more walled.
link |
Like it's harder to become a contributor to TensorFlow.
link |
And it's also, this is very difficult question to answer
link |
and don't mean to be throwing shade at anybody,
link |
but you have to wonder, it's the Microsoft question
link |
of when you have a tool like PyTorch or TensorFlow,
link |
how much are you tending to the hackers
link |
and how much are you tending to the big corporate clients?
link |
So like the ones that,
link |
do you tend to the millions of people
link |
that are giving you almost no money,
link |
or do you tend to the few
link |
that are giving you a ton of money?
link |
I tend to stand with the people.
link |
Cause I feel like if you nurture the hackers,
link |
you will make the right decisions in the long term
link |
that will make the companies happy.
link |
I lean that way too.
link |
But then you have to find the right dance.
link |
But it's a balance.
link |
Cause you can lean to the hackers and run out of money.
link |
Which has been some of the challenge I've faced
link |
in the sense that,
link |
like I would look at some of the experiments,
link |
like NumPy, the fact that we have this split
link |
is a factor of I wasn't able to collect more money
link |
towards NumPy development.
link |
I mean, I didn't succeed in the early days
link |
of getting enough financial contribution to NumPy
link |
so that they could work on it.
link |
I couldn't work on it full time.
link |
I had to just catch an hour here, an hour there.
link |
And I basically not liked that.
link |
Like I've wanted to be able to do something about that
link |
for a long time and try to figure out how,
link |
well, there's lots of ways.
link |
I mean, possibly one could say,
link |
we had an offer from Microsoft
link |
at early days of Anaconda.
link |
2014, they offered to come buy us, right?
link |
The problem was the right people at Microsoft
link |
didn't offer to buy us.
link |
And they were still,
link |
they were, it was really a,
link |
we were like a second,
link |
they had really bought, they just bought R,
link |
the R company called,
link |
it was not RStudio,
link |
but it was another R company that was emergent.
link |
And it was kind of a,
link |
well, we should also get a Python play,
link |
but they were really doubling down on R.
link |
And so it was like,
link |
it was where you would go to die.
link |
So it's not, it wasn't,
link |
it was before Satya was there.
link |
Satya had just started.
link |
And the offer was coming from someone
link |
two levels down from him.
link |
And if it had come from Scott Guthrie,
link |
so I got a chance to meet Scott Guthrie,
link |
great guy, I like him.
link |
If an offer had come from him,
link |
probably would be at Microsoft right now.
link |
That'd be fascinating.
link |
That would be really nice actually,
link |
especially given what Microsoft has since done
link |
for the open source community and all those things.
link |
Yes, I think they're doing well.
link |
I really like some of the stuff they've been doing.
link |
They're still working,
link |
and they've, you know,
link |
they've hired Guido now,
link |
and they've hired a lot of Python developers.
link |
Wait, Guido's now at Microsoft?
link |
Yeah, he works at Microsoft.
link |
Which, he retired,
link |
then he came out of retirement,
link |
and he's working now.
link |
I was just talking to him,
link |
and he didn't mention this person.
link |
I should investigate this further.
link |
Because I know he loved Dropbox,
link |
but I wasn't sure what he was doing,
link |
Well, he was kind of saying he'd retire,
link |
but, and it's literally been five years
link |
since I last sat down and really talked to Guido.
link |
Guido's a technology expert, right?
link |
He's a, so I came,
link |
I was excited because I'd finally figured out
link |
the type system for NumPy.
link |
I wanted to kind of talk about that with him,
link |
and I kind of overwhelmed him.
link |
Could you stay in that,
link |
just for a brief moment,
link |
because you're a fascinating person
link |
in the history of programming.
link |
He is a fascinating person.
link |
What have you learned from Guido
link |
about programming, about life?
link |
I've been a fan of Guido's.
link |
You know, we have a chance to talk.
link |
Some, I wouldn't say, you know,
link |
we talk all the time.
link |
He may, but we talk enough to,
link |
in fact, when I first started NumPy,
link |
one of the first things I did was I had a,
link |
I asked Guido for a meeting
link |
with him and Paul Dubois in San Mateo.
link |
And I went and met him for lunch.
link |
And basically, to say,
link |
maybe we can actually,
link |
part of the strategy for NumPy
link |
was to get it into Python 3,
link |
and maybe be part of Python.
link |
And so we talked about that.
link |
That's a cool conversation.
link |
And about that approach, right?
link |
I would have loved to be a fly on the wall.
link |
And over the years for Guido,
link |
Like, he was willing to listen to people's ideas.
link |
And over the years,
link |
now generally, you know,
link |
I'm not saying universally that's been true,
link |
but generally that's been true.
link |
So he's willing to listen.
link |
He's willing to defer.
link |
Like on the scientific side,
link |
he would just kind of defer.
link |
He didn't really always understand
link |
what we were doing.
link |
One place where he didn't enough
link |
was we missed a matrix multiply operator.
link |
Like that finally got added to Python,
link |
but about 10 years later than it should have.
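[Editor's illustration, not part of the conversation: the operator that was eventually added is `@`, available from Python 3.5. The toy `Matrix` class below is hypothetical and exists only to show the `__matmul__` hook; real code would use NumPy arrays.]

```python
class Matrix:
    """A toy row-major matrix, only to demonstrate the @ operator hook."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __matmul__(self, other):
        # Transpose the right-hand operand so its columns can be iterated.
        cols = list(zip(*other.rows))
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in cols]
             for row in self.rows]
        )


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[0, 1], [1, 0]])

# `a @ b` calls a.__matmul__(b).
c = a @ b
print(c.rows)  # [[2, 1], [4, 3]]
```

Before this operator existed, array libraries had to choose between making `*` mean elementwise or matrix multiplication, or fall back on a function call such as `dot(a, b)`.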
link |
But the reason was because nobody,
link |
it takes a lot of effort.
link |
And I learned this while I was writing NumPy.
link |
I also wrote tools to Python.
link |
I began with Python Dev,
link |
and I added some pieces to Python.
link |
Like the memoryview object.
link |
I wanted the structure of NumPy into Python.
link |
So we didn't get NumPy into Python,
link |
but we got the basic structure of it into Python.
link |
Like, so you could build on it.
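[Editor's illustration, not part of the conversation: a sketch of the array-like structure that the built-in `memoryview` exposes today, using only the standard library. The buffer contents are made up for the example.]

```python
import array

# A flat buffer of six C doubles.
buf = array.array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])

# memoryview exposes format, item size, and shape without copying the data.
view = memoryview(buf)
print(view.format, view.itemsize, view.shape)

# Reinterpret the same memory as a 2 x 3 grid; the cast goes through bytes.
grid = view.cast("B").cast("d", shape=[2, 3])
print(grid.shape, grid.strides)
print(grid.tolist())

# The grid shares memory with buf, so a write to buf is visible through it.
buf[5] = 50.0
print(grid[1, 2])
```

Shape, strides, and an element format over a shared block of memory are the same ingredients a NumPy array is built from, which is the sense in which the structure made it into the language.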
link |
Nobody did for a while,
link |
but eventually database authors started to.
link |
And it's a lot better.
link |
And also Antoine Pitrou and Stefan Krah
link |
actually fixed the memoryview object.
link |
Cause I wrote the underlying infrastructure in C,
link |
but the Python exposure was terrible
link |
until they came in and fixed it.
link |
Partly because I was writing NumPy,
link |
and NumPy was the Python exposure.
link |
I didn't really care about
link |
if you didn't have NumPy installed.
link |
Anyway, Guido opened up ideas,
link |
technologically brilliant.
link |
Like really, I really got a lot of respect for him
link |
when I saw what he did
link |
with this type class merger thing.
link |
It was actually tricky, right?
link |
And then willing to share, willing to share his ideas.
link |
So the other thing early on in 1998,
link |
I said, I wrote my first extension module.
link |
The reason I could is because he'd written this blog post
link |
on how to do reference counting, right?
link |
And without it, I would have been lost, right?
link |
But he was willing to at least try to write this post.
link |
And so he's been motivated early on with Python.
link |
There's a computer science for everybody.
link |
You kind of have this early on desire to,
link |
oh, maybe we should be pushing programming to more people.
link |
So he had this populist notion, I guess,
link |
or populist sense to learn that there's a certain skill,
link |
and I've seen it in other people too,
link |
of engaging with contributors sufficiently to,
link |
because when somebody engaged with you
link |
and wants to contribute to you,
link |
if you ignore them, they go away.
link |
So building that early contributor base
link |
requires real engagement with other people.
link |
And he would do that.
link |
Can you also comment on this tragic stepping down
link |
from his position as the benevolent dictator for life
link |
over the wars, you know?
link |
The Walrus operator?
link |
The Walrus operator was the last battle.
link |
I don't know if that's the cause of it,
link |
but there's this, for people who don't know,
link |
you can look up, there's the Walrus operator,
link |
which looks like a colon and equal sign.
link |
Yeah, colon, equal sign.
link |
And it actually does maybe the thing
link |
that an equal sign should be doing.
link |
Yeah, maybe, right, exactly.
link |
But it's just historically,
link |
equal sign means something else.
link |
It just means assignment.
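[Editor's illustration, not part of the conversation: a minimal example of the walrus operator `:=`, added in Python 3.8. The sample string and pattern are invented for the example.]

```python
import re

line = "numpy==1.21.0"

# `:=` assigns and yields the value inside an expression, so the match
# object can be bound and tested in the same `if` condition.
if (m := re.match(r"(\w+)==([\d.]+)", line)) is not None:
    name, version = m.groups()
    print(name, version)

# Plain `=` is a statement, so the equivalent code needs a separate line.
m2 = re.match(r"(\w+)==([\d.]+)", line)
if m2 is not None:
    print(m2.group(1))
```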
link |
So he stepped down over this.
link |
What do you think about the pressure of leadership?
link |
It's something that, you mentioned the letter I wrote
link |
in NumPy at the time.
link |
That was a hard time, actually.
link |
I mean, there's been really hard times.
link |
You get criticized, right?
link |
And you get pushed, and you get,
link |
not everybody loves what you do.
link |
Like anytime you do anything that has impact at all,
link |
you're not universally loved, right?
link |
You get some real critics.
link |
And that's an important energy,
link |
because it's impossible for you to do everything right.
link |
You need people to be pushing.
link |
But sometimes people can get mean, right?
link |
People can, I prefer to give people the benefit of the doubt.
link |
I don't immediately assume they have bad intentions.
link |
And maybe for other, maybe that doesn't happen for everybody.
link |
For whatever reason, their past,
link |
their experiences with people, they sometimes have bad,
link |
so they immediately attribute to you bad intentions.
link |
So you're like, where did this come from?
link |
I mean, I'm definitely open to criticism,
link |
but I think you're misinterpreting the whole point.
link |
Because I would get that, certainly when I started Anaconda.
link |
Sometimes I say to people,
link |
I care enough about entrepreneurship
link |
to make some open source people uncomfortable.
link |
And I care enough about open source
link |
to make investors uncomfortable.
link |
So I sort of, you create kind of doubters on both sides.
link |
So when you have, and this is just a plea
link |
to the listener and the public, I've noticed this too,
link |
that there's a tendency, and social media makes this worse,
link |
when you don't have perfect information about the situation,
link |
you tend to fill the gaps with the worst possible,
link |
or at least a bad story that fills those gaps.
link |
And I think it's good to live life,
link |
maybe not fully naively, but filling in the gaps
link |
with the good, with the best, with the positive,
link |
with the hopeful explanation of why you see this.
link |
So if you see somebody like you trying to make money
link |
on a book about NumPy,
link |
there's a million stories around that that are positive.
link |
And those are good to think about,
link |
to project positive intent on the people.
link |
Because for many reasons, usually because people are good
link |
and they do have good intent.
link |
And also when you project that positive intent,
link |
people will step up to that too.
link |
It's a great point.
link |
It has this kind of viral nature to it.
link |
And of course with Twitter, early on figured out,
link |
and Facebook is that they can make a lot of money
link |
and engagement from the negative.
link |
So there's this, we're fighting this mechanism.
link |
Which is challenging.
link |
It's just easier to be.
link |
And then for some reason, something in our minds
link |
really enjoys sharing that and getting all excited
link |
about the negativity.
link |
Some protective mechanism perhaps that we're gonna get eaten
link |
if we don't, yeah.
link |
For us to be effective as a group of people
link |
in a software engineering project,
link |
you have to project positive intent, I think.
link |
And I think that's very,
link |
and so that happens in this space.
link |
But Python has done a reasonable job in the past,
link |
but here is a situation where I think it started
link |
to get this pressure where it didn't.
link |
I really didn't, I didn't know enough about what happened.
link |
I've talked to several people about it.
link |
And I know most of the steering committee members today,
link |
one person nominated me for that role,
link |
but it's the wrong role for me right now, right?
link |
I have a lot of respect for the Python developer space
link |
and the Python developers.
link |
I also understand the gap between computer science
link |
Python developers and array programming developers
link |
or science developers.
link |
And in fact, Python succeeds in the array space
link |
the more it has people in that boundary.
link |
And there's often very few.
link |
Like I was playing a role in that boundary
link |
and working like everything to try to keep up
link |
with even what Guido was saying, like I'm a C programmer,
link |
but not a computer scientist.
link |
Like I was an engineer and physicist and mathematician,
link |
and I didn't always understand
link |
what they were talking about
link |
and why they would have opinions the way they did.
link |
So, you know, you have to listen and try to understand.
link |
Then you also have to explain your point of view
link |
in a way they can understand.
link |
And that takes a lot of work.
link |
And that communication is always the challenge.
link |
And it's just what we're describing here
link |
about the negativity is just another form of that.
link |
Like how do we come together?
link |
And it does appear we're wired anyway
link |
to at least have a, there's a part of us
link |
that will enemy, you know, friend, enemy.
link |
And we see, yeah, it's like,
link |
why are we wiring on the enemy front?
link |
So why are we pushing that?
link |
Why are we promoting that so deeply?
link |
Assume friend until proven otherwise.
link |
So, cause you have such a fascinating mind in all of this.
link |
Let me just ask you these questions.
link |
So one interesting side on the Python history
link |
is the move from Python two to Python three.
link |
You mentioned move from Python one to Python two,
link |
but the move from Python two to Python three
link |
is a little bit interesting
link |
because it took a very long time.
link |
It broke, you know, quite a small way
link |
backward compatibility, but even that small way
link |
seemed to have been very painful for people.
link |
Is there lessons you draw?
link |
Oh man, tons of lessons.
link |
From how long it took and how painful it seemed to be?
link |
Yeah, tons of lessons.
link |
Well, I mentioned here earlier
link |
that NumPy was written in 2005.
link |
It was in 2005 that I actually went to Guido
link |
to talk about getting NumPy into Python three.
link |
Like my strategy was to,
link |
oh, we were moving to Python three.
link |
Let's have that be, and it seems funny in retrospect
link |
because like, wait, Python three,
link |
that was in 2020, right?
link |
When we finally ended the support for Python two
link |
The reason it took a long time,
link |
a lot of time, I think it was because one of the things is
link |
there wasn't much to like about Python three.
link |
3.0, 3.1, it really wasn't until 3.3.
link |
Like I consider Python 3.3 to be Python 3.0.
link |
But it wasn't until Python 3.3
link |
that I felt there's enough stuff in it
link |
to make it worth anybody using it, right?
link |
And then 3.4 started to be, oh yeah, I want that.
link |
And then 3.5 has the matrix multiply operator,
link |
and now it's like, okay, we gotta use that.
link |
Plus the libraries that started leveraging
link |
some of the features of Python three.
link |
So it really, the challenge was it was,
link |
but it also illustrated a truism that, you know,
link |
when you have inertia,
link |
when you have a group of people using something,
link |
it's really hard to move them away from it.
link |
You can't just change the world on them.
link |
And Python three, you know, made some,
link |
I think it fixed some things Guido had always hated.
link |
I don't think he didn't like the fact
link |
that print was a statement.
link |
He wanted to make it a function.
link |
But in some sense, that's a bit of gratuitous change
link |
And you could argue, and people have,
link |
but one of the challenges was there wasn't enough features
link |
and too many just changes without features.
link |
And so the empathy for the end user
link |
as to why they would switch wasn't there.
link |
I think also it illustrated just the funding realities.
link |
Like Python wasn't funded.
link |
Like it was also a project
link |
with a bunch of volunteer labor, right?
link |
It had more people, so more volunteer labor,
link |
but it was still, it was funded in the sense
link |
that at least Guido had a job.
link |
And I've learned some of the behind the scenes on that now
link |
since talking to people who have lived through it
link |
and maybe not on air, we can talk about some of that.
link |
But it's interesting to see, but Guido had a job,
link |
but his full time job wasn't just work on Python.
link |
Like he had other things to do.
link |
It is wild, isn't it?
link |
It's wild how few people are funded.
link |
And how much impact they have.
link |
Maybe that's a feature not a bug, I don't know.
link |
Maybe, yes, exactly.
link |
At least early on, like it's sort of, I know, yeah.
link |
It's like Olympic athletes are often severely underfunded,
link |
but maybe that's what brings out the greatness.
link |
Perhaps, yes, correct.
link |
Maybe this is the essential part of it.
link |
Because I do think about that in terms of,
link |
I currently have an incubator for open source startups.
link |
Like what I'm trying to do right now
link |
is create the environment I wished had existed
link |
when I was leaving academia with NumPy
link |
and trying to figure out what to do.
link |
I'm trying to create those opportunities and environments.
link |
So, and that's what drives me still,
link |
is how do I make the world easier
link |
for the open source entrepreneur?
link |
So let me stay, I mean, I could probably stay on NumPy
link |
for a long time, but this is fun question.
link |
So Andrej Karpathy leads the Tesla Autopilot team,
link |
and he's also one of the most like legit programmers I know.
link |
It's like he builds stuff from scratch a lot,
link |
and that's how he builds intuition about how a problem works.
link |
He just builds it from scratch, and I always love that.
link |
And the primary language he uses is Python
link |
for the intuition building.
link |
But he posted something on Twitter saying
link |
that they got a significant improvement
link |
on some aspect of their like data loading, I think,
link |
by switching away from np.sqrt,
link |
so the NumPy's implementation of square root,
link |
to math.sqrt, and then somebody else commented
link |
that you can get even a much greater improvement
link |
by using the vanilla Python square root, which is like.
link |
And it's fascinating to me, I just wanted to.
link |
So that was some shade throwing at some.
link |
No, no, and yes, we're talking about.
link |
It's a good way to ask the trade off
link |
between usability and efficiency broadly in NumPy,
link |
but also on these specific weird quirks
link |
of like a single function.
link |
Yep, so on that point, if you use a NumPy math function
link |
on a scalar, it's gonna be slower
link |
than using a Python function on that scalar.
link |
But because the math object in NumPy is more complicated,
link |
because you can also call that math object on an array.
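[Editor's illustration, not part of the conversation: a rough sketch of the scalar case being described, assuming NumPy is installed. The timings are machine-dependent and are printed only for comparison; no particular ratio is claimed.]

```python
import math
import timeit

import numpy as np

x = 2.0

# Both calls compute the same value, but np.sqrt goes through the general
# ufunc machinery and returns a NumPy scalar rather than a built-in float.
py_result = math.sqrt(x)
np_result = np.sqrt(x)
print(type(py_result).__name__, type(np_result).__name__)

# The same np.sqrt object also accepts whole arrays.
values = np.array([1.0, 4.0, 9.0])
roots = np.sqrt(values)
print(roots)

# Per-call overhead on a single scalar, measured over many calls.
t_math = timeit.timeit(lambda: math.sqrt(x), number=100_000)
t_numpy = timeit.timeit(lambda: np.sqrt(x), number=100_000)
print(f"math.sqrt: {t_math:.4f}s  np.sqrt: {t_numpy:.4f}s")
```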
link |
And so effectively, it goes through a similar machine.
link |
There aren't enough of the, which you would do
link |
and you could do like checks and fast paths.
link |
So yeah, if you're basically doing a list,
link |
if you run over a list, in fact,
link |
for problems that are less than 1,000,
link |
even maybe 10,000 is probably the,
link |
if you're going more than 10,000,
link |
that's where you definitely need to be using arrays.
link |
But if you're less than that, and for reading,
link |
if you're doing a reading process
link |
and essentially it's not compute bound, it's IO bound.
link |
And so you're really taking lists of 1,000 at a time
link |
and doing work on it.
link |
Yeah, you could be faster just using Python,
link |
straight up Python.
link |
See, but also, and this is the side to the top,
link |
there's the fundamental questions
link |
when you look at the long arc of history,
link |
it's very possible that np.sqrt is much faster.
link |
So like in terms of like, don't worry about it,
link |
it's the evils of over optimization or whatever,
link |
all the different quotes around that,
link |
is sometimes obsessing about this particular little quirk
link |
is not sufficient.
link |
For somebody like, if you're trying to optimize your path,
link |
I mean, I agree, premature optimization
link |
creates all kinds of challenges, right?
link |
Because now, but you may have to do it.
link |
I believe the quote is, it's the root of all evil.
link |
It's the root of all evil, right?
link |
Let's give Donald Knuth, I think,
link |
or is he more than somebody else?
link |
Well, Don Knuth is kind of like Mark Twain,
link |
people just attribute stuff to him, I don't know.
link |
And it's fine because he's brilliant.
link |
So, no, I was a LaTeX user myself,
link |
and so I have a lot of respect,
link |
and he did more than that, of course,
link |
but yeah, someone I really appreciate
link |
in the computer science space.
link |
Yeah, I don't, I think that's appropriate.
link |
There's a lot of little things like that,
link |
where people actually, if you understood it,
link |
you go, yeah, of course, that's the case.
link |
And the other part, the other part I didn't mention,
link |
and Numba was a thing we wrote early on,
link |
and I was really excited by Numba
link |
because it's something we wanted,
link |
it was a compiler for Python syntax,
link |
and I wanted it from the beginning of writing NumPy
link |
because of this function question,
link |
like taking, the power of arrays
link |
is really that you can write functions using all of it.
link |
It has implicit looping, right?
link |
So you don't worry about,
link |
I write this n dimensional for loop
link |
with four loops, four for statements.
link |
You just say, oh, big four dimensional array,
link |
I'm gonna do this operation, this plus, this minus,
link |
this reduction, and you get this,
link |
it's called vectorization in other areas,
link |
but you can basically think at a high level
link |
and get massive amounts of computation done
link |
with the added benefit of,
link |
oh, it can be parallelized easily.
link |
It can be put in parallel.
link |
You don't have to think about that.
link |
In fact, it's worse to go decompose your,
link |
you write the for loops
link |
and then try to infer parallelism from for loops.
link |
That's actually a harder problem
link |
than to take the array problem
link |
and just automatically parallelize that problem.
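The contrast between explicit loops and implicit looping can be sketched as follows. The array shapes and variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 4, 5, 6))
b = rng.random((3, 4, 5, 6))

# Explicit version: four nested for statements, one per dimension.
looped = np.empty_like(a)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        for k in range(a.shape[2]):
            for m in range(a.shape[3]):
                looped[i, j, k, m] = a[i, j, k, m] + b[i, j, k, m]

# Array version: the looping is implicit in the operation.
vectorized = a + b

# A reduction collapses one axis, again with no explicit loop.
last_axis_sums = vectorized.sum(axis=-1)
```

Both versions compute the same values. The array version states the whole operation at once, so a library does not have to recover that structure from nested loops before it can split the work up.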
link |
That's what, and so functions in NumPy
link |
are called universal functions, ufuncs.
link |
So square root is an example of a ufunc.
link |
There are others, sine, cosine, add, subtract.
link |
In fact, one of the first libraries in SciPy
link |
was something called Special
link |
where I added Bessel functions
link |
and all these special functions that come up in physics
link |
and I added them as ufuncs so they could work on arrays.
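A small sketch of what "ufunc" means in practice, using only functions that ship with NumPy. The sample values are arbitrary.

```python
import numpy as np

# np.sqrt, np.sin, np.cos, np.add and np.subtract are all np.ufunc instances.
names = ("sqrt", "sin", "cos", "add", "subtract")
all_ufuncs = all(isinstance(getattr(np, name), np.ufunc) for name in names)

x = np.array([1.0, 4.0, 9.0])
roots = np.sqrt(x)               # applied element by element
total = np.add.reduce(x)         # ufuncs also carry methods such as reduce
running = np.add.accumulate(x)   # running sum along the array
```

The special functions in SciPy follow the same pattern. For example, `scipy.special.jv` (a Bessel function of the first kind) accepts arrays in the same way.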
link |
So I understood ufuncs very, very well
link |
from day one inside of Numeric.
link |
That was one of the things we tried to make better
link |
in NumPy was how do they work?
link |
Can they do broadcasting?
link |
What does broadcasting mean?
link |
But one of the problems is, okay,
link |
what do I do with a Python scalar?
link |
So what happens, the Python scalar gets broadcast
link |
to a zero dimensional array
link |
and then it goes through the whole same machinery
link |
as if it were a 10,000 dimensional array.
link |
And then it kind of unpacks the element
link |
and then does the addition.
link |
That's not to mention the function it calls
link |
in the case of square root
link |
is just the C lib square root, right?
link |
In some cases, like Python's power,
link |
there's some optimizations they're doing
link |
that could be faster
link |
than just calling the C lib square root.
link |
In the interpreter or in the?
link |
No, in the C code, in the Python runtime.
link |
In the Python runtime, so they really optimize it
link |
and they have the freedom to do that
link |
because they don't have to worry about.
link |
It's just a scalar.
link |
It's just a scalar.
link |
Right, they don't have to worry about the fact
link |
that, oh, this could be an object with many pieces.
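The scalar path described above can be observed from the outside, without looking at NumPy internals. This is a minimal sketch in which the input value 2.0 is arbitrary.

```python
import math

import numpy as np

py_result = math.sqrt(2.0)   # works directly on a Python float
np_result = np.sqrt(2.0)     # goes through the general ufunc machinery

# A Python scalar converted to an array has zero dimensions.
as_array = np.asarray(2.0)

print(type(py_result).__name__)   # float
print(type(np_result).__name__)   # float64
print(as_array.ndim)              # 0
```

Both calls give the same value. The difference is in how much generic work happens around the square root itself.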
link |
The ufunc machinery is also generic
link |
in the sense that typecasting and broadcasting,
link |
broadcasting's idea of I'm gonna go,
link |
I have a zero dimensional array,
link |
I have a scalar with a four dimensional array.
link |
Oh, I have to kind of coerce the shape of this guy
link |
to make it work against the whole four dimensional array.
link |
So it's the idea of I can do a one dimensional array
link |
against a two dimensional array and have it make sense.
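Broadcasting as described here, a scalar against a four dimensional array and a one dimensional array against a two dimensional one, can be sketched like this. The shapes and values are arbitrary.

```python
import numpy as np

big = np.ones((2, 3, 4, 5))
shifted = big + 10.0                   # scalar stretched across all four dimensions

row = np.array([1.0, 2.0, 3.0])        # shape (3,)
table = np.zeros((2, 3))               # shape (2, 3)
combined = table + row                 # row is reused for each of the 2 rows

column = np.array([[10.0], [20.0]])    # shape (2, 1)
grid = column + row                    # shapes (2, 1) and (3,) give (2, 3)
```

No copies of `row` or `column` are written out by hand. The shapes are matched from the trailing dimension backwards, and any dimension of length 1 is stretched to fit.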
link |
Well, that's what NumPy does is it challenges you
link |
to reformulate, rethink your problem
link |
as a multi dimensional array problem
link |
versus move away from scalars completely.
link |
Right, exactly, exactly.
link |
In fact, that's where some of the edge cases boundaries are
link |
is that, well, they're still there
link |
and this is where array scalars are particular.
link |
So array scalars are particularly bad
link |
in the sense that they were written
link |
so that you could optimize the math on them,
link |
but that hasn't happened.
link |
And so their default is to coerce the array scalar
link |
to a zero dimensional array
link |
and then use the NumPy machinery.
link |
That's what, and you could specialize,
link |
but it doesn't happen all the time.
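To make "array scalar" concrete: pulling an element out of a NumPy array gives a NumPy scalar type, not a Python float. The sketch below shows the two kinds of loop being compared. The function names are hypothetical, and no timings are claimed.

```python
import numpy as np

data = np.arange(5, dtype=np.float64)

# Iterating over an array yields NumPy array scalars, not Python floats.
first = next(iter(data))
# tolist() converts to native Python floats before any looping happens.
first_native = data.tolist()[0]


def total_with_array_scalars(arr):
    total = 0.0
    for value in arr:             # each value is an np.float64
        total += value
    return total


def total_with_python_scalars(arr):
    total = 0.0
    for value in arr.tolist():    # each value is a plain float
        total += value
    return total
```

The two functions return the same sum. Only the type of object handled on each pass through the loop differs.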
link |
So in fact, when we first wrote Numba,
link |
we'd do comparisons and say, look, it's a 1000X speedup.
link |
We were lying a little bit in the sense that,
link |
well, first do the 40X slowdown
link |
of using the array scalars inside of a loop.
link |
Because if you were to use Python scalars,
link |
you'd already be 10 times faster.
link |
But then we would get a hundred times faster
link |
over that using just compilation.
link |
But what we do is compile the loop
link |
from out of the interpreter to machine code.
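A minimal sketch of compiling a loop with Numba, assuming the `numba` package is installed. The fallback decorator is an assumption added so the snippet still runs without it, and the function name `sum_of_squares` is hypothetical.

```python
import numpy as np

try:
    from numba import njit      # compiles the decorated function to machine code
except ImportError:             # assumption: fall back to plain Python if absent
    def njit(func):
        return func


@njit
def sum_of_squares(values):
    # An explicit loop, written in ordinary Python syntax.
    total = 0.0
    for i in range(values.shape[0]):
        total += values[i] * values[i]
    return total


data = np.arange(1_000, dtype=np.float64)
result = sum_of_squares(data)
```

With Numba present, the first call triggers compilation and later calls reuse the compiled code. Any speedup depends on the loop and on what it is compared against, which is the caveat being made here about the early benchmarks.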
link |
And then that's always been the power of Python
link |
is this extensibility so that you can,
link |
cause people say, oh, Python's so slow.
link |
Well, sure, if you do all your logic
link |
in the runtime of the Python interpreter, yeah.
link |
But the power is that you don't have to.
link |
You write all the logic,
link |
what you do in the high level is just high level logic.
link |
And the actual calls you're making
link |
could be on gigabyte arrays of data.
link |
And that's all done at compiled speeds.
link |
And the fact is that, one, integration can happen,
link |
but, two, it is separable.
link |
That's one of the, the language like Julia says,
link |
we're going to be all in one.
link |
You can do all of it together.
link |
And then there's, the jury's out, is that possible?
link |
I tend to think that you're going to,
link |
there's separate concerns there.
link |
You want to precompile.
link |
In fact, generally you will want to precompile your,
link |
some of your loops.
link |
Like SciPy has a compilation step.
link |
To install SciPy, it takes about two hours.
link |
If you have many machines,
link |
maybe you can get it down to one hour.
link |
But to compile those libraries takes about, takes a while.
link |
You don't want to do that at runtime.
link |
You don't want to do that all the time.
link |
You want to have this precompiled binary available
link |
that you're then just linking into.
link |
So there's real questions about the whole source code.
link |
Code is, running binary code is more than source code.
link |
It's creating object code, it's the linker, it's the loader,
link |
it's the how does that interpret it
link |
inside of virtual memory space.
link |
There's a lot of details there that actually
link |
I didn't understand for a long time
link |
until I read books on the topic.
link |
And it led to, the more you know, the better off you are
link |
and you can do more details,
link |
but sometimes it helps with abstractions too.
link |
Well, the problem, as we mentioned earlier
link |
with abstractions is you kind of sometimes assume
link |
that whoever implemented this thing
link |
had your case in mind and found the optimal solution.
link |
Or like you assume certain things.
link |
I mean, there's a lot of,
link |
One of the really powerful things to me early on,
link |
I mean, it sounds silly to say, but with Python,
link |
probably one of the reasons I fell in love with it was dictionaries.
link |
So obviously probably most languages
link |
have some mapping concept,
link |
but it felt like it was a first class citizen
link |
and it was just my brain was able to think in dictionaries.
link |
But then there's the thing that I guess I still use
link |
to this day is ordered dictionaries
link |
because that seems like a more natural way
link |
to construct dictionaries.
link |
And from a computer science perspective,
link |
the running time cost is not that significant,
link |
but there's a lot of things to understand about dictionaries
link |
that the abstraction kind of
link |
doesn't necessarily incentivize you to understand.
link |
Right, do you really understand the notion of a hash map
link |
and how the dictionary is implemented?
link |
Dictionaries are a good example
link |
of an abstraction that's powerful.
link |
And I agree with you.
link |
I agree, I love dictionaries too.
link |
Took me a while to understand, but once you do,
link |
you realize, oh, they're everywhere.
link |
And Python uses them everywhere too.
link |
Like it's actually constructed,
link |
one of the foundational things is dictionaries
link |
and it does everything with dictionaries.
link |
So it is, it's powerful.
link |
Ordered dictionaries came later,
link |
but it is very, very powerful.
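A short sketch of where dictionaries show up inside Python itself, plus the ordered variant mentioned above. The `Point` class is a hypothetical example.

```python
import types
from collections import OrderedDict


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
instance_namespace = p.__dict__     # instance attributes live in a dict
module_namespace = vars(types)      # a module's namespace is also a dict

# Since Python 3.7 a plain dict preserves insertion order as well.
plain = {"b": 1, "a": 2}

ordered = OrderedDict([("b", 1), ("a", 2)])
ordered.move_to_end("b")            # explicit reordering is an OrderedDict feature
```

Lookups in both kinds of dictionary are hash based, so keeping the order does not change the average cost of reading or writing a key.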
link |
It took me a little while coming
link |
from just the array programming entirely
link |
to understand these other objects,
link |
like dictionaries and lists and tuples and binary trees.
link |
Like I said, I wasn't a computer scientist,
link |
I studied arrays first.
link |
And so I was very array centric.
link |
And you realize, oh, these others
link |
do have purposes and value, actually.
link |
There's a friendliness about,
link |
like one way to think about arrays
link |
is arrays are just like full of numbers,
link |
but to make them accessible to humans
link |
and make them less error prone to human users,
link |
sometimes you want to attach names,
link |
human interpretable names
link |
that are sticky to those arrays.
link |
So that's how you start to think about dictionaries
link |
is you start to convert numbers
link |
into something that's human interpretable.
link |
And that's actually the tension I've had with NumPy
link |
because I've built so much tooling
link |
around human interpretability
link |
and also protecting me from a year later
link |
not making the mistakes by being,
link |
I wanted to force myself to use English versus numbers.
link |
Yes, so there's a project called Labeled Arrays.
link |
Like very early it was recognized that,
link |
oh, we're indexing NumPy with just numbers,
link |
all the columns and particularly the dimensions.
link |
I mean, if you have an image,
link |
you don't necessarily need to label each column or row,
link |
but if you have a lot of images
link |
or you have another dimension,
link |
you'd at least like to label the dimension
link |
as this is X, this is Y, this is Z,
link |
or give it some human meaning
link |
or some domain specific meaning.
link |
That was one of the impetuses for Pandas actually
link |
was just, oh, we do need to label these things.
link |
And Labeled Array was an attempt to add
link |
that like a lighter weight version of that.
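To illustrate the idea of naming dimensions, here is a deliberately tiny wrapper. It is a hypothetical sketch written for this note. It is not the API of the Labeled Array project or of Pandas, and the class and method names are assumptions.

```python
import numpy as np


class LabeledArray:
    """Hypothetical minimal wrapper that names the dimensions of an array."""

    def __init__(self, data, dims):
        self.data = np.asarray(data)
        if self.data.ndim != len(dims):
            raise ValueError("need exactly one name per dimension")
        self.dims = tuple(dims)

    def axis(self, name):
        # Translate a human-readable dimension name into an axis number.
        return self.dims.index(name)

    def mean(self, dim):
        # Reduce over a named dimension and drop its name from the result.
        axis = self.axis(dim)
        remaining = self.dims[:axis] + self.dims[axis + 1:]
        return LabeledArray(self.data.mean(axis=axis), remaining)


images = LabeledArray(np.zeros((10, 64, 48)), dims=("image", "y", "x"))
per_image = images.mean("x").mean("y")
```

Code written this way refers to "x" and "y" and never to axis 2 and axis 1, which is the kind of protection against later mistakes being discussed.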
link |
And there's been, like, that's an example of something
link |
I think NumPy could add, could be added to NumPy,
link |
but one of the challenges again, how do you fund this?
link |
Like I said, one of the tragedies I think is that,
link |
so I never had the chance to,
link |
I was never paid to work on NumPy, right?
link |
So I've always just done it in my spare time,
link |
always taken from one thing,
link |
taken from another thing to do it.
link |
And at the time, I mean, today,
link |
it would be the wrong thing today,
link |
like paying me to work on NumPy now
link |
would not be a good use of effort,
link |
but we are finally at Quansight Labs,
link |
I'm actually paying people to work on NumPy and SciPy,
link |
which is I'm thrilled with, I'm excited by.
link |
I've wanted to do that.
link |
That's what I always wanted to do from day one.
link |
It just took me a while to figure out a mechanism to do that.
link |
Even like in the university setting,
link |
respecting that, like pushing students,
link |
young minds and young graduate students to contribute
link |
and then figuring out financial mechanisms
link |
that enable them to contribute
link |
and then sort of reward them
link |
for their innovative scientific journey,
link |
that would be nice.
link |
But then also just a better allocation of resources.
link |
It's the 20 year anniversary of 9/11
link |
and I was just looking, we spent over $6 trillion
link |
in the Middle East after 9/11 in the various efforts there.
link |
And sort of to put politics and all that aside,
link |
it's just, you think about the education system,
link |
all the other ways we could have
link |
possibly allocated that money.
link |
To me, to take it back,
link |
the amount of impact you would have
link |
by allocating a little bit of money to the programmers
link |
that build the tools that run the world is fascinating.
link |
I don't know, I think, again,
link |
there is some aspect to being broke
link |
as somewhat of a feature, not a bug,
link |
that you make sure that you're valued.
link |
But you can still manage that.
link |
Right, no, I know.
link |
But I don't think that's a big part.
link |
So it's like, I think you can have enough money
link |
and actually be wealthy while maintaining your values.
link |
There's an old adage that nations that trade together
link |
don't go to war together.
link |
I've often thought about nations that code together.
link |
Yeah, code together.
link |
Because one of the things I love about open source
link |
is it's global, it's multinational.
link |
Like there aren't national boundaries.
link |
One of the challenges with business and open source
link |
is the fact that, well, business is national.
link |
Like businesses are entities
link |
that are recognized in legal jurisdictions, right?
link |
And have laws that are respected in those jurisdictions
link |
and hiring, and yet the open source ecosystem
link |
is not, it's not there.
link |
Like currently, one of the problems we're solving
link |
is hiring people all over the world, right?
link |
Because we, it's a global effort.
link |
And I've had the chance to work, and I've loved the chance.
link |
I've never been to like Iran,
link |
but I once had a conference
link |
where I was able to talk to people there, right?
link |
And talk to folks in Pakistan.
link |
I've never been there, but we had a call
link |
where there were people there,
link |
like just scientists and normal people.
link |
And there's a certain amount of humanizing, right?
link |
That gets away from the,
link |
like we often get the memes of society
link |
that bubble up and get discussed,
link |
but the memes are not even an accurate reflection
link |
of the reality of what people are.
link |
Well, if you look at the major power centers
link |
that are leading to something like cyber war
link |
in the next few decades,
link |
it's the United States, it's Russia, and China.
link |
And those three countries in particular
link |
have incredible developers.
link |
So if they work together, I think that's one way,
link |
the politicians can do their stupid bickering,
link |
but like there's a layer of infrastructure, of humanity.
link |
If they collaborate together,
link |
that I think can prevent major military conflict,
link |
which would, I think most likely happen at the cyber level
link |
versus the actual hot war level.
link |
You know, I think that's a good prediction.
link |
Nations that code together don't go to war together.
link |
Don't go to war together.