Travis Oliphant: NumPy, SciPy, Anaconda, Python & Scientific Programming | Lex Fridman Podcast #224
link |
The following is a conversation with Travis Oliphant,
link |
one of the most impactful programmers
link |
and data scientists ever.
link |
He created NumPy, SciPy, and Anaconda.
link |
NumPy formed the foundation
link |
of tensor based machine learning in Python,
link |
SciPy formed the foundation
link |
of scientific programming in Python,
link |
and Anaconda, specifically with Conda,
link |
made Python more accessible to a much larger audience.
link |
Travis's life work across a large number of programming
link |
and entrepreneurial efforts has and will continue
link |
to have immeasurable impact on millions of lives
link |
by empowering scientists and engineers
link |
in big companies, small companies,
link |
and open source communities to take on difficult problems
link |
and solve them with the power of programming.
link |
Plus, he's a truly kind human being,
link |
which is something that when combined with vision
link |
and ambition makes for a great leader
link |
and a great person to chat with.
link |
To support this podcast,
link |
please check out our sponsors in the description.
link |
This is the Lex Fridman Podcast,
link |
and here is my conversation with Travis Oliphant.
link |
What was the first computer program you've ever written?
link |
Whoa, that's a good question.
link |
I think it was in fourth grade.
link |
Just a simple loop in BASIC.
link |
BASIC. BASIC, yeah, on an Atari 800,
link |
Atari 400, I think, or maybe it was an Atari 800.
link |
It was a part of a class,
link |
and we were just writing BASIC loops to print things out.
link |
Did you use GOTO statements?
link |
Yes, yes, we used GOTO statements.
link |
I remember in the early days,
link |
that's when I first realized
link |
there's like principles to programming,
link |
when I was told, don't use GOTO statements.
link |
Those are bad software engineering principles,
link |
like it goes against what great, beautiful code is.
link |
I was like, oh, okay, there's rules to this game.
link |
I didn't see that until high school
link |
when I took an AP computer science course.
link |
I did a lot of other kinds of just programming in TI,
link |
but finally, when I took an AP computer science course
link |
That's, yeah, it was Pascal.
link |
That's when I, oh, there are these principles.
link |
No, I didn't take C until the next year in college.
link |
I had a course in C, but I haven't done much in Pascal,
link |
just that AP computer science course.
link |
Now, sorry for the romanticized question,
link |
but when did you first fall in love with programming?
link |
Oh, man, good question.
link |
I think actually when I was 10,
link |
my dad got us a TI Timex Sinclair,
link |
and he was excited about the spreadsheet capability,
link |
and then, but I made him get the BASIC,
link |
the add-ons, so we could actually program in BASIC,
link |
and just being able to write instructions
link |
and have the computer do something.
link |
Then we got a TI-99/4A when I was about 12,
link |
and I would just, it had sprites and graphics and music.
link |
You could actually program it to do music.
link |
That's when I really sort of fell in love with programming.
link |
So this is a full, like a real computer
link |
with like, with memory and storage,
link |
processors and whatnot,
link |
because you say TI. Yeah, the Timex Sinclair
link |
was one of the very first, it was a cheap, cheap,
link |
like, I think it was, well, it was still expensive,
link |
but it was 2K of memory.
link |
We got the 16K add on pack,
link |
but yeah, it had memory, and you could program it.
link |
You had the, in order to store your programs,
link |
you had to attach a tape drive.
link |
Remember that old, the sound that would play
link |
when the modems would convert digital bits
link |
to audio saved on a tape drive.
link |
Still remember that sound, but that was the storage.
link |
And what was the programming language, do you remember?
link |
It was BASIC. It was BASIC.
link |
And then they had a VisiCalc,
link |
and so a little bit of spreadsheet programming
link |
in VisiCalc, but mostly just some BASIC.
link |
Do you remember what kind of things drew you to programming?
link |
Was it working with data, was it video games?
link |
Games, math, mathy stuff?
link |
Yeah, I've always loved math,
link |
and a lot of people think they don't like math
link |
because I think when they're exposed to it early,
link |
it's about memory.
link |
When you're exposed to math early,
link |
you have a good short term memory,
link |
you can remember your times tables.
link |
And I do have a reasonably, I mean, not perfect,
link |
but a reasonably long little short term memory buffer.
link |
And so I did great at times tables.
link |
I said, oh, I'm good at math.
link |
But I started to really like math,
link |
just the problem solving aspect.
link |
And so computing was problem solving applied.
link |
And so that's always kind of been the draw,
link |
kind of coupled with the mathematics.
link |
Did you ever see the computer as like an extension
link |
of your mind, like something able to achieve?
link |
It's just like a little set of puzzles
link |
that you can play with and you can play with math puzzles.
link |
Yeah, it was too rudimentary early on.
link |
Like it was sort of, yeah, it was a lot of work
link |
to actually take a thought you'd have
link |
and actually get it implemented.
link |
And that's still work, but it's getting easier.
link |
And so yeah, I would say that's definitely
link |
what's attracting me to Python
link |
is that that was more real, right?
link |
I could think in Python.
link |
Speaking of foreign language,
link |
I only speak one other language fluently besides English, Spanish.
link |
And I remember the day when I would dream in Spanish
link |
and you start to think in that language.
link |
And then you actually, I do definitely believe
link |
that language limits or expands your thinking.
link |
There's some languages that actually lead you
link |
to certain thought processes.
link |
Yeah, like, so I speak Russian fluently
link |
and that's certainly a language that leads you
link |
down certain thought processes.
link |
Well, yeah, I mean, there's a history
link |
of the two world wars of millions of people starving
link |
to death or near to death throughout its history
link |
of suffering, of injustice, like this promise sold
link |
to the people and then the carpet
link |
or whatever is swept from under them.
link |
And it's like broken promises.
link |
And all of that pain and melancholy is in the language,
link |
the sad songs, the sad hopeful songs,
link |
the over romanticized, like, I love you, I hate you,
link |
the sort of the swings between all the various spectrums
link |
of emotion, so that's all within the language.
link |
The way it's twisted, there's a strong culture
link |
of rhyming poetry, so like the bards,
link |
like the sync, there's a musicality to the language too.
link |
Did Dostoevsky write in Russian?
link |
Yeah, so like Dostoevsky, Tolstoy, all the,
link |
The ones that I know about, which are translated
link |
and I'm curious how the translations.
link |
So Dostoevsky did not use the musicality
link |
of the language too much.
link |
So it actually translates pretty well
link |
because it's so philosophically dense
link |
that the story does a lot of the work,
link |
but there's a bunch of things that are untranslatable.
link |
Certainly the poetry is not translatable.
link |
I actually have a few conversations coming up offline
link |
and also in this podcast with people
link |
who've translated Dostoevsky.
link |
And that's for people who worked, who work in this field,
link |
know how difficult that is.
link |
Sometimes you can spend months thinking
link |
about a single sentence, right?
link |
In context, like, cause there's just the magic
link |
captured by that sentence and how do you translate
link |
just in the right way?
link |
Because those words can be really powerful.
link |
There's a famous line,
link |
"Beauty will save the world," from Dostoevsky.
link |
You know, there's so many ways to translate that.
link |
And you're right, the language gives you the tools
link |
with which to tell the story,
link |
but it also leads your mind down certain trajectories
link |
and paths to where over time,
link |
as you think in that language,
link |
you become a different human being.
link |
Yeah, that's a fascinating reality, I think.
link |
I know people have explored that,
link |
but it's just rediscovered.
link |
Well, we don't, we live in our own like little pockets.
link |
Like this is the sad thing is I feel like unfortunately,
link |
given time and given getting older,
link |
I'll never know China, the Chinese world,
link |
because I don't truly know the language.
link |
Same with Japanese, I don't truly know Japanese
link |
and Portuguese and Brazil,
link |
that whole South American continent.
link |
Like, yeah, I'll go to Brazil and Argentina,
link |
but will I truly understand the people
link |
if I don't understand the language?
link |
It's sad because I wonder how much,
link |
how many geniuses we're missing
link |
because so much of the scientific world,
link |
so much of the technical world is in English,
link |
and so much of it might be lost
link |
because it's just we don't have the common language.
link |
I completely agree.
link |
I'm very much in that vein of there's a lot of genius
link |
out there that we miss,
link |
and it's sort of fortunate when it bubbles up
link |
into something that we can understand or process,
link |
there's a lot we miss.
link |
So I tend to lean towards really loving democratization
link |
or things that empower people
link |
and I'm very resistant to sort of authoritarian structures.
link |
Fundamentally for that reason,
link |
well, several reasons, but it just hurts us.
link |
So speaking of languages that empower you,
link |
so Python was the first language for me
link |
that I really enjoyed thinking in, as you said.
link |
Sounds like you shared my experience too.
link |
So when did you first,
link |
do you remember when you first kind of connected with Python,
link |
maybe even fell in love with Python?
link |
It's a good question.
link |
It took about a year.
link |
I first encountered Python in 1997.
link |
I was a graduate student studying biomedical engineering
link |
at the Mayo Clinic.
link |
And I had previously,
link |
I'd been involved in taking information from satellites.
link |
I was an electrical engineering student
link |
used to taking information
link |
and trying to get something out of it,
link |
doing some data processing, getting information out of it.
link |
And I'd done that in MATLAB.
link |
I'd done that in Perl.
link |
I'd done that in scripting on a VMS.
link |
There's actually a VAX VMS system,
link |
they had their own little scripting tools around Fortran.
link |
Done a lot of that.
link |
And then as a graduate student,
link |
I was looking for something and encountered Python.
link |
And because Python had an array,
link |
had two things that made me not filter it away.
link |
Because I was filtering a bunch of stuff,
link |
as Yorick, I looked at Yorick,
link |
I looked at a few other languages that are out there
link |
at the time in 1997, but it had arrays.
link |
There's a library called Numeric
link |
that had just been written in 95,
link |
like not very, not too much earlier.
link |
By an MIT alum, Jim Hugunin.
link |
You know, and I went back and read the mailing list
link |
to see the history of how it grew.
link |
And there was a very interesting,
link |
it's fascinating to do that actually,
link |
to see how this emergent cooperation,
link |
unstructured cooperation happens in the open source world
link |
that led to a lot of this collective programming,
link |
which is something maybe we might get into a little later,
link |
but what that looks like.
link |
What gap did Numeric fill?
link |
Numeric filled the gap of having an array object.
link |
There was no array object.
link |
There was no array.
link |
There was a one dimensional byte concept,
link |
but there was no n dimensional,
link |
two, three, four dimensional tensor they call it now.
link |
I'm still in the category that a tensor is another thing
link |
and it's just an ndarray we should call it,
link |
but kind of lost that battle.
link |
There's many battles in this world,
link |
some of which we win, some we lose.
link |
That's exactly right.
link |
So, but it had no math to it.
link |
So Numeric had math and a basic way to think in arrays.
link |
So I was looking for that,
link |
and it had complex numbers,
link |
a lot of programming languages.
link |
And you can see it because,
link |
if you're just a computer scientist,
link |
you think, ah, complex numbers are just two floats.
link |
So you can, people can build that on.
link |
But in practice, a complex number
link |
is one of the significant algebras
link |
that helps connect a lot of physical
link |
and mathematical ideas,
link |
particularly FFT for an electrical engineer.
link |
And it's a really important concept
link |
and not having it means you have to develop it
link |
several times and those times may not share an approach.
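[Editor's note: a minimal sketch of why a built-in complex type matters for Fourier work. It is not code from the conversation. The function name `dft` and the sample signal are illustrative assumptions, and this is a naive discrete Fourier transform rather than a fast FFT.]

```python
import cmath


def dft(samples):
    """Naive O(n^2) discrete Fourier transform using built-in complex numbers."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


# One cycle of a cosine sampled at four points (illustrative data).
signal = [1.0, 0.0, -1.0, 0.0]
spectrum = dft(signal)

# The energy lands in bins 1 and 3, the positive and negative frequency.
magnitudes = [round(abs(c), 6) for c in spectrum]
print(magnitudes)
```

Because the language already agrees on what a complex number is, the whole transform fits in one expression, with no hand-rolled pair-of-floats type.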
link |
One of the common things in programming,
link |
one of the things programming enables is abstractions.
link |
But when you have shared abstractions, it's even better.
link |
It sort of gets to the level of language
link |
of actually we all think of this the same way,
link |
which is both powerful and dangerous, right?
link |
Because powerful in that we now can quickly
link |
make bigger and higher level things
link |
on top of those abstractions dangerous
link |
because it also limits us as to the things
link |
we maybe left behind in producing that abstraction,
link |
which is at the heart of programming today
link |
and actually building around the programming world.
link |
I think it's a fascinating philosophical topic.
link |
Yeah, they will continue for many years, I think.
link |
They'll continue for many years.
link |
As we build more and more and more abstractions.
link |
Yes, I often think about, you know,
link |
we have a world that's built on these abstractions
link |
that were they the only ones possible?
link |
Certainly not, but they led to,
link |
you know, it's very hard to do it differently.
link |
Like there's an inertia that's very hard to,
link |
you know, push out, push away from.
link |
That has implications for things like,
link |
you know, the Julia language,
link |
which you have heard of, I'm sure.
link |
And I've met the creators and I liked Julia.
link |
It's a really cool language,
link |
but they struggled to kind of against the,
link |
just the tide of like this inertia of people using Python.
link |
And, you know, there's strategies to approach that,
link |
but nonetheless, it's a phenomenon.
link |
And sometimes, so I love complex numbers
link |
and I love arrays, so I looked at Python.
link |
And then I had the experience, I did some stuff in Python
link |
and I was just doing my PhD.
link |
So I was out, my focus was on,
link |
I was actually doing a combination of MRI and ultrasound
link |
and looking at a phenomenon called elastography,
link |
which is you push waves into the body
link |
and observe those waves, like you can actually measure them.
link |
And then you do mathematical inversion
link |
to see what the elasticity is.
link |
And so that's the problem I was solving
link |
is how to do that with both ultrasound and MRI.
link |
I needed some tool to do that with.
link |
So I was starting to use Python in 97.
link |
In 98, I went back, looked at what I'd written
link |
and realized I could still understand it,
link |
which is not the experience I'd had
link |
when doing Perl in 95, right?
link |
I'd done the same thing and then I looked back
link |
and I'd forgotten what I was even saying.
link |
Now, you know, I'm not saying, so that may,
link |
hey, this may work, I like this.
link |
This is something I can retain
link |
without becoming an expert per se.
link |
And so that led me to go, I'm gonna push more into this.
link |
And then that 98 was kind of when I started
link |
to fall in love with Python, I would say.
link |
A few peculiar things about Python.
link |
So maybe compare it to Perl,
link |
compare it to some of the other languages.
link |
So there's no braces.
link |
So space is used, indentation, I should say,
link |
is used as part of the language.
link |
So did you, I mean, that's quite a leap.
link |
Were you comfortable with that leap
link |
or were you just very open minded?
link |
It's a good question.
link |
I was open minded, so I was cognizant of the concern.
link |
And it definitely has, it has specific challenges.
link |
You know, cut and pasting.
link |
For example, when you're cut and pasting code,
link |
and if your editors aren't supportive of that,
link |
if you're putting it into a terminal,
link |
and particularly in the past when terminals
link |
didn't necessarily have the intelligence to manage it.
link |
Now, IPython and Jupyter Notebooks
link |
handle that just fine, so there's really no problem.
link |
But in the past, it created some challenges,
link |
formatting challenges, also mixed tabs and spaces.
link |
If editors weren't, you weren't clear
link |
on what was happening, you would have these issues.
link |
So there were really concrete reasons about it
link |
that I heard and understood.
link |
I never really encountered a problem with it personally.
link |
Like, it was occasional annoyances,
link |
but I really liked the fact
link |
that it didn't have all these extra characters, right?
link |
That these extra characters didn't show up
link |
in my visual field when I was just trying
link |
to process understanding a snippet of code.
link |
Yeah, there's a cleanness to it.
link |
But, I mean, the idea is supposed to be
link |
that Perl also has a cleanness to it
link |
because of the minimalism of how many characters
link |
it takes to express a certain thing.
link |
So it's very compact.
link |
But what you realize with that compactness comes,
link |
there's a culture that prizes compactness,
link |
and so the code gets more and more compact
link |
and less and less readable to a point where it's like,
link |
like, to be a good programmer in Perl,
link |
you write code that's basically unreadable.
link |
There's a culture, like.
link |
Correct, and you're proud of it.
link |
Yeah, you're proud of it.
link |
Right, exactly, and it's like, feels good.
link |
And it's really selective.
link |
It means you have to be an expert in Perl to understand it.
link |
Whereas Python allowed you not to have to be an expert.
link |
You didn't have to take all this brain energy.
link |
You could leverage, what I say,
link |
you could leverage your English language center,
link |
which you're using all the time.
link |
I've wondered about other languages,
link |
particularly non Latin based languages.
link |
Latin based languages with the characters are at least similar.
link |
I think people have an easier time,
link |
but I don't know what it's like to be a Japanese
link |
or a Chinese person trying to learn different syntax.
link |
Like, what would computer programming look like in that?
link |
I haven't looked at that at all,
link |
but it certainly doesn't,
link |
you know, leveraging your Chinese language center,
link |
I'm not sure Python or any programming does that.
link |
But that was a big deal.
link |
The fact that it was accessible, I could be a scientist.
link |
What I really liked is many programming languages
link |
really demand a lot of you, and you can get a lot,
link |
you know, you do a lot if you learn it.
link |
But Python enables you to do a lot
link |
without demanding a lot of you.
link |
There's nuance to that statement,
link |
but it certainly was, it's more accessible.
link |
So more people could actually, as a scientist,
link |
as somebody who, or an engineer,
link |
who was trying to solve another problem
link |
besides pure programming,
link |
I could still use this language and get things done
link |
and be happy about it.
link |
And I was also comfortable in C at that time.
link |
And MATLAB, you did a little bit of that.
link |
And MATLAB, I did a lot before that, exactly.
link |
So I was comfortable in,
link |
those three languages were really the tools I used
link |
during my studies and schooling.
link |
But to your point about language helping you think,
link |
one of the big things about MATLAB was it was,
link |
and APL before it, I don't know if you remember APL.
link |
APL is actually the predecessor of array based programming,
link |
which I think is really an underappreciated,
link |
if I talk to people who are just steeped
link |
in computer programming, computer science,
link |
like most of the people that Microsoft has hired
link |
in the past, for example,
link |
Microsoft as a company generally did not understand
link |
array based programming.
link |
Like culturally, they didn't understand it.
link |
So they kept missing the boat,
link |
kept missing the understanding of what this was.
link |
They've gotten better,
link |
but there's still a whole culture of folks
link |
that doesn't, programming, that's systems programming
link |
or web programming or lists and maps.
link |
And what about an n dimensional array?
link |
Oh yeah, that's just an implementation detail.
link |
Well, you can think that,
link |
but then actually if you have that as a construct,
link |
you actually think differently.
link |
APL was the first language to understand that.
link |
And it was in the sixties, right?
link |
The challenge of APL is APL had very dense,
link |
not only glyphs, like new characters, new glyphs,
link |
but they even had a new keyboard
link |
because to produce those glyphs,
link |
this is back in the early days in computing
link |
when the QWERTY keyboard maybe wasn't as established,
link |
like, well, we can have a new keyboard, no big deal.
link |
But it was a big deal and it didn't catch on.
link |
And the language APL, very much like Perl,
link |
as people would pride themselves on how much,
link |
could they write the Game of Life
link |
in 30 characters of APL.
link |
APL has characters that mean summation
link |
and they have adverbs,
link |
they would have adjectives and these things called adverbs,
link |
which are like methods, like reduction,
link |
reduction would be an adverb on an add operator, right?
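[Editor's note: a rough Python analogue of the idea just described, where "reduction" is a modifier applied to an operator. This is a sketch using the standard library, not APL and not anything shown in the conversation.]

```python
from functools import reduce
import operator

values = [1, 2, 3, 4]

# "Reduce" modifies the add operator: fold the whole sequence with +.
total = reduce(operator.add, values)

# The same modifier applied to a different operator gives a product.
product = reduce(operator.mul, values)

print(total, product)
```

The point is that `reduce` is not tied to addition. It takes any binary operator and turns it into an operation over a whole collection.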
link |
So, but doing, using these tools you could construct
link |
and then you start to think at that level,
link |
you think in n dimensions is something I like to say,
link |
and you start to think differently about data at that point.
link |
Now you're, it really helps.
link |
Yeah, I mean, outside of programming,
link |
if you really internalize linear algebra as a course,
link |
I mean, it's philosophically allows you
link |
to think of the world differently.
link |
It's almost like liberating, you don't have to,
link |
you don't have to think about the individual numbers
link |
in the n dimensional array.
link |
You could think of it as an object in itself
link |
and all of a sudden this world can open up.
link |
You're saying MATLAB and APL were like the early C,
link |
I don't know if many languages got that right ever.
link |
No, no, no they didn't.
link |
Even still, I would say.
link |
I mean, NumPy is an inheritor of the traditions
link |
that I would say APL. J was another version that was,
link |
what it did is not have the glyphs,
link |
just have short characters,
link |
but still a Latin keyboard could type them.
link |
And then Numeric inherited from that
link |
in terms of let's add arrays plus broadcasting
link |
plus methods, reduction,
link |
even some of the language like rank is a concept
link |
that was in Python and is still in Python
link |
for the number of dimensions, right?
link |
That's different than say the rank of a matrix
link |
which people think of as well.
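[Editor's note: a small sketch of the two meanings of "rank" mentioned here, using present-day NumPy. The array `a` is an illustrative assumption. Current NumPy exposes the number of dimensions as `ndim`.]

```python
import numpy as np

# A 3x3 array filled with ones (illustrative data).
a = np.ones((3, 3))

# "Rank" in the array-language sense: the number of dimensions.
n_dims = a.ndim

# "Rank" in the linear-algebra sense: the number of independent rows.
lin_rank = np.linalg.matrix_rank(a)

print(n_dims, lin_rank)
```

The array has two dimensions, but every row is identical, so its linear-algebra rank is one. The two notions only share a name.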
link |
So it came from that tradition,
link |
but NumPy is a very pragmatic, practical tool.
link |
NumPy inherited from Numeric
link |
and we can get to where NumPy came from
link |
which is the current array,
link |
at least current as of 2015, 2017.
link |
Now there's a ton of them over the past two or three years.
link |
We can get into that too.
link |
So if we just linger on the early days
link |
of what was your favorite feature of Python?
link |
Do you remember like what?
link |
So it's so interesting to linger on like the,
link |
what really makes you connect with a language?
link |
I'm not sure it's obvious to introspect that.
link |
And I've thought about that at some length.
link |
I think definitely the fact that I could read it later,
link |
that I could use it productively
link |
without becoming an expert.
link |
Other languages I had to put more effort into.
link |
That's like an empirical observation.
link |
Like you're not analyzing any one aspect of the language.
link |
It just seems time after time when you look back,
link |
it's somehow readable.
link |
It's somehow readable.
link |
Then it was sort of, I could take executable English
link |
and translate it to Python more easily.
link |
Like I didn't have to go, there was no translation layer.
link |
As an engineer or as a scientist,
link |
I could think about what I wanted to do.
link |
And then the syntax wasn't that far behind it, right?
link |
Now there are some warts there still.
link |
It wasn't perfect.
link |
Like there's some areas where I'm like,
link |
ah, it'd be better if this were different
link |
or if this were different.
link |
Some of those things got added to the language too.
link |
I was really grateful for some of the early pioneers
link |
in the Python ecosystem back,
link |
because Python got written in 91.
link |
That's when the first version came out.
link |
But Guido was very open to users.
link |
And one of the sets of users were people like Jim Hugunin
link |
and David Ascher and Paul Dubois and Konrad Hinsen.
link |
These were people that were on the mailing list.
link |
And they were just asking for things like,
link |
hey, we really should have complex numbers in this language.
link |
So let's, you know, there's a j, there's a 1j, right?
link |
And the fact that they went the engineering route of J
link |
I don't think that's entirely favoring engineers.
link |
I think it's because i is so often used
link |
as the index of a for loop.
link |
So I think that's actually why.
link |
Probably, I mean, there's a pragmatic aspect.
link |
But the fact that complex numbers were there, I love that.
link |
The fact that I could write in the array constructs
link |
and that reduction was there,
link |
very simple to write summations and broadcasting was there.
link |
I could do addition of whole arrays.
link |
Those are some things I loved about it.
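[Editor's note: a brief sketch of the three features listed above, written against present-day NumPy rather than the original Numeric library. The variable names and values are illustrative assumptions.]

```python
import numpy as np

# Complex numbers are built into the language, with the j suffix.
z = 3 + 4j
magnitude = abs(z)

# Broadcasting: a row of shape (3,) plus a column of shape (2, 1)
# is stretched to a full (2, 3) grid without an explicit loop.
row = np.array([1.0, 2.0, 3.0])
col = np.array([[10.0], [20.0]])
grid = row + col

# Reduction: apply the add operation along the array to get a sum.
total = np.add.reduce(row)

print(magnitude, grid.shape, total)
```

Whole-array addition and summation each take a single expression, which is the "thinking in arrays" style being described.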
link |
I don't know what to start talking to you about
link |
because you've created so many incredible projects
link |
that basically changed the whole landscape of programming.
link |
But okay, let's start with,
link |
let's go chronologically with SciPy.
link |
You created SciPy over two decades ago now?
link |
Yes, yes, I love to talk about SciPy.
link |
SciPy was really my baby.
link |
What was its goal?
link |
So SciPy was effectively, here I am using Python
link |
to do stuff that I previously used MATLAB to do.
link |
And I was using Numeric, which is an array library
link |
that made a lot of it possible.
link |
But there's things that were missing.
link |
Like I didn't have an ordinary differential equation solver
link |
I could just call, right?
link |
I didn't have integration.
link |
Hey, I wanted to integrate this function.
link |
Okay, well, I don't have just a function
link |
I can call to do that.
link |
These are things I remember being critical things
link |
that I was missing.
link |
I just wanna pass a function to an optimizer
link |
and have it tell me what the optimal value is.
link |
Those are things I'm like, well,
link |
why don't we just write a library that adds these tools?
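[Editor's note: a sketch of the three missing pieces named above (integration, an ODE solver, an optimizer) as they look in present-day SciPy. These are today's function names, not the 1998 interfaces, and the test problems are illustrative assumptions.]

```python
import numpy as np
from scipy import integrate, optimize

# Integrate f(x) = x**2 from 0 to 1; the exact answer is 1/3.
area, abs_err = integrate.quad(lambda x: x**2, 0.0, 1.0)

# Solve dy/dt = -y with y(0) = 1 on [0, 1]; the exact y(1) is exp(-1).
sol = integrate.solve_ivp(
    lambda t, y: -y, (0.0, 1.0), [1.0], rtol=1e-8, atol=1e-10
)
y_end = sol.y[0, -1]

# Hand a function to an optimizer; the minimum of (x - 2)**2 is at x = 2.
res = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

print(area, y_end, res.x)
```

In each case the caller passes an ordinary Python function and gets a number back, which is the "function I can just call" experience described here.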
link |
And I started to post on the mailing list
link |
and there'd previously been, people have discussed,
link |
I remember Konrad Hinsen saying,
link |
wouldn't it be great if we had this optimizer library
link |
or David Ascher would say this stuff.
link |
And I'm an ambitious, ambitious is the wrong word,
link |
an eager, and I probably had more time than sense.
link |
I was a poor graduate student.
link |
My wife thinks I'm working on my PhD and I am,
link |
but part of the PhD that I loved
link |
was the fact that it's exploratory.
link |
You're not just taking orders,
link |
fulfilling a list of things to do,
link |
you're trying to figure out what to do.
link |
And so I thought, well, I'm writing tools
link |
for my own use and a PhD,
link |
so I'll just start this project.
link |
And so in 99, 98 was when I first started
link |
to write libraries for Python.
link |
Definitely when I fell in love with Python in 98,
link |
I thought, oh, well, there's just a few things missing.
link |
Like, oh, I need a reader to read DICOM files.
link |
I was in medical imaging and DICOM was a format
link |
that I want to be able to load that into Python.
link |
Okay, how do I write a reader for that?
link |
So I wrote something called, it was an IO package, right?
link |
And that was my very first extension module, which is C.
link |
So I wrote C code to extend Python
link |
so that in Python I could write things more easily.
link |
That combination kind of hooked me.
link |
It was the idea that I could,
link |
here's this powerful tool I can use as a scripting language
link |
and a high level language to think about,
link |
but that I can extend easily, easily in C,
link |
easily for me because I knew enough C.
link |
And then Guido had written a link.
link |
I mean, the only, the hard part of extending Python
link |
was the way memory management works,
link |
and you have to do reference counting.
link |
And so there's a tracking of reference counting
link |
you have to do manually.
link |
And if you don't, you have memory leaks.
link |
And so that's hard.
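[Editor's note: extension authors manage reference counts by hand in C, but the counts themselves can be observed from Python. This is a sketch assuming the standard CPython interpreter. The variable names are illustrative, and the absolute numbers vary between versions, so only the differences are meaningful.]

```python
import sys

data = [1, 2, 3]

# getrefcount reports one extra reference for its own argument,
# so compare counts relative to each other.
before = sys.getrefcount(data)

alias = data  # a second name bound to the same list object
after = sys.getrefcount(data)

del alias  # dropping the name releases that reference again
restored = sys.getrefcount(data)

print(after - before, restored - before)
```

In C the increments and decrements that happen automatically here must be written out explicitly. A missed decrement means the count never reaches zero and the object is never freed, which is the memory leak mentioned above.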
link |
Plus then C, you know, it's just much more,
link |
you have to put more effort into it.
link |
It's not just, I have to now think about pointers
link |
and I have to think about stuff that is different.
link |
I have to kind of,
link |
you're like putting a new cartridge in your brain.
link |
Like, okay, I'm thinking about MRI.
link |
Now I'm thinking about programming.
link |
And there are distinct modules
link |
you end up having to think about.
link |
And when I was just in Python,
link |
I could just think about MRI and high level writing,
link |
but I could do that.
link |
And that kind of, I liked it.
link |
I found that to be enjoyable and fun.
link |
And so I ended up, oh,
link |
well, let me just add a bunch of stuff to Python
link |
to do integration.
link |
Well, and the cool thing is,
link |
is that the power of the internet,
link |
just looking around and I found,
link |
oh, there's this Netlib,
link |
which has hundreds of Fortran routines
link |
that people have written in the 60s and the 70s and the 80s
link |
in Fortran 77, fortunately, it wasn't Fortran 66.
link |
So it had been ported to Fortran 77.
link |
And Fortran 77 is actually a really great language.
link |
Fortran 90 probably is my favorite Fortran
link |
because it's also, it's got complex numbers,
link |
got arrays and it's pretty high level.
link |
Now, the problem with it
link |
is you'd never want to write a program in Fortran 90
link |
but it's totally fine to write a subroutine in, right?
link |
And so, and then Fortran kind of got a little off course
link |
when they tried to compete with C++.
link |
I just want libraries to do something like,
link |
oh, here's an ordinary differential equation.
link |
Here's integration.
link |
Here's Runge-Kutta integration.
link |
I don't have to think about that algorithm.
link |
I mean, you could,
link |
but it's nice to have somebody who's already done one
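[Editor's illustration] For readers who have not met the algorithm being delegated to a library here, this is a minimal sketch of the classical fourth-order Runge-Kutta method in plain Python. It is not the Netlib routine or anything from the packages discussed; the function names `rk4_step`, `integrate`, and `decay` are invented for the example, and the test problem dy/dt = -y is chosen only because its exact solution is known.

```python
import math

def rk4_step(f, t, y, h):
    """Advance y by one step of size h using classical 4th-order Runge-Kutta."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def integrate(f, t0, y0, t_end, steps):
    """Integrate dy/dt = f(t, y) from t0 to t_end with a fixed number of steps."""
    h = (t_end - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y

def decay(t, y):
    """Test problem dy/dt = -y, whose exact solution is y(t) = exp(-t)."""
    return -y

approx = integrate(decay, 0.0, 1.0, 1.0, 100)
exact = math.exp(-1.0)
print(abs(approx - exact))  # small discretization error
```

Production solvers add adaptive step sizes and error control, which is part of why wrapping an existing, well-tested routine is attractive.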
link |
And so I sort of started this journey in 98, really.
link |
If you look back at the mailing list,
link |
there's sort of this productive era of me
link |
writing an extension module
link |
to connect Runge-Kutta integration to Python
link |
and making an ordinary differential equation solver.
link |
And then releasing that as a package.
link |
So we could call ODE pack, I think I called it then.
link |
And then I just made these packages.
link |
Eventually that became multipack
link |
because they're originally modular.
link |
You can install them separately.
link |
But a massive problem in Python
link |
was actually just getting your stuff installed.
link |
At the time, releasing software for me,
link |
like today it's people think, what does that mean?
link |
Well, then it meant some poorly written webpage.
link |
I had some bad webpage up and I put a tarball,
link |
just a GZIP tarball of source code.
link |
That was the release.
link |
But okay, can we just stay on that?
link |
Because the community aspect
link |
of creating the package and sharing that, that's rare.
link |
That, to have, to both have the, at that time,
link |
Yeah, it was pretty early, yeah.
link |
Oh, well, not rare.
link |
Maybe you can correct me on this,
link |
but it seems like in the scientific community,
link |
so many people, you were basically solving the problems
link |
you needed to solve to process the particular application,
link |
the data that you need.
link |
And to also have the mind
link |
that I'm going to make this usable for others, that's.
link |
I would say I was inspired.
link |
I'd been inspired by Linux,
link |
been inspired by Linus and him making his code available.
link |
And I was starting to use Linux at the time.
link |
And I went, this is cool.
link |
So I'd kind of been previously primed that way.
link |
And generally I was into science
link |
because I liked the sharing notion.
link |
I liked the idea of, hey, let's,
link |
if collectively we build knowledge and share it,
link |
we can all be better off.
link |
Okay, so you were energized by that idea.
link |
So I was energized by that idea already, right?
link |
And I can't deny that I was.
link |
I sort of had this very,
link |
I liked that part of science, that part of sharing.
link |
And then all of a sudden, oh, wait, here's something.
link |
And here's something I could do.
link |
And then I slowly over years learned how to share better
link |
so that you could actually engage more people faster.
link |
One of the key things was actually giving people a binary
link |
they could install, right?
link |
So that it wasn't just your source code, good luck.
link |
Compile this and then.
link |
It's compiled, ready to install, just, you know.
link |
So in fact, a lot of the journey from 98,
link |
even through 2012 when I started Anaconda was about that.
link |
Like it's why, you know, it's really the key
link |
as to why a scientist with dreams of doing MRI research
link |
ended up starting a software company
link |
that installs software.
link |
I work with a few folks now that don't program
link |
like on the creative side and the video side,
link |
And because my whole life is running on scripts,
link |
I have to try to get them,
link |
I'm having all the task of teaching them
link |
how to do Python enough to run the scripts.
link |
And so I've been actually facing this,
link |
whether it's Anaconda or some with the task of
link |
how do I minimally explain basically to my mom
link |
how to write a Python script.
link |
And it's an interesting challenge.
link |
I have to, it's a to do item for me to figure out like,
link |
what is the minimal amount of information I have to teach?
link |
What are the tools you use that one, you enjoy it,
link |
two, you're effective at it.
link |
And they're related, those are two related questions.
link |
And then the debugging, like the iterative process
link |
of running the script to figure out what the error is,
link |
maybe even for some people to do the fix yourself.
link |
So do you compile it?
link |
Do you, like how do you distribute that code to them?
link |
And it's interesting because I think
link |
it's exactly what you're talking about.
link |
If you increase the circle of empathy,
link |
the circle of people that are able to use your programs,
link |
you increase its, like, effectiveness and its power.
link |
And so you have to think, can I write scripts?
link |
Can I write programs that can be used by medical engineers,
link |
by all kinds of people that don't know programming
link |
and actually maybe plant a seed,
link |
have them catch the bug of programming
link |
so that they start on a journey.
link |
That's a huge responsibility.
link |
And ultimately it has to do with the Amazon one click buy.
link |
Like how frictionless can you make the early steps?
link |
Frictionless is actually really key.
link |
To grow any community, any friction point,
link |
you're just gonna lose some people, right?
link |
Now sometimes you may wanna intentionally do that.
link |
If you're early enough on, you need a lot of help.
link |
You need people who have the skills.
link |
You might actually, it's helpful.
link |
You don't necessarily want too many users
link |
as opposed to contributors if you're early on.
link |
Anyway, SciPy started in 98,
link |
but it really emerged as this collection of modules
link |
that I was just putting on the net.
link |
People were downloading and I think I got 100 users, right?
link |
By the end of that year.
link |
But the fact that I got 100 users and more than that,
link |
people started to email me with fixes.
link |
And that was actually intoxicating, right?
link |
That was the, here I'm writing papers
link |
and I'm giving conferences and I get people to say hello,
link |
but yeah, good job.
link |
But mostly it was, you're viewed with,
link |
it's competitive, right?
link |
You publish a paper and people are like,
link |
oh, it wasn't my paper.
link |
I was starting to see that sense of academic life
link |
where it was so much,
link |
I thought there was this cooperative effort,
link |
but it sounds like we're here just to one up each other.
link |
And it's not true across the board,
link |
but a lot of that's there.
link |
But here in this world,
link |
I was getting responses from people all over the world.
link |
I remember Pearu Peterson in Estonia, right?
link |
Was one of the first people.
link |
And he sent me back this makefile,
link |
cause the first thing it is, yeah, your build thing stinks
link |
and here's a better makefile.
link |
Now it was a complex makefile.
link |
I don't think I ever understood that makefile, actually,
link |
but it worked and it did a lot more.
link |
And so I said, thanks, this is cool.
link |
And that was my first kind of engagement
link |
with community development.
link |
But the process was, he sent me a patch file.
link |
I had to upload a new tarball.
link |
And I just found, I really love that.
link |
And the style back then was here's a mailing list.
link |
It's very, it wasn't as,
link |
there certainly weren't the tools that are available today.
link |
It was very early on, but I really started to,
link |
that's the whole year.
link |
I think I did about seven packages that year, right?
link |
And then by the end of the year,
link |
I collected them into a thing called multipack.
link |
So in 99, there was this thing called multipack.
link |
And that's when a high school student,
link |
now, he was a high school student at the time,
link |
guy named Robert Kern,
link |
took that package and made a Windows installer, right?
link |
And then of course, a massive increase of usage.
link |
So by the way, most of this development was under Linux.
link |
Yes, yes, it was on Linux.
link |
I was a Linux developer doing it on a Unix box.
link |
I mean, at the time I was actually getting into,
link |
I had a new hard drive,
link |
did some kernel programming to make the hard drive work.
link |
I mean, not programming, but modification to the kernel
link |
so I could actually get a hard drive working.
link |
I love that aspect of it.
link |
I was also in, at school, I was building a cluster.
link |
I took Mac computers and put Yellow Dog Linux on them.
link |
At the Mayo Clinic, they were just,
link |
they had all these Macs that were older,
link |
they were just getting rid of.
link |
And so I kind of got permission to go grab them together.
link |
I put about 24 of them together in a cluster, in a cabinet,
link |
and put Yellow Dog Linux on them all.
link |
And I wrote a C++ program to do MRI simulation.
link |
That was what I was doing at the same time
link |
for my day job, so to speak.
link |
So I was loving the whole process.
link |
And the same time I was,
link |
oh, I need an ordinary differential equation.
link |
That's why ordinary differential equations were key
link |
was because that's the heart of a Bloch equation
link |
for simulating MRI, is an ODE solver.
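[Editor's note] For context, the Bloch equation is a system of ordinary differential equations for the magnetization vector, which is why an ODE solver sits at the core of an MRI simulation. The standard textbook form is sketched below; the symbols are not from the conversation and are assumed here: M is the magnetization, B the applied magnetic field, gamma the gyromagnetic ratio, M_0 the equilibrium magnetization, and T_1, T_2 the relaxation times.

```latex
\frac{d\mathbf{M}}{dt}
  = \gamma\,\mathbf{M}\times\mathbf{B}
  - \frac{M_x\,\hat{\mathbf{x}} + M_y\,\hat{\mathbf{y}}}{T_2}
  - \frac{(M_z - M_0)\,\hat{\mathbf{z}}}{T_1}
```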
link |
And so that's, but I actually did that,
link |
it just happened at the same time.
link |
That's why it was kind of what you're working on
link |
and what you're interested in, they're coinciding.
link |
I was definitely scratching my own itch
link |
in terms of building stuff.
link |
And which helped in the sense that I was using it for me,
link |
so at least I had one user.
link |
I had one person who was like, well, no, this is better.
link |
I like this interface better.
link |
And I had the experience of MATLAB
link |
to guide some of what those APIs might look like.
link |
But you're just doing yourself,
link |
you're building all this stuff.
link |
But with the Windows installer,
link |
it was the first time I realized, oh yeah,
link |
the binary installer really helps people.
link |
And so that led to spending more time
link |
on that side of things.
link |
So around 2000, so I graduated my PhD in 2000,
link |
end of year, end of 2000.
link |
So 99 doing a lot of work there,
link |
98 doing a lot of work there,
link |
99 kind of spending more time on my PhD,
link |
helping people use the tools,
link |
thinking about what do I want to go from here.
link |
There was a company, there was a guy actually,
link |
Eric Jones and Travis Vaught.
link |
They were two friends who founded a company called Enthought.
link |
It's here in Austin, still here.
link |
And they, Eric contacted me at the time
link |
when I was a graduate student still.
link |
And he said, hey, why don't you come down?
link |
We want to build a company.
link |
We're thinking of a scientific company
link |
and we want to take what you're doing
link |
and kind of add it to some stuff that he'd done.
link |
He'd written some tools.
link |
And then Pearu Peterson had done f2py.
link |
Let's come together and build,
link |
pull this all together and call it SciPy.
link |
So that's the origin of the SciPy brand.
link |
It came from multipack
link |
and a whole bunch of modules I'd written,
link |
plus a few things from some other folks
link |
and then pulled together in a single installer.
link |
SciPy was really a distribution of Python
link |
masquerading as a library.
link |
How did you think about SciPy in context of Python,
link |
in context of Numeric, like what?
link |
So we saw SciPy as a way to make an R&D environment
link |
for Python, like use Python, depended on Numeric.
link |
So Numeric was the array library we depended on.
link |
And then from there, extend it with a bunch of modules
link |
that allowed for, and at the time,
link |
the original vision of SciPy was to have plotting,
link |
was to have the REPL environment
link |
and kind of really a whole data environment
link |
that you could then install and get going with.
link |
And that was kind of the thinking.
link |
It didn't really evolve that way, right?
link |
It sort of had a, for one,
link |
it's really hard to do massive scale projects
link |
with open source collectives.
link |
Actually, there's sort of an intrinsic cooperation limit
link |
as to which, too many cooks in the kitchen,
link |
you can do amazing infrastructure work.
link |
When it comes down to bringing it all together
link |
into a single deliverable,
link |
that actually requires a little more product management
link |
that is not, that doesn't really emerge
link |
from the same dynamic.
link |
So it struggled, struggled to get almost too many voices.
link |
It's hard to have everybody agree.
link |
Consensus doesn't really work at that scale.
link |
You end up with politics,
link |
with the same kind of things that's happened
link |
in large organizations trying to decide
link |
what to do together.
link |
So consensus building was challenging at scale
link |
as more people came in, right?
link |
Early on, it's fine, because there's nobody there.
link |
So it works, but then as you get more successful
link |
and more people use it, all of a sudden,
link |
oh, there's this scale at which this doesn't work anymore
link |
and we have to come up with different approaches.
link |
So SciPy came out officially in 2001,
link |
was the first release, most of the time.
link |
I remember the days of getting that release ready.
link |
It was a Windows installer and there were bugs
link |
on how the Windows compiler handled complex numbers
link |
and you were chasing segmentation faults.
link |
And it was, it's a lot of work.
link |
There was a lot of effort had nothing to do
link |
with my area of study.
link |
And at the same time, I had just gotten an offer.
link |
So he wondered if I wanted to come down
link |
and help him start that company with his friend.
link |
And at the time I was like, I was intrigued,
link |
but I was pursuing a path, an academic path.
link |
And I had just got an offer to go and teach at my alma mater.
link |
So I took that tenure track position.
link |
And SciPy, and kind of, then I started to work on SciPy
link |
as a professor too.
link |
So that's, I left the Mayo Clinic,
link |
graduated, wrote my thesis using SciPy,
link |
wrote, you know, there's images that were created.
link |
Now the plotting tool I used was something
link |
from Yorick actually.
link |
It was a plotting, a PLT kind of a plotting language
link |
Yorick is a programming language?
link |
It was a programming language, had a plotting tool,
link |
DISLIN, it had integration to DISLIN.
link |
I ended up using DISLIN plus some of the plotting
link |
from Yorick linked to from Python.
link |
Anyway, it was, people don't plot that way now,
link |
but this is before, and SciPy was trying to add plotting.
link |
It didn't have much success.
link |
Really the success of plotting came from John Hunter,
link |
who had a similar experience to my experience,
link |
my kind of maverick experience as a person
link |
just trying to get stuff done and kind of having more time
link |
than money maybe, right?
link |
And John Hunter created what?
link |
He's the creator of Matplotlib.
link |
Yeah, so John Hunter was, you know,
link |
he wasn't a student at the time, but he was an,
link |
he was working in Quant field and he said,
link |
we need better plotting.
link |
So he just went out and said, cool, I'll make a new project
link |
and we'll call it Matplotlib.
link |
And he released in 2001,
link |
about the same time that SciPy came out
link |
and it was separate library, separate install,
link |
used Numeric, SciPy used Numeric.
link |
And so SciPy, you know, in 2001, we released SciPy
link |
and then Enthought created a conference called SciPy,
link |
which brought people together to talk about the space.
link |
And that conference is still ongoing.
link |
It's one of the favorite conferences of a lot of people
link |
because it's, you know, it's changed over the years,
link |
but early on it was, you know, a collection of 50 people
link |
who care about, scientists mostly, you know,
link |
practicing scientists who want, who care about coding
link |
and doing it well and not using MATLAB.
link |
And I remember being driven by, you know, I liked MATLAB,
link |
but I didn't like the fact that,
link |
so I'm not opposed to proprietary software.
link |
I'm actually not an open source zealot.
link |
I love open source for the, what it brings,
link |
but I also see the role for proprietary software.
link |
But what I didn't like was the fact that I would develop
link |
code and publish it and then effectively telling somebody
link |
here to run my code, you have to have
link |
this proprietary software.
link |
Right, and there's also culture around MATLAB as much,
link |
because I've talked to a few folks in,
link |
MathWorks creates MATLAB?
link |
I mean, there's just a culture, they try really hard,
link |
but it just, there's this corporate IBM style culture
link |
that's like, or whatever.
link |
I don't want to say negative things about IBM or whatever,
link |
No, it's really that connection.
link |
It's something I'm in the middle of right now
link |
is the business of open source.
link |
And how do you connect the ethos of cooperative development
link |
with the necessity of creating profits, right?
link |
And like right now today, I'm still in the middle of that.
link |
That's actually the early days of me exploring this question.
link |
Cause I was writing SciPy, I mean, as an aside,
link |
I also had, so I had three kids at the time.
link |
I have six kids now.
link |
I got married early, wanted a family.
link |
I had three kids and I remember reading,
link |
I read Richard Stallman's post and I was a fan of Stallman.
link |
I would read his work, I liked this collective ideas
link |
Certainly the ideas on IP law, I read a lot of his stuff.
link |
But then he said, okay, well,
link |
how do I make money with this?
link |
How do I make a living?
link |
How do I pay for my kids?
link |
All this stuff was in my mind,
link |
young graduate student making no money,
link |
thinking I got to get a job.
link |
And he said, well, I think just be like me
link |
and don't have kids, right?
link |
That's just, don't, don't.
link |
That's his take on it.
link |
That was what he said in that moment, right?
link |
That's the thing I read and I went,
link |
okay, this is a train I can't get on.
link |
There has to be a way to preserve the culture
link |
of open source and still be able to make sufficient money
link |
to feed your kids.
link |
Yes, exactly, there's gotta be.
link |
Well, so that actually led me to a study of economics.
link |
Because at the time I was ignorant and I really was.
link |
I'm actually, I'm embarrassed for educational system
link |
that they could let me and I was valedictorian
link |
in my high school class and I did super well in college.
link |
And like academically I did great, right?
link |
But the fact that I could do that and then be clueless
link |
about this key part of life,
link |
it led me to go, there's a problem.
link |
Like I should have learned this in fifth grade.
link |
I should have learned this in eighth grade.
link |
Like everybody should come out
link |
with a basic knowledge of economics.
link |
You're an interesting example because you've created tools
link |
that change the lives of probably millions of people
link |
and the fact that you didn't understand at the time
link |
of the creation of those tools, the basic economics
link |
of how like to build up a giant system is the problem.
link |
Yeah, it's a problem.
link |
And so during my PhD at the same time,
link |
this is back in 98, 99 at the same time,
link |
I was in a library, I was reading books on capitalism,
link |
I was reading books on Marxism,
link |
I was reading books on what is this thing?
link |
What does it mean?
link |
And I encountered, basically I encountered a set of writings
link |
from people that said they were the inheritors of Adam Smith.
link |
Read Adam Smith for the first time, right?
link |
Which is the wealth of nations
link |
and kind of this notion of emergent societies
link |
and realized, oh, there's this whole world out here
link |
of people and the challenge of economics is also political.
link |
Like, cause economics, people, different parties
link |
running for office, they want their economic friends.
link |
They want their economists to back them up, right?
link |
Or to be their magicians, like the magicians
link |
in Pharaoh's court, right?
link |
The people that are kind of say, hey, this is,
link |
you should listen to me because I've got the expert
link |
And so it gets really muddled, right?
link |
But I was looking at it from as a scientist going,
link |
what is this space?
link |
What does this mean?
link |
How does Paris get fed?
link |
How does, what is money?
link |
And I found a lot of writings that I really loved.
link |
I found some things that I really loved
link |
and I learned from that.
link |
It was writings from people like von Mises.
link |
He wrote a paper in 1920 that still should be read
link |
It was the economic calculation problem
link |
of the socialist commonwealth.
link |
It was basically in response
link |
to the Bolshevik revolution in 1917.
link |
And his basic argument was it's not gonna work
link |
to not have private property.
link |
You're not gonna be able to come up with prices.
link |
The bureaucrats aren't gonna be able to determine
link |
how to allocate resources without a price system.
link |
And a price system emerges from people making trades.
link |
And they can only make trades if they have authority
link |
over the thing they're trading.
link |
And that creates information flow
link |
that you just don't have if you try to top down it.
link |
And it's like, huh, that's a really good point.
link |
Yeah, the prices have a signal that's used.
link |
And it's important to have that signal
link |
when you're trying to build a community
link |
of productive people like you would
link |
in the software engineering space.
link |
Yeah, the prices are actually
link |
an important signaling mechanism.
link |
Right, and that money is just a bartering tool.
link |
Right, so this is the first time I've encountered
link |
any of this concept, right, and the fact that,
link |
oh, this is actually really critical.
link |
Like it's so critical to our prosperity
link |
and that we're dangerously not learning about this,
link |
not teaching our children about this.
link |
So you had the three kids,
link |
you had to make some hard decisions.
link |
I had to make some money, right, had to figure it out.
link |
But I didn't really care.
link |
I mean, I've never been driven by money, just need it.
link |
Yeah, right, need to eat.
link |
So how did that resolve itself in terms of SciPy?
link |
So I would say it didn't really resolve itself.
link |
It sort of started a journey that I'm continuing on.
link |
I'm still on, I would say.
link |
I don't think it resolved itself.
link |
But I will say I went in eyes wide open.
link |
Like I knew that there were problems
link |
with giving stuff away and creating the market externalities
link |
that the fact that, yeah, people might use it
link |
and I might not get paid for it
link |
and I'll have to figure something else out to get paid.
link |
Like at least I can say I'm not bitter
link |
that a lot of people have used stuff that I've written
link |
and I haven't necessarily benefited economically from it.
link |
I've heard other people be bitter about that
link |
when they write or they talk.
link |
Like, oh, I should've got more value out of this.
link |
And I'm also, I want to create systems
link |
that let people like me who might have these desires
link |
to do things, let them benefit.
link |
So it actually creates more of the same.
link |
Not to turn on your bitterness module,
link |
but there's some aspect, I wish there was mechanisms for me
link |
to reward whoever created SciPy and NumPy
link |
because it brought so much joy to my life.
link |
I appreciate that.
link |
You know what I mean?
link |
The tip jar notion was there.
link |
I appreciate that.
link |
But there should be a very frictionless mechanism.
link |
There should be a frictionless mechanism.
link |
I would love to talk about some of the ideas I have
link |
because I actually came across,
link |
I think I've come up with some interesting notions
link |
that could work, but they'll require anything that will work
link |
takes time to emerge, right?
link |
Like things don't just turn overnight.
link |
That's definitely one thing I've also understood
link |
and learned is any fixes, that's why it's kind of funny.
link |
We often give credit to, oh, this president gets elected
link |
and oh, look how great things have done.
link |
And I saw that when I had a transition at Anaconda
link |
when a new CEO came in, right?
link |
And it's like the success that's happening,
link |
there's an inertia there.
link |
Yeah, and sometimes the decision you made
link |
like 10 years before is the reason why the success is the.
link |
So we're sort of just running around taking credit
link |
The credit assignment has like a delay to it
link |
that makes the credit assignment basically wrong
link |
Wrong more than right, exactly.
link |
And so I'm like, oh, this is, you know,
link |
that's the stuff I would read a ton about, you know,
link |
So I don't, I feel like I'm with you.
link |
Like I want the same thing.
link |
I want to be able to, and honestly, not for personally,
link |
I feel like I don't have any, I mean,
link |
we've done reasonably okay, but I've had to pursue it.
link |
Like that's really what started my trajectory from academia
link |
is reading that stuff led me to say,
link |
oh, entrepreneurship matters.
link |
So I love software, but we need more entrepreneurs
link |
and I wanna understand that better.
link |
So once I kind of had that virus infect my brain,
link |
even though I was on a trajectory
link |
to go to a tenure track position at a university
link |
and I was there for six years,
link |
I was kind of already out the door when I started.
link |
And we can get into that, but.
link |
Well, can I just ask you a quick question on,
link |
is there some design principles
link |
that were in your mind around SciPy?
link |
Like, is there some key ideas
link |
that were just like sticking to you
link |
that this is the fundamental ideas?
link |
Yeah, I would say so.
link |
I would think it's basically accessibility to scientists,
link |
like give them, give scientists and engineers tools
link |
that they don't have to think a lot about programming.
link |
So give them really good building blocks,
link |
give them functions that they wanna call
link |
and sort of just the right length of spelling.
link |
There's one tradition in programming where it's like,
link |
make very, very long names, right?
link |
And you can see it in some programming languages
link |
where the names get, take half the screen.
link |
And in the Fortran world, characters had to be six letters
link |
And that's way too much, too little.
link |
But I was like, I liked to have names
link |
that were informative but short.
link |
So even though Python, well this is a different conversation,
link |
but documentation is doing some work there.
link |
So when you look at great scientific libraries
link |
and functions, there's a richness of documentation
link |
that helps you get into the details.
link |
The first glance at a function gives you the intuition
link |
of all it needs to do by looking at the headers and so on.
link |
But to get the depths of all the complexities involved,
link |
all the options involved,
link |
documentation does some of the work.
link |
Documentation is essential, yeah.
link |
So that was actually a, so we thought about several things.
link |
One is we wanted plotting.
link |
We wanted interactive environment.
link |
We wanted good documentation.
link |
These are things we knew, we wanted.
link |
The reality is those took about 10 years to evolve, right?
link |
Given the fact that we didn't have a big budget,
link |
it was all volunteer labor.
link |
It was sort of, when Enthought got created
link |
and they started to try to find projects,
link |
people would pay for pieces
link |
and they were able to fund some of it.
link |
Not nearly enough to keep up with what was necessary.
link |
And no criticism, just simply the reality.
link |
I mean, it's hard to start a business
link |
and then do consulting and then also
link |
promote an open source project that's still fairly new.
link |
SciPy is fairly niche.
link |
We stayed connected all while I was a student,
link |
sorry, a professor.
link |
I went to BYU and started to teach.
link |
Electrical engineering, all the applied math courses.
link |
I loved teaching signal processing,
link |
probability theory, electromagnetism.
link |
I was, if you look at Rate My Professors,
link |
which my kids loved to do,
link |
I wasn't, I got some bad reviews because people.
link |
What was the criticism?
link |
I would speak too high of a level.
link |
Like I definitely had a calibration problem
link |
coming out of graduate work
link |
where I hate to be condescending to people.
link |
Like I really have a ton of respect for people fundamentally.
link |
Like my fundamental thing is I respect people.
link |
Sometimes that can lead to a,
link |
I was thinking they had more knowledge than they did.
link |
And so I would just speak at a very high level,
link |
assume they got it.
link |
But they need to rise to the standard that you set.
link |
I mean, that's one of the,
link |
some of the greatest teachers do that.
link |
And that was kind of what was inspiring me.
link |
But you also have to,
link |
I cannot say I was articulate
link |
with some of the greatest teachers, right?
link |
I was, like one classic example,
link |
when I first taught at BYU,
link |
my very first class, it was overheads,
link |
transparencies, overheads.
link |
Before projectors were really that common,
link |
I taught transparencies.
link |
I'm writing my notes out.
link |
I go in, room's half dark.
link |
I'm just blaring through these transparencies.
link |
Here it is, here it is, here it is.
link |
And I did give a quiz after two weeks.
link |
No one knew anything.
link |
Nothing I had taught had gotten anywhere.
link |
And I realized, okay, I'm not, this is not working.
link |
So I put away the transparencies
link |
and I turned around and just started using the chalkboard.
link |
And what it did is it slowed me down, right?
link |
The chalkboard just slowed me down
link |
and gave people time to process and to think.
link |
And then that made me focus.
link |
My writing wasn't great on the chalkboard,
link |
but I really love that part of like the teaching.
link |
So that entered SciPy's world in terms of,
link |
we always understood that there's a didactic aspect
link |
of SciPy, kind of how do you take the knowledge
link |
and then produce it?
link |
The challenge we had was the scope.
link |
Like ultimately SciPy was everything, right?
link |
And so 2001, when it first came out,
link |
people were starting to use it.
link |
No, this is cool, this is a tool we actually use.
link |
At the same time, 2001 timeframe,
link |
there was a little bit of like the Hubble Space Telescope,
link |
the folks at Hubble that started to say,
link |
hey, Python, we're gonna use Python
link |
for processing images from Hubble.
link |
And so Perry Greenfield was a good friend
link |
in running that program.
link |
And he had called me before I left for BYU and said,
link |
you know, we wanna do this,
link |
but Numeric actually has some challenges in terms of,
link |
you know, it's not, the array doesn't have enough types.
link |
We need more operations.
link |
You know, broadcasting needs to be a little more settled.
link |
They wanted record arrays.
link |
They wanted, you know, record arrays are like a data frame,
link |
but a little bit different,
link |
but they wanted more structured data.
link |
So he had called me even early on then,
link |
and he said, you know, what,
link |
would you wanna work on something to make this work?
link |
And I said, yeah, I'm interested, but I'm going here,
link |
and I, you know, we'll see if I have time.
link |
So in the meantime, while I was teaching
link |
and SciPy was emerging, and I had a student,
link |
I was constantly, while I was teaching,
link |
trying to figure a way to fund this stuff.
link |
So I had a graduate student, my only graduate student,
link |
a Chinese fellow, Liu Hongze is his name, great guy.
link |
He wrote a bunch of stuff for iterative linear algebra,
link |
like got into writing some of the iterative
link |
linear algebra tools that are currently there in SciPy,
link |
and they've gotten better since,
link |
but this is in 2005, kept working on SciPy,
link |
but Perry has started working on a replacement
link |
to numeric called NumArray.
link |
And in 2004, a package called ND Image,
link |
it was an image processing library
link |
that was written for NumArray,
link |
and it had in it a morphology tool.
link |
I don't know if you know what morphology is.
link |
It's opening, dilation, closing, you know,
link |
there was sort of this, as a medical imaging student,
link |
I knew what it was,
link |
because it was used in segmentation a lot.
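[Editor's sketch: to make the morphology terms concrete, here is a minimal pure-NumPy version of binary erosion, dilation, and opening with a 3x3 square neighborhood. The function names and the test image are made up for illustration; this is not the ND Image code discussed above, and pixels outside the image are assumed to be background.]

```python
import numpy as np

def dilate(img):
    """Binary dilation: a pixel is set if ANY pixel in its 3x3 neighborhood is set."""
    h, w = img.shape
    padded = np.pad(img, 1, constant_values=False)  # assume background outside the image
    out = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def erode(img):
    """Binary erosion: a pixel survives only if ALL pixels in its 3x3 neighborhood are set."""
    h, w = img.shape
    padded = np.pad(img, 1, constant_values=False)
    out = np.ones_like(img)
    for dy in range(3):
        for dx in range(3):
            out &= padded[dy:dy + h, dx:dx + w]
    return out

def opening(img):
    """Opening = erosion followed by dilation; removes specks smaller than the neighborhood."""
    return dilate(erode(img))

# Hypothetical test image: one 3x3 blob plus a single-pixel speck of noise.
image = np.zeros((7, 7), dtype=bool)
image[1:4, 1:4] = True
image[5, 5] = True

opened = opening(image)
print(opened.astype(int))
```

In a segmentation setting, opening like this is one way to drop isolated noise pixels while keeping larger regions intact; closing would be the same two steps in the opposite order.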
link |
And in fact, I'd wanted to do something like that
link |
in Python, in SciPy, but just had never gotten around to it.
link |
So when it came out, but it worked only on NumArray,
link |
and SciPy needed numeric,
link |
and so we effectively had the beginning of this split.
link |
And numeric and NumArray didn't share data,
link |
they were just two, so you could have a gigabyte
link |
of NumArray data, and a gigabyte of Numeric data,
link |
and they wouldn't share it.
link |
And so you had these,
link |
then you had these scientific libraries written on top.
link |
I got really bugged by that.
link |
I got really like, oh man, this is not good,
link |
we're not cooperating now,
link |
we're sort of redoing each other's work,
link |
and we're just this young community.
link |
So that's what led me, even though I knew it was risky,
link |
because my, you know, I was on a tenure track position,
link |
2004 I got reviewed.
link |
They said, hey, things are going okay,
link |
you're doing well, paper's coming out,
link |
but you're kind of spending a lot of time
link |
doing this open source stuff, maybe do a little less of that,
link |
and a little more of the paper writing and grant writing,
link |
which was naive, but it was definitely the thinking.
link |
You're basically creating a thing
link |
which enables science in the 21st century.
link |
Maybe don't emphasize that so much in your three-year tenure review.
link |
It illustrates some of the challenges.
link |
It does, and it's, people mean well.
link |
Like, but we've gotten broken in a bunch of ways.
link |
Certain things, programming,
link |
understanding the role of software engineering,
link |
programming in society is a little bit lacking.
link |
Now, I was in electrical engineering position.
link |
That's even worse there.
link |
Yeah, it was very, they were very focused,
link |
and so, you know, good people, and I had a great time,
link |
I loved my time, I loved my teaching,
link |
I loved all the things I did there.
link |
The problem was, the split was happening
link |
in this community that I loved, right?
link |
I saw people, and I went, oh my gosh,
link |
this is gonna be, this is not great,
link |
and so I happened, you know, fate,
link |
I had a class I had signed up for,
link |
it's a, I was trying to build an MRI system,
link |
so I had a kind of a radio, instead of a radio,
link |
a digital radio class, it was a digital MRI class.
link |
And I had people sign up, two people signed up,
link |
then they dropped, and so I had nobody in this class.
link |
So, and I didn't have any other courses to teach,
link |
and I thought, oh, I've got some time,
link |
and I'll just write, I'll just write a replace,
link |
a merger of Numeric and NumArray.
link |
Like, I'll basically take the numeric code base
link |
add the features NumArray was adding,
link |
and then kind of come up with a single array library
link |
that everybody can use.
link |
So that's where NumPy came from,
link |
was my thinking, hey, I can do this,
link |
and who else is going to?
link |
Because at that point, I'd been around the community
link |
long enough, and I'd written enough C code,
link |
I knew, I knew the structures, and I,
link |
in fact, my first contribution to numeric
link |
had been writing the C API documentation
link |
that went in the first documentation for NumPy,
link |
for numeric, sorry, this is Paul DuBois,
link |
David Ascher, Konrad Hinsen, and myself.
link |
I got credit because I wrote this chapter,
link |
which is all the C API of Numeric, all the C stuff.
link |
So I said, I'm probably the one to do it,
link |
and nobody else is gonna do this.
link |
So it was sort of, out of a sense of duty and passion,
link |
knowing that, eh, I don't think my academic,
link |
I don't think the department here is gonna appreciate this,
link |
but it's the right thing to do.
link |
Can we just linger on that moment?
link |
Because the importance of the way you thought
link |
and the action you took, I feel is understated
link |
and is rare and I would love to see so much more of it
link |
because what happens as the tools become more popular,
link |
there's a split that happens.
link |
And it's a truly heroic and impactful action
link |
to in those early, in that early split,
link |
to step up and it's like great leaders throughout history,
link |
like get, what is it, Braveheart,
link |
like get on a horse and rally the troops
link |
because I think that can have, make a big difference.
link |
We have TensorFlow versus PyTorch
link |
in the machine learning community.
link |
We have the same problem today.
link |
It's actually bigger.
link |
I wonder if it's possible in the early days
link |
to rally the troops.
link |
It is possible, especially in the early days.
link |
The longer it goes, the harder, right?
link |
The more energy in the factions, the harder.
link |
But in the early days, it is possible
link |
and it's extremely helpful
link |
and there's a willingness there,
link |
but the challenge is there's just not a willingness
link |
There's not a willingness to, you know,
link |
like I was literally walking into a field
link |
saying I'm going to do this
link |
and here I am, like, you know,
link |
I have five kids at home now.
link |
Sometimes my wife hears these stories
link |
and she's like, you did what?
link |
I thought we were going to,
link |
I thought you were actually on a path
link |
to make sure we had resources and money, but,
link |
but again, there's a, there's an aspect,
link |
I'm a very hopeful person.
link |
I'm an optimistic person by nature.
link |
I learned that about myself later on.
link |
And part of my, my religious beliefs
link |
actually lead to that.
link |
And it's why I hold them dear
link |
because it's actually how I feel about,
link |
that's what leads me to these attitudes,
link |
sort of this hopefulness and this sense of,
link |
yeah, it may not work out for me financially
link |
or maybe, but that's not the ultimate gain.
link |
Like that's a thing, but it's not,
link |
that's not the scorecard for me.
link |
And so I just wanted to be helpful
link |
and I knew, and partly because these SciPy conferences,
link |
because the maintenance conversations,
link |
I knew there was a lot of need for this, right?
link |
And so I had this, it wasn't like I was alone
link |
in terms of no feedback.
link |
I had these people who knew, but it was crazy.
link |
Like people who at the time said,
link |
yeah, we didn't think you'd be able to do it.
link |
We thought it was crazy.
link |
And also instructive, like practically speaking,
link |
that you had a cool feature
link |
that you were chasing the morphology, like the.
link |
Like it's not just like.
link |
There's an end result.
link |
It's not some visionary thing.
link |
I'm going to unite the community.
link |
You were like. Correct.
link |
You were actually practically,
link |
this is what one person actually could do
link |
and actually build.
link |
Cause that is important.
link |
Cause you can get over your skis.
link |
You can definitely get over your skis.
link |
And I had, in fact, this almost got me over my skis, right?
link |
I would say, well, in retrospect, I hate looking back.
link |
I can tell you all the flaws with NumPy, right?
link |
When I go into it, there's lots of stuff that I'm like,
link |
oh man, that's embarrassing.
link |
I wish I had somebody stop me with a wet fish there.
link |
Like I needed, like what I'd wished I'd had
link |
was somebody with more experience and certainly library
link |
writing and array library.
link |
There's like, I wish I had me.
link |
I could go back in time and go do this, do that.
link |
There's a more important thing.
link |
Cause there's things we did that are still there
link |
that are problematic, that created challenges for later.
link |
And I didn't know it at the time.
link |
Didn't understand how important that was.
link |
And in many cases, didn't know what to do.
link |
Like there was pieces of the design of NumPy.
link |
I didn't know what to do until five years ago.
link |
Now I know what they should have been then.
link |
But I didn't know at the time and nobody,
link |
and I couldn't get the help.
link |
Anyway, so I wrote it.
link |
It took about, it took four months to write
link |
the first version, then about 14 months to make it usable.
link |
But it was, it wasn't, it was that first four months
link |
of intense writing, coding, getting something out the door
link |
that worked that was, it was, it was definitely challenging.
link |
And then the big thing I did was create a new type object.
link |
That was probably the contribution.
link |
And then the fact that I added broad, not just broadcasting,
link |
but advanced indexing so that you could do masked indexing
link |
and indirect indexing instead of just slicing.
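[Editor's sketch: a small illustration of the three indexing styles mentioned here, using current NumPy. The array contents and variable names are invented for the example.]

```python
import numpy as np

a = np.arange(10) * 10           # 0, 10, 20, ..., 90

# Basic slicing: a contiguous range, returned as a view on the same memory.
sliced = a[2:5]

# Masked (boolean) indexing: keep the elements where the mask is True.
mask = a > 50
masked = a[mask]

# Indirect (integer-array) indexing: pick arbitrary positions in any order.
picked = a[[7, 0, 3]]

print(sliced.tolist())   # [20, 30, 40]
print(masked.tolist())   # [60, 70, 80, 90]
print(picked.tolist())   # [70, 0, 30]

# Slices share memory with the original; advanced indexing makes a copy.
print(np.shares_memory(sliced, a), np.shares_memory(masked, a))
```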
link |
So for people who don't know, and maybe you can elaborate,
link |
NumPy, I guess the vision in the narrowest sense
link |
is to have this object that represents
link |
n dimensional arrays.
link |
And like at any level of abstraction you want,
link |
but basically it could be a black box
link |
that you can investigate in ways that you would naturally
link |
want to investigate such objects.
link |
So you could do math on it easily.
link |
Math on it easily, yeah.
link |
So it had an associated library of math operations
link |
and effectively SciPy became an even larger set
link |
of math operations.
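[Editor's sketch: what "doing math on it easily" looks like with an n-dimensional array in current NumPy. The values are arbitrary; the point is that arithmetic applies elementwise and that broadcasting stretches a smaller array across a larger one.]

```python
import numpy as np

# A 2 x 3 array of floats: [[0, 1, 2], [3, 4, 5]].
grid = np.arange(6, dtype=np.float64).reshape(2, 3)

# Reductions work along a chosen axis; here, the mean of each column.
col_means = grid.mean(axis=0)        # shape (3,)

# Broadcasting: the shape-(3,) row is applied to every row of the (2, 3) array.
centered = grid - col_means

# Math functions apply elementwise with no explicit Python loop.
roots = np.sqrt(grid)

print(col_means.tolist())   # [1.5, 2.5, 3.5]
print(centered.tolist())    # [[-1.5, -1.5, -1.5], [1.5, 1.5, 1.5]]
print(roots.shape)          # (2, 3)
```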
link |
So the key for me was I was going to write NumPy
link |
and then move SciPy to depend on NumPy.
link |
In fact, early on, one of the initial proposals
link |
was that we would just write SciPy
link |
and it would have the numeric object inside of it.
link |
And it'd be SciPy.array or something.
link |
That turned out to be problematic because numeric
link |
already had a little mini library of linear algebra
link |
and some functions, and it had enough momentum,
link |
enough users that nobody wanted to,
link |
they wanted backward compatibility.
link |
One of the big challenges of NumPy
link |
was I had to be backward compatible
link |
with both numeric and NumArray
link |
in order to allow both of those communities to come together.
link |
There was a ton of work in creating
link |
that backward compatibility
link |
that also created echoes in today's object.
link |
Like some of the complexity in today's object
link |
is actually from that goal of backward compatibility
link |
to these other communities,
link |
which if you didn't have that, you'd do something different,
link |
which is instructive because a lot of things are there.
link |
You think, what is that there for?
link |
It's like, well, it's a remnant.
link |
It's an artifact of its historical existence.
link |
By the way, I love the empathy
link |
and the lack of ego behind that
link |
because I feel, you see that in the split
link |
in the JavaScript framework, for example,
link |
the arbitrary branching.
link |
I think in order to unite people,
link |
you have to kind of put your ego aside
link |
and truly listen to others.
link |
What do you love about NumArray?
link |
What do you love about Numeric?
link |
Like actually get a sense,
link |
we were talking about languages earlier,
link |
sort of empathize to the culture,
link |
the people that love something about this particular API,
link |
some of the naming style
link |
or the actual usage patterns
link |
and truly understand them
link |
and so that you can create that same draw
link |
in the united thing. I completely agree.
link |
I completely agree.
link |
And you have to also have enough passion
link |
that you'll do it.
link |
It can't be just like a perfunctory,
link |
oh yes, I'll listen to you
link |
and then I'm not really that excited about it.
link |
So it really is an aspect,
link |
it's a philosophical, like there's a philia,
link |
there's a love of esteeming of others.
link |
It's actually at the heart of what,
link |
it's sort of a life philosophy for me, right?
link |
That I'm constantly pursuing and that helped,
link |
absolutely helped.
link |
Makes me wonder in a philosophical,
link |
like looking at human civilization as one object,
link |
it makes me wonder how we can copy and paste Travis's
link |
Well, some aspects, maybe.
link |
Some aspects, right, right, exactly.
link |
Well, it's a good question.
link |
How do we teach this?
link |
How do we encourage it?
link |
How do we lift it?
link |
Because so much of the software world,
link |
it's giant communities, right?
link |
But it seems like so much is moved by,
link |
like little individuals.
link |
You talk about like Linus Torvalds.
link |
It's like, could you have not,
link |
could you have had Linux without him?
link |
Yeah, Guido and Python.
link |
Well, the SciPy community particularly,
link |
it's like I said, we wanted to build this big thing,
link |
but ultimately we didn't.
link |
What happened is we had Mavericks and champions
link |
like John Hunter who created Matplotlib.
link |
We had Fernando Perez who created IPython.
link |
And so we sort of inspired each other,
link |
but then it kind of, there's sort of a culture
link |
of this selfless giving, the stewardship mentality,
link |
as opposed to ownership mentality,
link |
but stewardship and community focused,
link |
community focused, but intentional work.
link |
Like not waiting for everybody else to do the work,
link |
but you're doing it for the benefit of others
link |
and not worried about what you're gonna get.
link |
You're not worried about the credit.
link |
You're not worried about what you're gonna get.
link |
You're worried about, I later realized
link |
that I have to worry a little about credit,
link |
not because I want the credit,
link |
because I want people to understand
link |
what led to the results.
link |
Like, I don't, it's not about me.
link |
It's I want to understand this is what led to the result.
link |
So let's like, I think doing,
link |
and this is what had no impact on the result.
link |
Like let's promote, just like you said,
link |
I want to promote the attributes
link |
that help make us better off.
link |
How do we make more of Wes McKinney?
link |
Like Wes McKinney was critical to the success of Python
link |
because of his creation of pandas,
link |
which is the roots of that were all the way back
link |
in Numeric and NumArray and NumPy,
link |
where numpy created an array of records.
link |
Wes started to use that almost like a data frame,
link |
except it's an array of records.
link |
And data frame, the challenge is,
link |
okay, if you want to augment it, add another column,
link |
you have to insert, you have to do all this memory movement
link |
to insert a column.
link |
Whereas data frames became,
link |
oh, I'm going to have a loose collection of arrays.
link |
So it's a record of arrays that is a part of a data frame.
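[Editor's sketch: the difference between an array of records and a record of arrays, using a NumPy structured array and a plain dict of columns. The field names and values are invented, and the dict only stands in for the idea behind a data frame; it is not how pandas is implemented internally.]

```python
import numpy as np

# Array of records: one contiguous block where each element holds every field.
records = np.array([(1, 2.5), (2, 3.5)],
                   dtype=[("id", "i4"), ("price", "f8")])

# Adding a column changes the record layout, so a new, wider array must be
# allocated and every existing field copied across.
wider = np.empty(records.shape, dtype=records.dtype.descr + [("qty", "i4")])
for name in records.dtype.names:
    wider[name] = records[name]
wider["qty"] = [10, 20]

# Record of arrays: each column is its own array, so adding a column is just
# attaching one more array; the existing columns are not moved.
columns = {"id": records["id"].copy(), "price": records["price"].copy()}
columns["qty"] = np.array([10, 20], dtype="i4")

print(records.dtype.itemsize, wider.dtype.itemsize)  # bytes per record before and after
print(wider.dtype.names)
print(sorted(columns))
```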
link |
And we thought about that back in the NumArray days,
link |
but Wes ended up doing the work to build it.
link |
And then also the operations that were relevant
link |
for data processing.
link |
What I noticed is just that each of these little things
link |
creates just another tick, another up.
link |
So numpy ultimately took a little while,
link |
about six months in, people started to join me,
link |
Francesc Alted, Robert Kern, Charles Harris.
link |
And these people are many of the unsung heroes, I would say.
link |
People who are, you know,
link |
they sometimes don't get the credit they deserve
link |
because they were critical both to support,
link |
like, you know, it's hard and you want,
link |
you need some support, people need support.
link |
And I needed just encouragement.
link |
And they were helping and encouraged by contributing.
link |
And once, the big thing for me was when John Hunter,
link |
he had previously done kind of a simple thing
link |
called numerix to kind of, you know, between Numeric
link |
and NumArray, he had a little high level tool
link |
that would just select each one for matplotlib.
link |
In 2006, he finally said,
link |
we're gonna just make numpy the dependency of matplotlib.
link |
As soon as he did that,
link |
and I remember specifically when he did that,
link |
I said, okay, we've done it.
link |
Like, that was when I knew we had success.
link |
Before then it was still unsure,
link |
but that kind of started a roller coaster.
link |
And then 2006 to 2009.
link |
And then I've been floored by what it's done.
link |
Like, I knew it would help.
link |
I had no idea how much it would help.
link |
And it has to do with, again, the language thing.
link |
It just, people started to think in terms of numpy.
link |
And that opened up a whole new way of thinking.
link |
And part of the story that you kind of mentioned,
link |
but maybe you can elaborate,
link |
is it seems like at some point in the story,
link |
Python took over science and data science.
link |
And bigger than that,
link |
the scientific community started to think like programmers
link |
or started to utilize the tools of computers to do,
link |
like at a scale that wasn't done with Fortran.
link |
Like at this gigantic scale,
link |
they started to open in their heart.
link |
And then Python was the thing.
link |
I mean, there's a few other competitors, I guess,
link |
but Python, I think, really, really took over.
link |
There's a lot of stories here
link |
that are kind of during this journey,
link |
because this is sort of the start of this journey in 2005, 2006.
link |
So my tenure committee, I applied for tenure in 2006, 2007.
link |
It came back, I split the department.
link |
I was very polarizing.
link |
I had some huge fans
link |
and then some people that said no way, right?
link |
So it was very, I was a polarizing figure in the department.
link |
It went all the way up to the university president.
link |
Ultimately, my department chair had the sway
link |
and they didn't say no.
link |
They said, come back in two years and do it again.
link |
And I went, eh, at that point, I was like,
link |
I mean, I had this interest in entrepreneurship,
link |
this interest in not the academic circles,
link |
not the, like, how do we make industry work?
link |
So I do have to give credit to that exploration of economics
link |
because that led me, oh, I had a lot of opinions.
link |
I was actually very libertarian at the time.
link |
And I still have some libertarian trends,
link |
but I'm more of a, I'm more of a collectivist libertarian.
link |
So you value broadly, philosophically freedom.
link |
I value broadly, philosophically freedom,
link |
but I also understand the power of communities,
link |
like the power of collective behavior.
link |
And so what's that balance, right?
link |
So by the time I was just,
link |
I gotta go out and explore this entrepreneur world.
link |
So I left academia.
link |
I said, no thanks, called my friend, Eric, here,
link |
who had, his company was going.
link |
I said, hey, could I join you and start this trend?
link |
And he, at that time they were using SciPy a lot.
link |
They were trying to get clients.
link |
And so I came down to Texas.
link |
And in Texas is where I sort of,
link |
it's my entrepreneur world, right?
link |
I left academia and went to entrepreneur world in 2007.
link |
So I moved here in 2007, kind of took a leap,
link |
knew nothing really about business,
link |
knew nothing about a lot of stuff there.
link |
There's, you know, for a long time,
link |
I've kept some connections to a lot of academics
link |
because I still value it.
link |
I still love the scientific tradition.
link |
I still value the essence and the soul and the heart
link |
of what is possible.
link |
Don't like a lot of the administration
link |
and the kind of, we can go into detail about why
link |
and where and how this happens,
link |
what are some of the challenges.
link |
I don't know, but I'm with you.
link |
So I'm still affiliated with MIT.
link |
I still love MIT because there's magic there.
link |
There's people I talk to, like researchers, faculty,
link |
in those conversations and the whiteboard
link |
and just the conversation, that's magic there.
link |
All the other stuff, the administration,
link |
all that kind of stuff seems to,
link |
you don't wanna say too harshly criticize
link |
sort of bureaucracies, but there's a lag
link |
that seems to get in the way of the magic.
link |
And I still have a lot of hope
link |
that that can change because I don't often see
link |
that particular type of magic elsewhere in the industry.
link |
So like we need that and we need that flame going.
link |
And it's the same thing as exactly as you said,
link |
it has the same kind of elements
link |
like the open source community does.
link |
And, but then if you, like the reason I stepped away,
link |
the reason I'm here, just like you did in Austin is like,
link |
if I wanna build one robot, I'll stay at MIT.
link |
But if I wanna build millions and make money enough
link |
to where I can explore the magic of that, then you can't.
link |
And I think that dance is...
link |
That translational dance has been lost a bit, right?
link |
And there's a lot of reasons for that.
link |
I'm not, I'm certainly not an expert on this stuff.
link |
I can opine like anybody else,
link |
but I realized that I wanted to explore entrepreneurship,
link |
which I, and really figure out,
link |
and it's been a driving passion for 20 years, 25 years.
link |
How do we connect capital markets and company?
link |
Cause again, I fell in love with the notion of,
link |
oh, profit seeking on its own is not a bad thing.
link |
It's actually a coordination mechanism
link |
for allocating resources that, you know,
link |
in an emergent way, right?
link |
That respects everybody's opinions, right?
link |
So this is actually powerful.
link |
So I say all the time, when I make a company
link |
and we do something that makes profit,
link |
what we're saying is, hey,
link |
we're collecting of the world's resources
link |
and voluntarily people are asking us
link |
to do something that they like.
link |
And that's a huge deal.
link |
And so I really liked that energy.
link |
So that's what I came to do and to learn
link |
and to try to figure out.
link |
And that's what I've been kind of stumbling through
link |
since for the past 14 years.
link |
And so you were still working on NumPy.
link |
So NumPy was just emerging.
link |
One of the things I've done,
link |
it's worth mentioning because it emphasizes
link |
the exploratory nature of my thinking at the time.
link |
I said, well, I don't know how to fund this thing.
link |
I've got a graduate student I'm paying for
link |
and I've got no funding for him.
link |
And I had done some fundraising from the public
link |
to try to get public fundraisers in my lab.
link |
I didn't really wanna go out
link |
and just do the fundraising circuit
link |
the way it's traditionally done.
link |
So I wrote a book and I said, I'm gonna write a book
link |
and I'm gonna charge for it.
link |
It was called Guide to NumPy.
link |
And so ultimately NumPy became
link |
documentation driven development
link |
because I basically wrote the book
link |
and made sure the stuff worked or the book wouldn't work.
link |
So it really helped actually make NumPy become a thing.
link |
So writing that book,
link |
and it's not a page turner.
link |
Guide to NumPy is not a book you pick up
link |
and go, oh, this is great, over the fire.
link |
But it's where you could find the details,
link |
like how'd all this work.
link |
And a lot of people love that book.
link |
And so a lot of people ended up,
link |
so I said, look, I need to, so I'm gonna charge for it.
link |
And I got some flack for that.
link |
Not that much, just probably five angry messages,
link |
people yelling at me saying I was a bad guy
link |
for charging for this book.
link |
Was one of them Richard Stallman?
link |
No, I haven't really had any interaction with him personally,
link |
like I said, but there were a few,
link |
but actually surprisingly not.
link |
There was actually a lot of people like,
link |
no, it's fine, you can charge for a book.
link |
That's no big deal.
link |
We know that's a way you can try to make money
link |
around open source.
link |
So what I did, I did it in an interesting way.
link |
I said, well, kind of my ideas around IP law and stuff.
link |
I love the idea you can share something, you can spread it.
link |
Like once it's, the fact that you have a thing
link |
and copying is free, but the creation is not free.
link |
So how do you fund the creation and allow the copying?
link |
And in software, it's a little more complicated than that
link |
because creation is actually a continuous thing.
link |
It's not like you build a widget and it's done.
link |
It's sort of a process of emerging
link |
and continuing to create.
link |
But I wrote the book
link |
and had this market determined price thing.
link |
I said, look, I need, I think I said 250,000.
link |
If I make 250,000 from this book, I'll make it free.
link |
So as soon as I get that much money,
link |
or I said five years, so there's a time limit.
link |
Like it's not forever.
link |
That's really cool.
link |
I released it on this.
link |
And it's actually interesting
link |
because one of the people
link |
who also thought that was interesting
link |
ended up being Chris White,
link |
who was the director of DARPA project
link |
that we got funding through at Anaconda.
link |
And the reason he even called us back
link |
is because he remembered my name from this book
link |
and he thought that was interesting.
link |
And so even though we hadn't gone to the demo days,
link |
we applied and the people said, yeah,
link |
nobody ever gets this without coming to the demo day first.
link |
This is the first time I've seen it.
link |
But it's because I knew, you know,
link |
Chris had done this and had this interaction.
link |
So it did have impact.
link |
I was actually really, really pleased by the result.
link |
I mean, I ended up in three years, I made 90,000.
link |
So sold 30,000 copies by myself.
link |
I just put it up on, you know, use PayPal and sold it.
link |
And that was my first taste of kind of, okay,
link |
this can work to some degree.
link |
And I, you know, all over the world, right?
link |
From Germany to Japan to, it was actually, it did work.
link |
And so I appreciated the fact that PayPal existed
link |
and I had a way to get the money, the distribution was simple.
link |
This is pre Amazon book stuff.
link |
So it was just publishing a website.
link |
It was the popularity of SciPy emerging
link |
and getting company usage.
link |
I ended up not letting it go the five years
link |
and not trying to make the full amount
link |
because, you know, a year and a half later,
link |
I was at Enthought.
link |
I had left academia, was at Enthought,
link |
and I kind of had a full time job.
link |
And then actually what happened is the documentation people,
link |
there's a group that said, hey,
link |
we want to do documentation for SciPy as a collective.
link |
And they're essentially needing the stuff in the book, right?
link |
And so they kind of ask,
link |
hey, could we just use the stuff in your book?
link |
And at that point I said, yeah, I'll just open it up.
link |
So that's, but it has served its purpose.
link |
And the money that I made actually funded my grad student.
link |
Like it was actually, you know,
link |
I paid him 25,000 a year out of that money.
link |
So the funny thing is if you do a very similar
link |
kind of experiment now with NumPy or something like it,
link |
you could probably make a lot more.
link |
It's probably true.
link |
Because of the tooling and the community building.
link |
Like the, and social media,
link |
that there's just a virality to that kind of idea.
link |
There'd be things to do.
link |
I've thought about that.
link |
And really I thought about a couple of books
link |
or a couple of things that could be done there.
link |
And I just haven't, right?
link |
Even, I tried to hire a ghostwriter this year too
link |
to see if that could help, but it didn't.
link |
But part of my problem is this,
link |
I've been so excited by a number of things
link |
that have stemmed from that.
link |
Like, so I came here, worked at Enthought for four years,
link |
graciously, Eric made me president.
link |
Then we started to work closely together.
link |
We actually helped him buy out his partner.
link |
It didn't end great.
link |
Like unfortunately Eric and I aren't real,
link |
aren't friends now.
link |
I still respect him.
link |
I have a lot, I wish we were,
link |
but he didn't like the fact that Peter and I
link |
started Anaconda, right?
link |
That was not, I mean, so there's two sides to that story.
link |
So I'm not gonna go into it, right?
link |
But you, as human beings
link |
and you wish you still could be friends.
link |
I mean, that's a story of great minds
link |
building great companies.
link |
Somehow it's sad that when there's that kind of.
link |
And I hold him in esteem.
link |
I'm grateful for him.
link |
I think Enthought still exists.
link |
They're doing great work helping scientists.
link |
They still run the SciPy conference.
link |
They have an R&D platform they're selling now
link |
that's a tool that you can go get today, right?
link |
So Enthought has played a role in the SciPy
link |
in supporting the community around SciPy, I would say.
link |
They ended up not being able to,
link |
they ended up building a tool suite
link |
to write GUI applications.
link |
Like that's where they could actually make
link |
that the business could work.
link |
And so supporting SciPy and NumPy itself
link |
wasn't as possible.
link |
Like they didn't, they tried.
link |
I mean, it was not just because,
link |
it was just because of the business aspect.
link |
So, and I wanted to build a company that could do,
link |
that could get venture funding, right?
link |
I mean, that's a longer story.
link |
We could talk a lot about that, but.
link |
And that's where Anaconda came to be.
link |
That's where Anaconda came to be.
link |
So let me ask you, it's a little bit for fun
link |
because you built this amazing thing.
link |
And so let's talk about like an old warrior
link |
looking over old battles.
link |
You've, you know, there's a sad letter in 2012
link |
that you wrote to the NumPy mailing list
link |
announcing that you're leaving NumPy.
link |
And some of the things you've listed
link |
as some of the things you regret
link |
or not regret necessarily, but some things to think about.
link |
If you could go back and you could fix stuff about NumPy
link |
or both sort of in a personal level,
link |
but also like looking forward,
link |
what kind of things would you like to see changed?
link |
So I think there's technical questions
link |
and social questions right there.
link |
First of all, you know, I wrote NumPy as a service
link |
and I spent a lot of time doing it.
link |
And then other people came to help make it happen.
link |
NumPy succeeded because of the work of a lot of people, right?
link |
So it's important to understand that.
link |
I'm grateful for the opportunity,
link |
the role I had, I could play
link |
and grateful that things I did had an impact,
link |
but they only had the impact they had
link |
because of the other people that came to the story.
link |
And so they were essential,
link |
but the way data types were handled,
link |
the way data types, we had array scalars, for example,
link |
that are really just a substitute for a type concept, right?
link |
So we had array scalars, which are actual Python objects,
link |
so that there's for every, for a 32 bit float
link |
or a 16 bit float or a 16 bit integer,
link |
Python doesn't have a natural,
link |
it's just one integer, there's one float.
link |
Well, what about these lower precision types,
link |
these larger precision types?
link |
So we had them in NumPy
link |
so that you could have a collection of them,
link |
but then have an object in Python that was one of them.
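As an editorial illustration of the array scalars described here, the following minimal sketch shows what comes back when a single element is pulled out of a fixed-precision array. The variable names are invented for the example, and the `float64` remark reflects NumPy's documented behavior rather than anything stated in the conversation.

```python
import numpy as np

# An array holds many values of one fixed-precision type.
samples = np.array([1.5, 2.5, 3.5], dtype=np.float32)

# Indexing a single element returns a NumPy "array scalar",
# a Python object that remembers it is a 32-bit float.
one_sample = samples[0]

scalar_type = type(one_sample)                   # numpy.float32
is_array_scalar = isinstance(one_sample, np.generic)
is_python_float = isinstance(one_sample, float)  # float32 is not a Python float

# float64 is the exception: it matches Python's own float precision
# and its array scalar subclasses the built-in float.
double_is_python_float = isinstance(np.float64(1.5), float)
```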
link |
And there's questions about like in retrospect,
link |
I wouldn't have created those
link |
if I'd improved the type system.
link |
And like made the type system actually a Python type system
link |
as opposed to currently,
link |
it's a Python one level type system.
link |
I don't know if you know the difference
link |
between Python one, Python two,
link |
it's kind of technical, kind of depth,
link |
but Python two, one of its big things that Guido did,
link |
it was really brilliant.
link |
It was the actually Python one,
link |
all classes, new objects were one.
link |
If you as a user wrote a class,
link |
it was an instance of a single Python type
link |
called the class type, right?
link |
In Python two, he used a meta typing hook
link |
to actually go, oh, we can extend this
link |
and have users write classes that are new types.
link |
So he was able to have your user classes be actual types
link |
and the Python type system got a lot more rich.
link |
I barely understood that at the time that NumPy was written.
link |
And so I essentially in NumPy created a type system
link |
that was Python one era.
link |
It was, every dtype is an instance of the same type
link |
as opposed to having new dtypes be really just Python types
link |
with additional metadata.
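A small editorial sketch of the contrast being drawn: NumPy dtype descriptors all answer to the one `np.dtype` type, whereas an ordinary user-written class is itself a new Python type. The `Quaternion` class is a hypothetical placeholder. Only `isinstance` is checked, because newer NumPy releases have been moving toward per-dtype classes and the exact result of `type(...)` on a dtype varies by version.

```python
import numpy as np

# Two different dtypes, described by two descriptor objects.
f32 = np.dtype("float32")
i16 = np.dtype("int16")

# Both descriptors are instances of the single np.dtype type.
share_one_type = isinstance(f32, np.dtype) and isinstance(i16, np.dtype)

# By contrast, a user-written class is itself a new Python type.
class Quaternion:  # hypothetical example class, not a NumPy feature
    def __init__(self, w, x, y, z):
        self.w, self.x, self.y, self.z = w, x, y, z

class_is_a_type = isinstance(Quaternion, type)
q = Quaternion(1.0, 0.0, 0.0, 0.0)
instance_type_is_class = type(q) is Quaternion
```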
link |
What's the cost of that?
link |
Is it efficiency, is it usability?
link |
It's usability primarily.
link |
The cost isn't really efficiency.
link |
It's the fact that it's clumsy to create new types.
link |
And then one of the challenges,
link |
you wanna create new types.
link |
You want a quaternion type or you wanna add a new posit type
link |
or you wanna, so it's hard.
link |
And now, if we had done that well,
link |
when Numba came on the scene
link |
where we could actually compile Python code,
link |
it would integrate with that type system much cleaner.
link |
And now all of a sudden you could do gradual typing
link |
You could actually have Python when you add Numba
link |
plus better typing, could actually be a,
link |
you'd smooth out a lot of rough edges.
link |
But there's already, there's like,
link |
but are you talking about from the perspective
link |
of developers within NumPy or users of NumPy?
link |
Developers of new, not really users of NumPy so much.
link |
It's the development of NumPy.
link |
So you're thinking about like how to design NumPy
link |
so that it's contributors.
link |
Yeah, the contributors, it's easier.
link |
It's less work to make it better and to keep it maintained.
link |
And where that's impacted things, for example,
link |
Like all of a sudden GPUs start getting added
link |
and we don't have them in NumPy.
link |
Like NumPy should just work on GPUs.
link |
The fact that we'd have to download a whole other object
link |
called CuPy to have arrays on GPUs
link |
is just an artifact of history.
link |
Like there's no fundamental reason for it.
link |
Well, that's really interesting.
link |
If we could sort of go on that tangent briefly
link |
is you have PyTorch and other libraries like TensorFlow
link |
that basically tried to mimic NumPy.
link |
Like you've created a sort of platonic form
link |
of multi dimension. Basically, yeah.
link |
Well, and the problem was I didn't realize that.
link |
Platonic form has a lot of edges.
link |
They're like, well, we should cut those out
link |
before we present it.
link |
So I wonder if you can comment,
link |
is there like a difference between their implementations?
link |
Do you wish that they were all using NumPy
link |
or like in this abstraction of GPU?
link |
And sorry to interrupt that there's GPUs, ASICs.
link |
There might be other neuromorphic computing.
link |
There might be other kind of,
link |
or the aliens will come with a new kind of computer.
link |
Like an abstraction that NumPy should just operate nicely
link |
over the things that are more and more
link |
and smarter and smarter with this multi dimensional arrays.
link |
There's several comments there.
link |
We are working on something now called data-apis.org.
link |
data-apis.org, you can go there today.
link |
And it's our answer.
link |
It's me and Ralf and Athan and Aaron
link |
and a lot of companies are helping us at Quansight Labs.
link |
It's not unifying all the arrays.
link |
It's creating an API that is unified.
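To make the idea of a unified API concrete, here is a rough editorial sketch, not taken from the standard itself: a function written only against a namespace argument, so the same code could in principle run on any array library exposing the same function names. The function name `standardize` and the parameter `xp` are assumptions made for illustration, and only NumPy is exercised here.

```python
import numpy as np

def standardize(x, xp):
    """Rescale x to zero mean and unit standard deviation.

    `xp` is whatever array namespace the caller supplies; the body
    uses only functions looked up on that namespace.
    """
    return (x - xp.mean(x)) / xp.std(x)

data = np.asarray([1.0, 2.0, 3.0, 4.0])
result = standardize(data, xp=np)
```

The design point is that the function body never names a specific library; swapping the namespace is the caller's decision.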
link |
So we do care about this
link |
and we're trying to work through it.
link |
I actually had the chance to go and meet
link |
with the TensorFlow team and the PyTorch team
link |
and talk to them after exiting Anaconda.
link |
Just talking about,
link |
because the first year after leaving Anaconda in 2018,
link |
I became deeply aware of this and realized that,
link |
oh, this split in the array community that exists today
link |
makes what I was concerned about in 2005 pretty parochial.
link |
It's a lot worse, right?
link |
Now there's a lot more people.
link |
So perhaps the industry can sustain more stacks, right?
link |
There's a lot of money,
link |
but it makes it a lot less efficient.
link |
I mean, but I've also learned to appreciate,
link |
it's okay to have some competition.
link |
It's okay to have different implementations,
link |
but it's better if you can at least refactor some parts.
link |
I mean, you're gonna be more efficient
link |
if you can refactor parts.
link |
It's nice to have competition over things,
link |
over what is nice to have competition.
link |
They're innovative.
link |
And then maybe on the infrastructure,
link |
whatever, however you define infrastructure,
link |
that maybe it's nice to have come together.
link |
And I think, but it was interesting to hear the stories.
link |
I mean, TensorFlow came out of a C++ library,
link |
Jeff Dean wrote, I think,
link |
that was basically how they were doing inference, right?
link |
And then they realized, oh,
link |
we could do this TensorFlow thing.
link |
That C++ library, then what was interesting to me
link |
was the fact that both Google and Facebook did not,
link |
it's not like they supported Python or NumPy initially.
link |
They just realized they had to.
link |
They came to this world and then all the users were like,
link |
hey, where's the NumPy interface?
link |
Oh, and then they kind of came late to it
link |
and then they had these bolt ons.
link |
TensorFlow's bolt on, I don't mean to offend,
link |
but it was so bad.
link |
It's the first time that I'm usually,
link |
I mean, one of the challenges I have
link |
is I don't criticize enough in the sense
link |
that I don't give people input enough, you know, if.
link |
I think it's universally agreed upon
link |
that the bolt ons on TensorFlow were.
link |
But I went to, it was a talk given at Mallorca in Spain
link |
and a great guy came and gave a talk and I said,
link |
you should never show that API again
link |
at a PyData conference.
link |
Like that was, that's terrible.
link |
Like you're taking this beautiful system we've created
link |
and like you're corrupting all these poor Python people,
link |
forcing them to write code like that
link |
or thinking they should.
link |
Fortunately, you know, they adopted Keras as their,
link |
and Keras is better.
link |
And so Keras, TensorFlow is fine, is reasonable,
link |
but they bolted it on.
link |
Like Facebook had their own C++ library for doing inference
link |
and they also had the same reaction, they had to do this.
link |
One big difference is Facebook,
link |
maybe because of the way it's situated in part of FAIR,
link |
part of the research library,
link |
TensorFlow is definitely used and, you know,
link |
they have to make, they couldn't just open it up
link |
and let the community, you know, change what that is.
link |
Cause I guess they were worried
link |
about disrupting their operations.
link |
Facebook's been much more open to having community input
link |
on the structure itself.
link |
Whereas Google and TensorFlow,
link |
they're really eager to have community users,
link |
people use it and build the infrastructure,
link |
but it's much more walled.
link |
Like it's harder to become a contributor to TensorFlow.
link |
And it's also, this is very difficult question to answer
link |
and don't mean to be throwing shade at anybody,
link |
but you have to wonder, it's the Microsoft question
link |
of when you have a tool like PyTorch or TensorFlow,
link |
how much are you tending to the hackers
link |
and how much are you tending to the big corporate clients?
link |
So like the ones that,
link |
do you tend to the millions of people
link |
that are giving you almost no money,
link |
or do you tend to the few
link |
that are giving you a ton of money?
link |
I tend to stand with the people.
link |
Cause I feel like if you nurture the hackers,
link |
you will make the right decisions in the long term
link |
that will make the companies happy.
link |
I lean that way too.
link |
But then you have to find the right dance.
link |
But it's a balance.
link |
Cause you can lean to the hackers and run out of money.
link |
Which has been some of the challenge I've faced
link |
in the sense that,
link |
like I would look at some of the experiments,
link |
like NumPy, the fact that we have this split
link |
is a factor of I wasn't able to collect more money
link |
towards NumPy development.
link |
I mean, I didn't succeed in the early days
link |
of getting enough financial contribution to NumPy
link |
so that they could work on it.
link |
I couldn't work on it full time.
link |
I had to just catch an hour here, an hour there.
link |
And I basically didn't like that.
link |
Like I've wanted to be able to do something about that
link |
for a long time and try to figure out how,
link |
well, there's lots of ways.
link |
I mean, possibly one could say,
link |
we had an offer from Microsoft
link |
at early days of Anaconda.
link |
2014, they offered to come buy us, right?
link |
The problem was the right people at Microsoft
link |
didn't offer to buy us.
link |
And they were still,
link |
they were, it was really a,
link |
we were like a second,
link |
they had really bought, they just bought R,
link |
the R company called,
link |
it was not RStudio,
link |
but it was another R company that was emergent.
link |
And it was kind of a,
link |
well, we should also get a Python play,
link |
but they were really doubling down on R.
link |
And so it was like,
link |
it was where you would go to die.
link |
So it's not, it wasn't,
link |
it was before Satya was there.
link |
Satya had just started.
link |
And the offer was coming from someone
link |
two levels down from him.
link |
And if it had come from Scott Guthrie,
link |
so I got a chance to meet Scott Guthrie,
link |
great guy, I like him.
link |
If an offer had come from him,
link |
probably would be at Microsoft right now.
link |
That'd be fascinating.
link |
That would be really nice actually,
link |
especially given what Microsoft has since done
link |
for the open source community and all those things.
link |
Yes, I think they're doing well.
link |
I really like some of the stuff they've been doing.
link |
They're still working,
link |
and they've, you know,
link |
they've hired Guido now,
link |
and they've hired a lot of Python developers.
link |
Wait, Guido's now at Microsoft?
link |
Yeah, he works at Microsoft.
link |
Which, he retired,
link |
then he came out of retirement,
link |
and he's working now.
link |
I was just talking to him,
link |
and he didn't mention this person.
link |
I should investigate this further.
link |
Because I know he loved Dropbox,
link |
but I wasn't sure what he was doing,
link |
Well, he was kind of saying he'd retire,
link |
but, and it's literally been five years
link |
since I last sat down and really talked to Guido.
link |
Guido's a technology expert, right?
link |
He's a, so I came,
link |
I was excited because I'd finally figured out
link |
the type system for NumPy.
link |
I wanted to kind of talk about that with him,
link |
and I kind of overwhelmed him.
link |
Could you stay in that,
link |
just for a brief moment,
link |
because you're a fascinating person
link |
in the history of programming.
link |
He is a fascinating person.
link |
What have you learned from Guido
link |
about programming, about life?
link |
I've been a fan of Guido's.
link |
You know, we have a chance to talk.
link |
Some, I wouldn't say, you know,
link |
we talk all the time.
link |
He may, but we talk enough to,
link |
in fact, when I first started NumPy,
link |
one of the first things I did was I had a,
link |
I asked Guido for a meeting
link |
with him and Paul Dubois in San Mateo.
link |
And I went and met him for lunch.
link |
And basically, to say,
link |
maybe we can actually,
link |
part of the strategy for NumPy
link |
was to get it into Python 3,
link |
and maybe be part of Python.
link |
And so we talked about that.
link |
That's a cool conversation.
link |
And about that approach, right?
link |
I would have loved to be a fly on the wall.
link |
And over the years for Guido,
link |
Like, he was willing to listen to people's ideas.
link |
And over the years,
link |
now generally, you know,
link |
I'm not saying universally that's been true,
link |
but generally that's been true.
link |
So he's willing to listen.
link |
He's willing to defer.
link |
Like on the scientific side,
link |
he would just kind of defer.
link |
He didn't really always understand
link |
what we were doing.
link |
One place where he didn't enough
link |
was we missed a matrix multiply operator.
link |
Like that finally got added to Python,
link |
but about 10 years later than it should have.
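For readers unfamiliar with the operator in question: Python 3.5 added `@` for matrix multiplication. A minimal sketch with NumPy arrays, contrasting it with the elementwise `*` (the array values are arbitrary):

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[0, 1],
              [1, 0]])

# `@` is the matrix multiply operator added in Python 3.5.
matrix_product = a @ b

# `*` on arrays is elementwise, which is why a separate operator was wanted.
elementwise_product = a * b
```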
link |
But the reason was because nobody,
link |
it takes a lot of effort.
link |
And I learned this while I was writing NumPy.
link |
I also wrote tools to Python.
link |
I engaged with python-dev,
link |
and I added some pieces to Python.
link |
Like the memoryview object.
link |
I wanted the structure of NumPy into Python.
link |
So we didn't get NumPy into Python,
link |
but we got the basic structure of it into Python.
link |
Like, so you could build on it.
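A brief editorial sketch of the "basic structure" that did land in core Python: the built-in `memoryview` can describe a flat block of bytes as a shaped, strided, multidimensional view without copying, with no NumPy installed. The buffer contents are arbitrary illustration data.

```python
# Twelve raw bytes in a flat, mutable buffer.
raw = bytearray(range(12))

# Reinterpret the same memory as a 3x4 grid of unsigned bytes, with no copy.
grid = memoryview(raw).cast("B", shape=[3, 4])

grid_shape = grid.shape      # the view carries its dimensions
grid_strides = grid.strides  # and the byte step along each axis
element = grid[1, 2]         # row 1, column 2 maps to flat index 1*4 + 2

# Writes to the underlying buffer are visible through the view.
raw[6] = 99
element_after_write = grid[1, 2]
```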
link |
Nobody did for a while,
link |
but eventually database authors started to.
link |
And it's a lot better.
link |
And also Antoine Pitrou and Stefan Krah
link |
actually fixed the memoryview object.
link |
Cause I wrote the underlying infrastructure in C,
link |
but the Python exposure was terrible
link |
until they came in and fixed it.
link |
Partly because I was writing NumPy,
link |
and NumPy was the Python exposure.
link |
I didn't really care about
link |
if you didn't have NumPy installed.
link |
Anyway, Guido, open to ideas,
link |
technologically brilliant.
link |
Like really, I really got a lot of respect for him
link |
when I saw what he did
link |
with this type class merger thing.
link |
It was actually tricky, right?
link |
And then willing to share, willing to share his ideas.
link |
So the other thing early on in 1998,
link |
I said, I wrote my first extension module.
link |
The reason I could is because he'd written this blog post
link |
on how to do reference counting, right?
link |
And without it, I would have been lost, right?
link |
But he was willing to at least try to write this post.
link |
And so he's been motivated early on with Python.
link |
There's a computer science for everybody.
link |
You kind of have this early on desire to,
link |
oh, maybe we should be pushing programming to more people.
link |
So he had this populist notion, I guess,
link |
or populist sense to learn that there's a certain skill,
link |
and I've seen it in other people too,
link |
of engaging with contributors sufficiently to,
link |
because when somebody engaged with you
link |
and wants to contribute to you,
link |
if you ignore them, they go away.
link |
So building that early contributor base
link |
requires real engagement with other people.
link |
And he would do that.
link |
Can you also comment on this tragic stepping down
link |
from his position as the benevolent dictator for life
link |
over the wars, you know?
link |
The Walrus operator?
link |
The Walrus operator was the last battle.
link |
I don't know if that's the cause of it,
link |
but there's this, for people who don't know,
link |
you can look up, there's the Walrus operator,
link |
which looks like a colon and equal sign.
link |
Yeah, colon, equal sign.
link |
And it actually does maybe the thing
link |
that an equal sign should be doing.
link |
Yeah, maybe, right, exactly.
link |
But it's just historically,
link |
equal sign means something else.
link |
It just means assignment.
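For listeners who have not seen it, the walrus operator `:=` (available from Python 3.8) assigns a value as part of a larger expression, whereas plain `=` is a statement on its own. A minimal sketch with made-up variable names:

```python
readings = [3, 8, 1, 9]

# `:=` assigns and yields the value inside the condition itself.
if (count := len(readings)) > 3:
    message = f"{count} readings"
else:
    message = "too few"

# The equivalent with plain `=` needs a separate statement first.
count_plain = len(readings)
```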
link |
So he stepped down over this.
link |
What do you think about the pressure of leadership?
link |
It's something that, you mentioned the letter I wrote
link |
in NumPy at the time.
link |
That was a hard time, actually.
link |
I mean, there's been really hard times.
link |
You get criticized, right?
link |
And you get pushed, and you get,
link |
not everybody loves what you do.
link |
Like anytime you do anything that has impact at all,
link |
you're not universally loved, right?
link |
You get some real critics.
link |
And that's an important energy,
link |
because it's impossible for you to do everything right.
link |
You need people to be pushing.
link |
But sometimes people can get mean, right?
link |
People can, I prefer to give people the benefit of the doubt.
link |
I don't immediately assume they have bad intentions.
link |
And maybe for other, maybe that doesn't happen for everybody.
link |
For whatever reason, their past,
link |
their experiences with people, they sometimes have bad,
link |
so they immediately attribute to you bad intentions.
link |
So you're like, where did this come from?
link |
I mean, I'm definitely open to criticism,
link |
but I think you're misinterpreting the whole point.
link |
Because I would get that, certainly when I started Anaconda.
link |
Sometimes I say to people,
link |
I care enough about entrepreneurship
link |
to make some open source people uncomfortable.
link |
And I care enough about open source
link |
to make investors uncomfortable.
link |
So I sort of, you create kind of doubters on both sides.
link |
So when you have, and this is just a plea
link |
to the listener and the public, I've noticed this too,
link |
that there's a tendency, and social media makes this worse,
link |
when you don't have perfect information about the situation,
link |
you tend to fill the gaps with the worst possible,
link |
or at least a bad story that fills those gaps.
link |
And I think it's good to live life,
link |
maybe not fully naively, but filling in the gaps
link |
with the good, with the best, with the positive,
link |
with the hopeful explanation of why you see this.
link |
So if you see somebody like you trying to make money
link |
on a book about NumPy,
link |
there's a million stories around that that are positive.
link |
And those are good to think about,
link |
to project positive intent on the people.
link |
Because for many reasons, usually because people are good
link |
and they do have good intent.
link |
And also when you project that positive intent,
link |
people will step up to that too.
link |
It's a great point.
link |
It has this kind of viral nature to it.
link |
And of course with Twitter, early on figured out,
link |
and Facebook is that they can make a lot of money
link |
and engagement from the negative.
link |
So there's this, we're fighting this mechanism.
link |
Which is challenging.
link |
It's just easier to be.
link |
And then for some reason, something in our minds
link |
really enjoys sharing that and getting all excited
link |
about the negativity.
link |
Some protective mechanism perhaps that we're gonna get eaten
link |
if we don't, yeah.
link |
For us to be effective as a group of people
link |
in a software engineering project,
link |
you have to project positive intent, I think.
link |
And I think that's very,
link |
and so that happens in this space.
link |
But Python has done a reasonable job in the past,
link |
but here is a situation where I think it started
link |
to get this pressure where it didn't.
link |
I really didn't, I didn't know enough about what happened.
link |
I've talked to several people about it.
link |
And I know most of the steering committee members today,
link |
one person nominated me for that role,
link |
but it's the wrong role for me right now, right?
link |
I have a lot of respect for the Python developer space
link |
and the Python developers.
link |
I also understand the gap between computer science
link |
Python developers and array programming developers
link |
or science developers.
link |
And in fact, Python succeeds in the array space
link |
the more it has people in that boundary.
link |
And there's often very few.
link |
Like I was playing a role in that boundary
link |
and working like everything to try to keep up
link |
with even what Guido was saying, like I'm a C programmer,
link |
but not a computer scientist.
link |
Like I was an engineer and physicist and mathematician,
link |
and I didn't always understand
link |
what they were talking about
link |
and why they would have opinions the way they did.
link |
So, you know, you have to listen and try to understand.
link |
Then you also have to explain your point of view
link |
in a way they can understand.
link |
And that takes a lot of work.
link |
And that communication is always the challenge.
link |
And it's just what we're describing here
link |
about the negativity is just another form of that.
link |
Like how do we come together?
link |
And it does appear we're wired anyway
link |
to at least have a, there's a part of us
link |
that will enemy, you know, friend, enemy.
link |
And we see, yeah, it's like,
link |
why are we wiring on the enemy front?
link |
So why are we pushing that?
link |
Why are we promoting that so deeply?
link |
Assume friend until proven otherwise.
link |
So, cause you have such a fascinating mind in all of this.
link |
Let me just ask you these questions.
link |
So one interesting side on the Python history
link |
is the move from Python two to Python three.
link |
You mentioned move from Python one to Python two,
link |
but the move from Python two to Python three
link |
is a little bit interesting
link |
because it took a very long time.
link |
It broke, you know, quite a small way
link |
backward compatibility, but even that small way
link |
seemed to have been very painful for people.
link |
Is there lessons you draw?
link |
Oh man, tons of lessons.
link |
From how long it took and how painful it seemed to be?
link |
Yeah, tons of lessons.
link |
Well, I mentioned here earlier
link |
that NumPy was written in 2005.
link |
It was in 2005 that I actually went to Guido
link |
to talk about getting NumPy into Python three.
link |
Like my strategy was to,
link |
oh, we were moving to Python three.
link |
Let's have that be, and it seems funny in retrospect
link |
because like, wait, Python three,
link |
that was in 2020, right?
link |
When we finally ended the support for Python two
link |
The reason it took a long time,
link |
a lot of time, I think it was because one of the things is
link |
there wasn't much to like about Python three.
link |
3.0, 3.1, it really wasn't until 3.3.
link |
Like I consider Python 3.3 to be Python 3.0.
link |
But it wasn't until Python 3.3
link |
that I felt there's enough stuff in it
link |
to make it worth anybody using it, right?
link |
And then 3.4 started to be, oh yeah, I want that.
link |
And then 3.5 has the matrix multiply operator,
link |
and now it's like, okay, we gotta use that.
link |
Plus the libraries that started leveraging
link |
some of the features of Python three.
link |
So it really, the challenge was it was,
link |
but it also illustrated a truism that, you know,
link |
when you have inertia,
link |
when you have a group of people using something,
link |
it's really hard to move them away from it.
link |
You can't just change the world on them.
link |
And Python three, you know, made some,
link |
I think it fixed some things Guido had always hated.
link |
I think he didn't like the fact
link |
that print was a statement.
link |
He wanted to make it a function.
link |
But in some sense, that's a bit of gratuitous change
link |
And you could argue, and people have,
link |
but one of the challenges was there wasn't enough features
link |
and too many just changes without features.
link |
And so the empathy for the end user
link |
as to why they would switch wasn't there.
link |
I think also it illustrated just the funding realities.
link |
Like Python wasn't funded.
link |
Like it was also a project
link |
with a bunch of volunteer labor, right?
link |
It had more people, so more volunteer labor,
link |
but it was still, it was funded in the sense
link |
that at least Guido had a job.
link |
And I've learned some of the behind the scenes on that now
link |
since talking to people who have lived through it
link |
and maybe not on air, we can talk about some of that.
link |
But it's interesting to see, but Guido had a job,
link |
but his full time job wasn't just work on Python.
link |
Like he had other things to do.
link |
It is wild, isn't it?
link |
It's wild how few people are funded.
link |
And how much impact they have.
link |
Maybe that's a feature not a bug, I don't know.
link |
Maybe, yes, exactly.
link |
At least early on, like it's sort of, I know, yeah.
link |
It's like Olympic athletes are often severely underfunded,
link |
but maybe that's what brings out the greatness.
link |
Perhaps, yes, correct.
link |
Maybe this is the essential part of it.
link |
Because I do think about that in terms of,
link |
I currently have an incubator for open source startups.
link |
Like what I'm trying to do right now
link |
is create the environment I wished had existed
link |
when I was leaving academia with NumPy
link |
and trying to figure out what to do.
link |
I'm trying to create those opportunities and environments.
link |
So, and that's what drives me still,
link |
is how do I make the world easier
link |
for the open source entrepreneur?
link |
So let me stay, I mean, I could probably stay on NumPy
link |
for a long time, but this is fun question.
link |
So Andrej Karpathy leads the Tesla Autopilot team,
link |
and he's also one of the most like legit programmers I know.
link |
It's like he builds stuff from scratch a lot,
link |
and that's how he builds intuition about how a problem works.
link |
He just builds it from scratch, and I always love that.
link |
And the primary language he uses is Python
link |
for the intuition building.
link |
But he posted something on Twitter saying
link |
that they got a significant improvement
link |
on some aspect of their like data loading, I think,
link |
by switching away from np.sqrt,
link |
so the NumPy's implementation of square root,
link |
to math.sqrt, and then somebody else commented
link |
that you can get even a much greater improvement
link |
by using the vanilla Python square root, which is like.
link |
And it's fascinating to me, I just wanted to.
link |
So that was some shade throwing at some.
link |
No, no, and yes, we're talking about.
link |
It's a good way to ask the trade off
link |
between usability and efficiency broadly in NumPy,
link |
but also on these specific weird quirks
link |
of like a single function.
link |
Yep, so on that point, if you use a NumPy math function
link |
on a scalar, it's gonna be slower
link |
than using a Python function on that scalar.
link |
But because the math object in NumPy is more complicated,
link |
because you can also call that math object on an array.
link |
And so effectively, it goes through a similar machine.
link |
There aren't enough of the, which you would do
link |
and you could do like checks and fast paths.
link |
So yeah, if you're basically doing a list,
link |
if you run over a list, in fact,
link |
for problems that are less than 1,000,
link |
even maybe 10,000 is probably the,
link |
if you're going more than 10,000,
link |
that's where you definitely need to be using arrays.
link |
But if you're less than that, and for reading,
link |
if you're doing a reading process
link |
and essentially it's not compute bound, it's IO bound.
link |
And so you're really taking lists of 1,000 at a time
link |
and doing work on it.
link |
Yeah, you could be faster just using Python,
link |
straight up Python.
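The trade-off described here can be measured directly. The sketch below is an editorial illustration, not the benchmark from the tweet being discussed: it times `math.sqrt` and `np.sqrt` called one scalar at a time over a 1,000-element list, and `np.sqrt` called once on a whole array. The list size and repeat count are arbitrary choices, and no timings are quoted because they depend on the machine and the NumPy version.

```python
import math
import timeit

import numpy as np

values = [float(i) for i in range(1000)]

def with_math():
    # Plain Python function applied to one scalar at a time.
    return [math.sqrt(v) for v in values]

def with_numpy_scalar():
    # NumPy ufunc applied to one scalar at a time; each call goes
    # through the general array machinery.
    return [np.sqrt(v) for v in values]

def with_numpy_array():
    # NumPy ufunc applied once to a whole array.
    return np.sqrt(np.asarray(values))

for fn in (with_math, with_numpy_scalar, with_numpy_array):
    seconds = timeit.timeit(fn, number=20)
    print(f"{fn.__name__}: {seconds:.4f} s for 20 runs")
```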
link |
See, but also, and this is the side to the top,
link |
there's the fundamental questions
link |
when you look at the long arc of history,
link |
it's very possible that np.sqrt is much faster.
link |
So like in terms of like, don't worry about it,
link |
it's the evils of over optimization or whatever,
link |
all the different quotes around that,
link |
is sometimes obsessing about this particular little quirk
link |
is not sufficient.
link |
For somebody like, if you're trying to optimize your path,
link |
I mean, I agree, premature optimization
link |
creates all kinds of challenges, right?
link |
Because now, but you may have to do it.
link |
I believe the quote is, it's the root of all evil.
link |
It's the root of all evil, right?
link |
Let's give Donald Knuth, I think,
link |
or is he more than somebody else?
link |
Well, Don Knuth is kind of like Mark Twain,
link |
people just attribute stuff to him, I don't know.
link |
And it's fine because he's brilliant.
link |
So, no, I was a LaTeX user myself,
link |
and so I have a lot of respect,
link |
and he did more than that, of course,
link |
but yeah, someone I really appreciate
link |
in the computer science space.
link |
Yeah, I don't, I think that's appropriate.
link |
There's a lot of little things like that,
link |
where people actually, if you understood it,
link |
you go, yeah, of course, that's the case.
link |
And the other part, the other part I didn't mention,
link |
and Numba was a thing we wrote early on,
link |
and I was really excited by Numba
link |
because it's something we wanted,
link |
it was a compiler for Python syntax,
link |
and I wanted it from the beginning of writing NumPy
link |
because of this function question,
link |
like taking, the power of arrays
link |
is really that you can write functions using all of it.
link |
It has implicit looping, right?
link |
So you don't worry about,
link |
I write this n dimensional for loop
link |
with for loops, for, for statements.
link |
You just say, oh, big four dimensional array,
link |
I'm gonna do this operation, this plus, this minus,
link |
this reduction, and you get this,
link |
it's called vectorization in other areas,
link |
but you can basically think at a high level
link |
and get massive amounts of computation done
link |
with the added benefit of,
link |
oh, it can be parallelized easily.
link |
It can be put in parallel.
link |
You don't have to think about that.
link |
In fact, it's worse to go decompose your,
link |
you write the for loops
link |
and then try to infer parallelism from for loops.
link |
That's actually a harder problem
link |
than to take the array problem
link |
and just automatically parallelize that problem.
link |
That's what, and so functions in NumPy
link |
are called universal functions, ufuncs.
link |
So square root is an example of a ufunc.
link |
There are others, sine, cosine, add, subtract.
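A small sketch of what "universal function" means in practice, assuming NumPy is installed. The array values are made up for illustration.

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0])

is_ufunc = isinstance(np.sqrt, np.ufunc)   # sqrt is a universal function object
roots = np.sqrt(x)                          # implicit loop over every element
total = np.add.reduce(x)                    # ufuncs also carry methods such as reduce
```

The same call works unchanged on an array of any number of dimensions, which is the "implicit looping" mentioned above.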
link |
In fact, one of the first libraries to SciPy
link |
was something called Special
link |
where I added Bessel functions
link |
and all these special functions that come up in physics
link |
and I added them as ufuncs so they could work on arrays.
link |
So I understood ufuncs very, very well
link |
from day one inside of Numeric.
link |
That was one of the things we tried to make better
link |
in NumPy was how do they work?
link |
Can they do broadcasting?
link |
What does broadcasting mean?
link |
But one of the problems is, okay,
link |
what do I do with a Python scalar?
link |
So what happens, the Python scalar gets broadcast
link |
to a zero dimensional array
link |
and then it goes through the whole same machinery
link |
as if it were a 10,000 dimensional array.
link |
And then it kind of unpacks the element
link |
and then does the addition.
link |
That's not to mention the function it calls
link |
in the case of square root
link |
is just the C library square root, right?
link |
In some cases, like Python's power,
link |
there's some optimizations they're doing
link |
that could be faster
link |
than just calling the C library square root.
link |
In the interpreter or in the?
link |
No, in the C code, in the Python runtime.
link |
In the Python runtime, so they really optimize it
link |
and they have the freedom to do that
link |
because they don't have to worry about.
link |
It's just a scalar.
link |
It's just a scalar.
link |
Right, they don't have to worry about the fact
link |
that, oh, this could be an object with many pieces.
link |
The ufunc machine is also generic
link |
in the sense that typecasting and broadcasting,
link |
broadcasting's idea of I'm gonna go,
link |
I have a zero dimensional array,
link |
I have a scalar with a four dimensional array
link |
Oh, I have to kind of coerce the shape of this guy
link |
to make it work against the whole four dimensional array.
link |
So it's the idea of I can do a one dimensional array
link |
against a two dimensional array and have it make sense.
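A minimal broadcasting sketch, assuming NumPy is installed. The shapes and values are chosen only for illustration.

```python
import numpy as np

row = np.array([10.0, 20.0, 30.0])   # one dimensional, shape (3,)
grid = np.zeros((2, 3))              # two dimensional, shape (2, 3)

shifted = grid + row                 # row is stretched across both rows of grid
scaled = grid + 5.0                  # a Python scalar is treated as zero dimensional

result_shape = np.broadcast(row, grid).shape   # shape both operands are coerced to
```

No data is copied to "stretch" the smaller operand; NumPy only adjusts how it steps through memory.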
link |
Well, that's what NumPy does is it challenges you
link |
to reformulate, rethink your problem
link |
as a multi dimensional array problem
link |
versus move away from scalars completely.
link |
Right, exactly, exactly.
link |
In fact, that's where some of the edge cases boundaries are
link |
is that, well, they're still there
link |
and this is where array scalars are particular.
link |
So array scalars are particularly bad
link |
in the sense that they were written
link |
so that you could optimize the math on them,
link |
but that hasn't happened.
link |
And so their default is to coerce the array scalar
link |
to a zero dimensional array
link |
and then use the NumPy machinery.
link |
That's what, and you could specialize,
link |
but it doesn't happen all the time.
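A short sketch of the array scalar versus Python scalar distinction, assuming NumPy is installed. It only shows the types involved and makes no claim about the size of the slowdown.

```python
import numpy as np

arr = np.arange(5, dtype=np.float64)

element = arr[2]            # indexing yields a NumPy array scalar, not a Python float
as_python = arr.tolist()    # tolist() converts to plain Python floats

element_type = type(element)
python_type = type(as_python[2])

# Looping over plain floats avoids going through the array machinery per element.
total = 0.0
for v in as_python:
    total += v * v
```

Converting once with `tolist()` before a pure Python loop is one way to sidestep the per-element overhead discussed here.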
link |
So in fact, when we first wrote Numba,
link |
we'd do comparisons and say, look, it's a 1000X speedup.
link |
We were lying a little bit in the sense that,
link |
well, first do the 40X slowdown
link |
of using the array scalars inside of a loop.
link |
Cause if you were to use Python scalars,
link |
you'd already be 10 times faster.
link |
But then we would get a hundred times faster
link |
over that using just compilation.
link |
But what we do is compile the loop
link |
from out of the interpreter to machine code.
link |
And then that's always been the power of Python
link |
is this extensibility so that you can,
link |
cause people say, oh, Python's so slow.
link |
Well, sure, if you do all your logic
link |
in the runtime of the Python interpreter, yeah.
link |
But the power is that you don't have to.
link |
You write all the logic,
link |
what you do in the high level is just high level logic.
link |
And the actual calls you're making
link |
could be on gigabyte arrays of data.
link |
And that's all done at compiled speeds.
link |
And the fact that integration is one can happen,
link |
but two is separable.
link |
That's one of the, the language like Julia says,
link |
we're going to be all in one.
link |
You can do all of it together.
link |
And then there's, the jury's out, is that possible?
link |
I tend to think that you're going to,
link |
there's separate concerns there.
link |
You want to precompile.
link |
In fact, generally you will want to precompile your,
link |
some of your loops.
link |
Like SciPy has a compilation step.
link |
To install SciPy, it takes about two hours.
link |
If you have many machines,
link |
maybe you can get it down to one hour.
link |
But to compile those libraries takes about, takes a while.
link |
You don't want to do that at runtime.
link |
You don't want to do that all the time.
link |
You want to have this precompiled binary available
link |
that you're then just linking into.
link |
So there's real questions about the whole source code.
link |
Code is, running binary code is more than source code.
link |
It's creating object code, it's the linker, it's the loader,
link |
it's the how does that interpret it
link |
inside of virtual memory space.
link |
There's a lot of details there that actually
link |
I didn't understand for a long time
link |
until I read books on the topic.
link |
And it led to, the more you know, the better off you are
link |
and you can do more details,
link |
but sometimes it helps with abstractions too.
link |
Well, the problem, as we mentioned earlier
link |
with abstractions is you kind of sometimes assume
link |
that whoever implemented this thing
link |
had your case in mind and found the optimal solution.
link |
Or like you assume certain things.
link |
I mean, there's a lot of,
link |
One of the really powerful things to me early on,
link |
I mean, it sounds silly to say, but with Python,
link |
probably one of the reasons I fell in love with it
link |
So obviously probably most languages
link |
have some mapping concept,
link |
but it felt like it was a first class citizen
link |
and it was just my brain was able to think in dictionaries.
link |
But then there's the thing that I guess I still use
link |
to this day is ordered dictionaries
link |
because that seems like a more natural way
link |
to construct dictionaries.
link |
And from a computer science perspective,
link |
the running time cost is not that significant,
link |
but there's a lot of things to understand about dictionaries
link |
that the abstraction kind of
link |
doesn't necessarily incentivize you to understand.
link |
Right, do you really understand the notion of a hash map
link |
and how the dictionary is implemented?
link |
Dictionaries are a good example
link |
of an abstraction that's powerful.
link |
And I agree with you.
link |
I agree, I love dictionaries too.
link |
Took me a while to understand that once you do,
link |
you realize, oh, they're everywhere.
link |
And Python uses them everywhere too.
link |
Like it's actually constructed,
link |
one of the foundational things is dictionaries
link |
and it does everything with dictionaries.
link |
So it is, it's powerful.
link |
Ordered dictionaries came later,
link |
but it is very, very powerful.
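A small sketch of "Python uses them everywhere", using only the standard language. The `Point` class is a made-up example.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
instance_namespace = p.__dict__                       # instance attributes live in a dict
module_namespace_is_dict = isinstance(globals(), dict)  # so do module-level names

# Built-in dicts preserve insertion order (guaranteed since Python 3.7).
d = {}
d["first"] = 1
d["second"] = 2
keys_in_order = list(d)
```

`collections.OrderedDict` still exists for code that needs its extra methods, such as `move_to_end`.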
link |
It took me a little while coming
link |
from just the array programming entirely
link |
to understand these other objects,
link |
like dictionaries and lists and tuples and binary trees.
link |
Like I said, I wasn't a computer scientist,
link |
I studied arrays first.
link |
And so I was very array centric.
link |
And you realize, oh, these others
link |
do have purposes and value actually.
link |
There's a friendliness about,
link |
like one way to think about arrays
link |
is arrays are just like full of numbers,
link |
but to make them accessible to humans
link |
and make them less error prone to human users,
link |
sometimes you want to attach names,
link |
human interpretable names
link |
that are sticky to those arrays.
link |
So that's how you start to think about dictionaries
link |
is you start to convert numbers
link |
into something that's human interpretable.
link |
And that's actually the tension I've had with NumPy
link |
because I've built so much tooling
link |
around human interpretability
link |
and also protecting me from a year later
link |
not making the mistakes by being,
link |
I wanted to force myself to use English versus numbers.
link |
Yes, so there's a project called Labeled Arrays.
link |
Like very early it was recognized that,
link |
oh, we're indexing NumPy with just numbers,
link |
all the columns and particularly the dimensions.
link |
I mean, if you have an image,
link |
you don't necessarily need to label each column or row,
link |
but if you have a lot of images
link |
or you have another dimension,
link |
you'd at least like to label the dimension
link |
as this is X, this is Y, this is Z,
link |
or this is give us some human meaning
link |
or some domain specific meaning.
link |
That was one of the impetuses for Pandas actually
link |
was just, oh, we do need to label these things.
link |
And Label Array was an attempt to add
link |
that like a lighter weight version of that.
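A lightweight sketch of labeling dimensions by hand, assuming NumPy is installed. The `AXES` mapping is a hypothetical convention for this example, not the API of any labeled array project.

```python
import numpy as np

# Hypothetical labels for a stack of 4 images, each 3 rows by 2 columns.
AXES = {"image": 0, "y": 1, "x": 2}
stack = np.arange(24, dtype=np.float64).reshape(4, 3, 2)

mean_image = stack.mean(axis=AXES["image"])   # average over the stack of images
row_sums = stack.sum(axis=AXES["x"])          # sum along each row of every image
```

Writing `axis=AXES["image"]` instead of `axis=0` is the kind of protection against later mistakes described above.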
link |
And there's been, like, that's an example of something
link |
I think NumPy could add, could be added to NumPy,
link |
but one of the challenges again, how do you fund this?
link |
Like I said, one of the tragedies I think is that,
link |
so I never had the chance to,
link |
I was never paid to work on NumPy, right?
link |
So I've always just done it in my spare time,
link |
always taken from one thing,
link |
taken from another thing to do it.
link |
And at the time, I mean, today,
link |
it would be the wrong day and today,
link |
like paying me to work on NumPy now
link |
would not be a good use of effort,
link |
but we are finally at Quansight Labs,
link |
I'm actually paying people to work on NumPy and SciPy,
link |
which is I'm thrilled with, I'm excited by.
link |
I've wanted to do that.
link |
That's what I always wanted to do from day one.
link |
It just took me a while to figure out a mechanism to do that.
link |
Even like in the university setting,
link |
respecting that, like pushing students,
link |
young minds and young graduate students to contribute
link |
and then figuring out financial mechanisms
link |
that enable them to contribute
link |
and then sort of reward them
link |
for their innovative scientific journey,
link |
that would be nice.
link |
But then also just a better allocation of resources.
link |
It's the 20 year anniversary since 9/11
link |
and I was just looking, we spent over $6 trillion
link |
in the Middle East after 9/11 in the various efforts there.
link |
And sort of to put politics and all that aside,
link |
it's just, you think about the education system,
link |
all the other ways we could have
link |
possibly allocated that money.
link |
To me, to take it back,
link |
the amount of impact you would have
link |
by allocating a little bit of money to the programmers
link |
that build the tools that run the world is fascinating.
link |
I don't know, I think, again,
link |
there is some aspect to being broke
link |
as somewhat of a feature, not a bug,
link |
that you make sure that you're valued.
link |
But you can still manage that.
link |
Right, no, I know.
link |
But I don't think that's a big part.
link |
So it's like, I think you can have enough money
link |
and actually be wealthy while maintaining your values.
link |
There's an old adage that nations that trade together
link |
don't go to war together.
link |
I've often thought about nations that code together.
link |
Yeah, code together.
link |
Because one of the things I love about open source
link |
is it's global, it's multinational.
link |
Like there aren't national boundaries.
link |
One of the challenges with business and open source
link |
is the fact that, well, business is national.
link |
Like businesses are entities
link |
that are recognized in legal jurisdictions, right?
link |
And have laws that are respected in those jurisdictions
link |
and hiring, and yet the open source ecosystem
link |
is not, it's not there.
link |
Like currently, one of the problems we're solving
link |
is hiring people all over the world, right?
link |
Because we, it's a global effort.
link |
And I've had the chance to work, and I've loved the chance.
link |
I've never been to like Iran,
link |
but I once had a conference
link |
where I was able to talk to people there, right?
link |
And talk to folks in Pakistan.
link |
I've never been there, but we had a call
link |
where there were people there,
link |
like just scientists and normal people.
link |
And there's a certain amount of humanizing, right?
link |
That gets away from the,
link |
like we often get the memes of society
link |
that bubble up and get discussed,
link |
but the memes are not even an accurate reflection
link |
of the reality of what people are.
link |
Well, if you look at the major power centers
link |
that are leading to something like cyber war
link |
in the next few decades,
link |
it's the United States, it's Russia, and China.
link |
And those three countries in particular
link |
have incredible developers.
link |
So if they work together, I think that's one way,
link |
the politicians can do their stupid bickering,
link |
but like there's a layer of infrastructure, of humanity.
link |
If they collaborate together,
link |
that I think can prevent major military conflict,
link |
which would, I think most likely happen at the cyber level
link |
versus the actual hot war level.
link |
You know, I think that's a good prediction.
link |
Nations that code together don't go to war together.
link |
Don't go to war together.
link |
That's a hope, right?
link |
That's one of the philosophical hopes, but yeah.
link |
So you mentioned the project of Numba,
link |
which is fascinating.
link |
So from the early days,
link |
there was kind of a pushback on Python that it's not fast.
link |
You know, you see C++,
link |
if you wanna write something that's fast,
link |
you use C++.
link |
If you wanna write something that's usable and friendly,
link |
but slow, you use Python.
link |
And so what is Numba?
link |
Yes, that was the argument.
link |
And the reality was people would write high level coding
link |
and use compiled code,
link |
but there's still user stories, use cases,
link |
where you want to write Python,
link |
but then have it still be fast.
link |
You still need to write a for loop.
link |
Like before Numba, it was always don't write a for loop.
link |
You know, write it in a vectorized way,
link |
you know, put it in an array.
link |
And often that can make a memory trade off.
link |
Like quite often you can do it,
link |
but then you maybe use more memory
link |
because you have to build this array of data
link |
that you don't necessarily need all the time.
link |
So Numba was, it started from a desire to have
link |
kind of a vectorize that worked.
link |
Vectorize was a tool in NumPy that was released.
link |
You give it a Python function
link |
and it gave you a universal function,
link |
a ufunc that would work on arrays.
link |
So you get the function that just worked on a scalar.
link |
Like you could make a,
link |
like the classic case was a simple function
link |
that had an if-then statement in it.
link |
So sine X over X function, sinc function.
link |
If X equals zero, return one, otherwise do sine X over X.
link |
The challenge is you don't want that loop
link |
happening in Python.
link |
So you want a compiled version of that,
link |
but the ufunc, the vectorize in NumPy
link |
would just give you a Python function.
link |
So it would take the array of numbers
link |
and at every call do a loop back into Python.
link |
So it was very slow.
link |
It gave you the appearance of a ufunc,
link |
but it was very slow.
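A minimal sketch of the sinc case described above, assuming NumPy is installed. `sinc_scalar` and the sample values are illustrative names, and the `np.where` version is one array-style alternative rather than the approach taken in the conversation.

```python
import math

import numpy as np


def sinc_scalar(x):
    """sin(x)/x with the x == 0 case handled, written for one number."""
    if x == 0:
        return 1.0
    return math.sin(x) / x


# np.vectorize gives array semantics, but still calls the Python function per element.
sinc_vectorized = np.vectorize(sinc_scalar)

x = np.array([0.0, 0.5, 1.0, 2.0])
via_vectorize = sinc_vectorized(x)

# Array-style alternative: swap zeros for 1 before dividing, then patch the result.
safe_x = np.where(x == 0, 1.0, x)
via_where = np.where(x == 0, 1.0, np.sin(safe_x) / safe_x)
```

The second form avoids the per-element Python call at the cost of building temporary arrays, which is the memory trade-off mentioned earlier.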
link |
So I always wanted a vectorize
link |
that would take that Python scalar function
link |
and produce a ufunc working on binary native code.
link |
So in fact, I had somebody work on that with PyPy
link |
and see if PyPy could be used to produce a ufunc like that
link |
early on in 2009 or something like that, 2010.
link |
They didn't work that well.
link |
It was kind of pretty bulky.
link |
But in 2012, Peter and I had just started Anaconda.
link |
We had, I just, I'd learned to raise money.
link |
That's a different topic,
link |
but I'd learned to raise money from friends, family,
link |
and fools, as they say.
link |
That's a good line.
link |
Oh, that's a good line.
link |
But, so we were trying to do something.
link |
We were trying to change the world.
link |
Peter and I are super ambitious.
link |
We wanted to make array computing
link |
and we had ideas for really what's still,
link |
it's still the energy right now.
link |
How do you do at scale data science?
link |
And we had a bunch of ideas there, but one of them,
link |
I had just talked to people about LLVM
link |
and I was like, there's a way to do this.
link |
I just, I went, I heard about my friend Dave Beazley
link |
at a compiler course.
link |
So I was looking at compilers like,
link |
and I realized, oh, this is what you do.
link |
And so I wrote a version of Numba
link |
that just basically mapped Python bytecode to LLVM.
link |
Right, so, and the first version is like, this works
link |
and it produces code that's fast.
link |
This is cool for, you know,
link |
obviously a reduced subset of Python.
link |
I didn't support all the Python language.
link |
There had been efforts to speed up Python in the past,
link |
but those efforts were, I would say,
link |
not from the array computing perspective,
link |
not from the perspective of wanting to produce
link |
a vectorized improvement.
link |
They were from the perspective of speeding up
link |
the runtime of Python, which is fundamentally hard
link |
because Python allows for some constructs
link |
that aren't, you can't speed up.
link |
Like it's this generic, you know, when it does this variable.
link |
So I, from the start, did not try to replicate
link |
Python's semantics entirely.
link |
I said, I'm gonna take a subset of the Python syntax
link |
and let people write syntax in Python,
link |
but it's kind of a new language really.
link |
So it's almost like for loops, like focusing on for loops.
link |
For loops, scalar arithmetic, you know, typed,
link |
you know, really typed language, a typed subset.
link |
So, but we wanted to add inference of types.
link |
So you didn't have to spell all the types out
link |
because when you call a function,
link |
so Python is typed, it's just dynamically typed.
link |
So you don't tell it what the types are,
link |
but when it runs, every time an object runs,
link |
there's a type for the variables.
link |
You know what it is.
link |
And so that was the design goals of Numba
link |
were to make it possible to write functions
link |
that could be compiled and have them used for NumPy arrays.
link |
Like they needed to support NumPy arrays.
link |
And so how does it work?
link |
Do you add a comment within Python that tells it to do,
link |
like how do you help out the compiler?
link |
Yeah, so there isn't much actually.
link |
You don't, it's kind of magical in the sense
link |
that it just looks at the type of the objects
link |
and then it does type inference to determine
link |
any other variables it needs.
link |
And then it was also, because we had a use case
link |
that could work early.
link |
Like one of the challenges of any kind of new development
link |
is if you have something that to make it work,
link |
it was gonna take you a long time,
link |
it's really hard to get it off the ground.
link |
If you have a project where there's some incremental story,
link |
it can start working today and solve a problem,
link |
then you can start getting it out there, getting feedback.
link |
Because Numba today, now Numba is nine years old today,
link |
the first two, three versions were not great, right?
link |
But they solved a problem and some people could try it
link |
and we could get some feedback on it.
link |
Not great in that it was very focused.
link |
Very fragile, the subset it would actually compile
link |
was small and so if you wrote Python code
link |
and said, so the way it worked is you write a function
link |
and you say @jit, use decorators.
link |
So decorators, just these little constructs
link |
let you decorate code with an @ and then a name.
link |
The @jit would take your Python function
link |
and actually just compile it and replace the Python function
link |
with another function that interacts
link |
with this compiled function.
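A hedged sketch of the decorator pattern described here, using the sinc example from earlier. It assumes NumPy is installed and uses Numba's `njit` decorator if Numba is available; the fallback decorator and the name `sinc_loop` are illustrative only, so the example still runs as plain Python without Numba.

```python
import math

import numpy as np

try:
    from numba import njit          # compiles the decorated function on first call
except ImportError:
    def njit(func):                 # hypothetical fallback: leave the function as-is
        return func


@njit
def sinc_loop(x):
    # An explicit for loop with an if statement, the shape of code
    # that was previously "don't write a for loop" territory.
    out = np.empty_like(x)
    for i in range(x.shape[0]):
        if x[i] == 0.0:
            out[i] = 1.0
        else:
            out[i] = math.sin(x[i]) / x[i]
    return out


values = sinc_loop(np.array([0.0, 0.5, 1.0]))
```

No type annotations are written; with Numba present, the argument types seen at call time drive the type inference.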
link |
And it would just do that and we went from Python bytecode
link |
then we went to AST.
link |
I mean, writing compilers actually,
link |
I learned a lot about why computer science
link |
is taught the way it is because compilers
link |
can be hard to write.
link |
They use tree structures, they use all the concepts
link |
of computer science that are needed.
link |
It's actually hard to, it's easy to write a compiler
link |
and then have it be spaghetti code.
link |
Like the passes become challenging
link |
and we ended up with three versions of Numba, right?
link |
Numba got written three times.
link |
What programming language is Numba written in?
link |
That's fascinating.
link |
Yeah, so Python, but then the whole goal of Numba
link |
is to translate Python bytecode to LLVM.
link |
And so LLVM actually does the code generation.
link |
In fact, a lot of times they'd say,
link |
yeah, it's super easy to write a compiler
link |
if you're not writing the parser nor the code generator.
link |
So for people who don't know, LLVM is a compiler itself.
link |
Yeah, it's really badly named low level virtual machine,
link |
which that part of it is not used.
link |
It's really low level.
link |
Chris, he doesn't mean that.
link |
But the name makes you imply that the virtual machine
link |
is what it's all about.
link |
It's actually the IR and the library,
link |
the code generation.
link |
That's the real beauty of it.
link |
The fact that, what I love about LLVM
link |
was the fact that it was a platform you could collaborate on.
link |
Instead of the internals of GCC
link |
or the internals of the Intel compiler,
link |
or like how do I extend that?
link |
And it was a place we could collaborate.
link |
And we were early.
link |
I mean, people had started before.
link |
It's a slow compiler.
link |
Like it's not a fast compiler.
link |
So for some kind of JITs,
link |
like JITs are common in language
link |
because one, every browser has a JavaScript JIT.
link |
It does real time compilation
link |
of the JavaScript to machine code.
link |
For people who don't know, JIT is just in time compilation.
link |
Yeah, just in time compilation.
link |
They're actually really sophisticated.
link |
In fact, I got jealous of how much effort
link |
was put into the JavaScript JITs.
link |
Yes, well, it's kind of incredible what they've done.
link |
Yes, I completely agree.
link |
I'm very impressed.
link |
But you know, Numba was an effort
link |
to make that happen with Python.
link |
And so we used some of the money
link |
we raised from Anaconda to do it.
link |
And then we also applied for this DARPA grant
link |
and used some of that money to continue the development.
link |
And then we used proceeds from service projects we would do.
link |
We get consulting projects
link |
that we would then use some of the profits
link |
to invest in Numba.
link |
So we ended up with a team of two or three people
link |
It was a fits and starts, right?
link |
And ultimately, the fact that we had a commercial version
link |
of it also we were writing.
link |
So part of the way I was trying to fund Numba,
link |
say, well, let's do the free Numba
link |
and then we'll have a commercial version of Numba
link |
And what Numba Pro did is it targeted GPUs.
link |
So we had the very first CUDA JIT
link |
and the very first @jit compiler that, in 2012 or '13,
link |
you could run not just a ufunc on CPU,
link |
but a ufunc on GPUs.
link |
And it would automatically parallelize it
link |
and get 1000X speedup on it.
link |
And that's an interesting funding mechanism
link |
because large companies or larger companies
link |
care about speed in just this way.
link |
So it's exactly a really good way.
link |
Yeah, there's been a couple of things
link |
you know people will pay for.
link |
One, they'll pay for really good user interfaces, right?
link |
And so I'm always looking for what are the things
link |
people will pay for that you could actually adapt
link |
to the open source infrastructure?
link |
One is definitely user interfaces.
link |
The second is speed, like a better runtime, faster runtime.
link |
And then when you say people,
link |
you mean like a small number of people pay a lot of money,
link |
but then there's also this other mechanism that.
link |
A ton of people pay.
link |
First, I gotta, we mentioned Anaconda,
link |
we mentioned friends, family, and fools.
link |
So Anaconda is yet another.
link |
So there's a company, but there's also a project.
link |
That is exceptionally impactful in terms of,
link |
for many reasons, but one of which is bringing
link |
a lot more people into the community
link |
of folks who use Python.
link |
So what is Anaconda?
link |
What is its goals?
link |
Maybe what is Conda versus Anaconda?
link |
Yeah, I'll tell you a little bit of the history of that.
link |
Cause Anaconda, we wanted to do,
link |
we wanted to scale Python.
link |
Cause we, you know, that was the goal.
link |
Peter and I had the goal of when we started Anaconda,
link |
we actually started as Continuum Analytics
link |
was the name of the company that started.
link |
It got renamed Anaconda in 2015.
link |
But we said, we want to scale analytics.
link |
NumPy is great, Pandas is emerging,
link |
but these need to run at scale with lots of machines.
link |
The other thing we wanted to do was make user interfaces
link |
We wanted to make sure the web did not pass
link |
by the Python community.
link |
That we had ways to translate your data science to the web.
link |
So those are the two kind of technical areas.
link |
We thought, oh, we'll build products in this space.
link |
And that was the idea.
link |
Very quickly in, but of course,
link |
the thing I knew how to do was to do consulting
link |
to make money and to make sure my family and friends
link |
and fools that had invested didn't lose their money.
link |
So it's a little different
link |
than if you take money from a venture fund.
link |
If you take money from a venture fund,
link |
the venture fund, they want you to go big or go home.
link |
And they're kind of like expecting nine out of 10 to fail
link |
or 99 out of 100 to fail.
link |
I was, I was on a barbell strategy.
link |
I was like, I can't fail.
link |
I mean, I may not do super well,
link |
but I cannot lose their money.
link |
So I'm going to do something I know can return a profit,
link |
but I want to have exposure to an upside.
link |
So that's what happened at Anaconda.
link |
We didn't, there was lots of things we did not well
link |
in terms of that structure.
link |
And I've learned from since and how to do it better.
link |
But we've, we did a really good job
link |
of kind of attracting the interest around the area
link |
to get good people working
link |
and then get funnel some money
link |
on some interesting projects.
link |
Super excited about what came out of our energy there.
link |
So what are some of the interesting projects?
link |
So Dask, Numba, Bokeh, Conda.
link |
There was Datashader, Panel, HoloViz.
link |
These are all tools that are extremely relevant
link |
in terms of helping you build applications,
link |
build tools, build, you know, faster code.
link |
There's a couple I'm forgetting.
link |
Oh, JupyterLab, JupyterLab came out of this too.
link |
Okay, so Bokeh does plotting?
link |
Bokeh does plotting.
link |
So Bokeh was one of the foundational things to say,
link |
I want to do plot in Python,
link |
but have the things show up in a web.
link |
Right, that's right.
link |
That's right, that's right.
link |
And plotting to me still,
link |
with all due respect to Matplotlib and Bokeh,
link |
it feels like still an unsolved problem,
link |
not a solved problem.
link |
It is, it's a big problem.
link |
Right, because you're, I mean, I don't know,
link |
it's visualization broadly, right?
link |
I think we've got a pretty good API story
link |
around certain use cases of plotting.
link |
But there's a difference between static plots
link |
versus interactive plots versus I'm an end user,
link |
I just want to write a simple,
link |
for Pandas started the idea of here's a data frame
link |
and a .plot, I'm just going to attach plot
link |
as a method to my object,
link |
which was a little bit controversial, right?
link |
But works pretty well, actually,
link |
because there's a lot less you have to pass in, right?
link |
You can just say, here's my object, you know what you are,
link |
you tell the visualization what to do.
link |
So that, and there's things like that
link |
that have not been super well developed entirely,
link |
but Bokeh was focused on interactive plotting.
link |
So you could, it's a short path
link |
between interactive plotting and application,
link |
dashboard application.
link |
And there's some incredible work that got done there, right?
link |
And it was a hard project,
link |
because then you're basically doing JavaScript and Python.
link |
So we wanted to tackle some of these hard problems
link |
and try to just go after them.
link |
We got some DARPA funding to help,
link |
and it was super helpful, funny story there,
link |
we actually did two DARPA proposals,
link |
but one we were five minutes late for.
link |
And DARPA has a very strict cutoff window.
link |
And so I, we had two proposals,
link |
one for the Bokeh and one for actually Numba
link |
and the other work.
link |
Which one were you late for?
link |
The foundational numerical work.
link |
So Bokeh got funded. Oh no.
link |
Fortunately, Chris let us use some of the money to fund
link |
still some of the other foundational work,
link |
but it wasn't as, yeah, his hands were tied,
link |
he couldn't do anything about it.
link |
That was a whole interesting story.
link |
So one of the incredible projects
link |
that you worked on is Conda.
link |
So what is Conda? So how that came about,
link |
yeah, Conda, it was early on, like I said, with SciPy.
link |
SciPy was a distribution masquerading as a library.
link |
And you heard me talking about compiler issues
link |
and trying to get the stuff shipped
link |
and the fact that people can use your libraries
link |
So for a long time,
link |
we'd understood the packaging problem in Python.
link |
And one of the first things we did at Continuum Analytics,
link |
which became Anaconda, was organize the PyData ecosystem
link |
in conjunction with NumFOCUS.
link |
We actually started NumFOCUS
link |
with some other folks in the community
link |
the same year we started Anaconda.
link |
I said, we're gonna build a corporation,
link |
but we're also gonna reify the community aspect
link |
and build a nonprofit.
link |
So we did both of those.
link |
Can we pause real quick and can you say what is PyPI,
link |
the Python package index,
link |
like this whole story of packaging in Python?
link |
Yeah, that's what I'm gonna get to actually.
link |
This is exactly the journey I'm on.
link |
It's to sort of explain packaging in Python.
link |
I think it's best expressed through the conversation
link |
I had with Guido at a conference,
link |
where I said, so packaging is kind of a problem.
link |
And Guido said, I don't ever care about packaging.
link |
I don't install new libraries.
link |
I'm like, I guess if you're the language creator
link |
and if you need something, you just put it in the distribution
link |
maybe you don't worry about packaging.
link |
But Guido has never really cared about packaging, right?
link |
And never really cared about the problem of distribution.
link |
It's somebody else's problem.
link |
And that's a fair position to take, I think,
link |
as a language creator.
link |
In fact, there's a philosophical question about
link |
should you have different development packaging managers?
link |
Should you have a package manager per language?
link |
Is that really the right approach?
link |
I think there are some answers of
link |
it is appropriate to have development tools.
link |
And there's an aspect of a development tool
link |
that is related to packaging.
link |
And every language should have some story there
link |
to help their developers create.
link |
So you should have language specific development tools.
link |
Development tools that relate to package managers.
link |
But then there's a very specific user story
link |
around package management
link |
that those language specific package managers
link |
have to interact with.
link |
And currently aren't doing a good job of that.
link |
That was one of the challenges
link |
that not seeing that difference,
link |
and it still exists in the difference today.
link |
Conda always was a user.
link |
I'm gonna use Python to do data science.
link |
I'm gonna use Python to do something.
link |
How do I get this installed?
link |
It was always focused on that.
link |
So it didn't have a develop.
link |
Classic example is pip has a pip develop.
link |
It's like, I wanna install this
link |
into my current development environment today.
link |
Conda doesn't have that concept
link |
because it's not part of the story.
link |
For people who don't know,
link |
pip is a Python specific package manager.
link |
That's exceptionally popular.
link |
That's probably like the default thing you've learned.
link |
It's the default user.
link |
And so the story there emerged
link |
because what happened is in 2012,
link |
we had this meeting at the Googleplex
link |
and Guido was there to come talk about what we're gonna do,
link |
how we're gonna make things work better.
link |
And Wes McKinney, me, Peter,
link |
Peter has a great photo of me talking to Guido
link |
and he pretends we're talking about this story.
link |
Maybe we were, maybe we weren't.
link |
But we did at that meeting talk about it
link |
and asked Guido, we need to fix packaging in Python.
link |
People can't get the stuff.
link |
And he said, go fix it yourself.
link |
I don't think we're gonna do it.
link |
The origin story right there.
link |
All right, you said, okay, you said to do this ourselves.
link |
So at the same time,
link |
people did start to work on the packaging story in Python.
link |
It just took a little longer.
link |
So in 2012, kind of motivated
link |
by our training courses we were teaching,
link |
like very similar to what you just mentioned
link |
about your mother.
link |
Like it was motivated by the same purpose.
link |
Like how do we get this into people's hands?
link |
It's this big, long process.
link |
It's too expensive.
link |
It was actually hurting NumPy development
link |
because I would hear people were saying,
link |
don't make that change to NumPy
link |
because I just spent a week getting my Python environment.
link |
And if you change NumPy, I have to reinstall everything.
link |
And reinstalling is such a pain, don't do it.
link |
I'm like, wait, okay.
link |
So now we're not making changes to a library
link |
because of the installation problem
link |
that it'll cause for end users.
link |
Okay, there's a problem with installation.
link |
We gotta fix this.
link |
So we said, we're gonna make a distribution in Python.
link |
And we'd previously done that.
link |
I'd previously done that at Enthought.
link |
I wanted to make one that we would give away for free,
link |
that everyone could just get.
link |
Like that was critical that we could just get it.
link |
It wasn't tied to a product.
link |
It was just you could get it.
link |
And then we had constantly thought about,
link |
well, do we just leverage RPM?
link |
But the challenge had always been,
link |
we want a package manager that works on Windows,
link |
Mac OS X, and Linux the same, right?
link |
And it wasn't there.
link |
Like you don't have anything like that.
link |
And for people who don't know,
link |
RPM is an operating system specific package manager.
link |
Correct, it's operating system specific.
link |
So do you create the design questions,
link |
do you create an umbrella package manager
link |
that works across operating systems?
link |
Yes, that was the decision.
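A cross-platform package manager has to decide, first of all, which platform it is running on so it can pick matching binaries. The sketch below shows that step only. The labels follow the style conda uses for its per-platform package directories, but the table is partial and the helper name `guess_subdir` is made up for this illustration.

```python
import platform

# Platform labels in the style conda uses for per-platform package
# directories. This table is partial and only illustrative.
_SUBDIRS = {
    ("Linux", "x86_64"): "linux-64",
    ("Linux", "aarch64"): "linux-aarch64",
    ("Darwin", "x86_64"): "osx-64",
    ("Darwin", "arm64"): "osx-arm64",
    ("Windows", "AMD64"): "win-64",
}


def guess_subdir(system=None, machine=None):
    """Map an (OS, CPU) pair to a platform label; defaults to this machine."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return _SUBDIRS.get((system, machine), "unknown")


print(guess_subdir())  # label for whatever machine runs this
```

The same package name and version can then resolve to a different binary per label, which is what lets one install command behave the same on Windows, macOS, and Linux.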
link |
And in neighboring design questions,
link |
do you also create a package manager
link |
that spans multiple programming languages?
link |
That was the world we faced.
link |
And we decided to go multiple operating systems,
link |
and programming language independent.
link |
Because even Python, and particularly what was important
link |
was SciPy has a bunch of Fortran in it, right?
link |
And scikit-learn has links to a bunch of C++.
link |
There's a lot of compiled code.
link |
And the Python package managers, especially early on,
link |
didn't even support that.
link |
So in 2000, so we released Anaconda,
link |
which was just a distribution of libraries,
link |
but we started to work on Conda in 2012.
link |
First version of Conda came out in early 2013,
link |
summer of 2013, and it was a package manager.
link |
So you could say, conda install scikit-learn.
link |
In fact, scikit-learn was a fantastic project that emerged.
link |
It was the classic example of the scikits.
link |
I talked to you earlier about SciPy being too big
link |
to be a single library.
link |
Well, what the community had done is said,
link |
let's make scikits.
link |
And there's scikit-image, there's scikit-learn,
link |
there's a lot of scikits.
link |
And it was a fantastic move that the community did.
link |
I was like, okay, that's a good idea.
link |
I didn't like the name.
link |
I didn't like the fact you typed scikit-image.
link |
I was like, that's gotta be simpler.
link |
That's scikit-learn, we gotta make that smaller.
link |
I don't like typing all this stuff from imports.
link |
So I was kind of a pressure that way,
link |
but I love the energy and love the fact
link |
that they went out and they did it,
link |
and those people, Jarrod Millman, and then of course, Gaël,
link |
and there's people I'm not even naming.
link |
Scikit-learn really emerged as a fantastic project.
link |
And the documentation around that is also incredible.
link |
And the documentation was incredible, exactly.
link |
I don't know who did that, but they did a great job.
link |
A lot of people in Inria, a lot of European contributors.
link |
There's some Andreas in the US.
link |
There's a lot of just people I just adore,
link |
I think are amazing people.
link |
Awesome use of SciPy, right?
link |
I love the fact that they were using SciPy effectively
link |
to do something I love, which is machine learning,
link |
but couldn't install it.
link |
Because there's so many pieces involved.
link |
So many dependencies, right?
link |
So our use case of Conda was conda install scikit-learn.
link |
Right, and it was the best way to install scikit-learn
link |
from 2013 to really 2017, 2018, when pip finally caught up.
link |
I still think you should conda install scikit-learn
link |
rather than pip install scikit-learn,
link |
but you can pip install scikit-learn.
link |
The issue is the package they created was wheels
link |
and pip does not handle the multi vendor approach.
link |
They don't handle the fact you have C++ libraries
link |
you're depending on.
link |
They just stop at the Python boundary.
link |
And so what you have to do in the wheel world
link |
is you have to vendor.
link |
You have to take all of the binary and vendor it.
link |
Now, if a change happens in an underlying dependency,
link |
you have to redo the whole wheel.
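The rebuild cost of vendoring can be shown with a toy model. Everything here is hypothetical (the package and library names, and both helper functions). It assumes the simplest case, where a shared library update stays binary compatible so that dependents do not need rebuilding.

```python
# Toy model of why vendoring forces rebuilds. All names are made up.
# Each entry lists the compiled libraries a Python package depends on.
uses = {
    "pkg_a": {"libblas", "libcxx"},
    "pkg_b": {"libblas"},
    "pkg_c": {"libpng"},
}


def rebuilds_if_vendored(changed_lib):
    """Wheel-style: every package that bundled the library must be rebuilt."""
    return sorted(name for name, libs in uses.items() if changed_lib in libs)


def rebuilds_if_shared(changed_lib):
    """Shared-package style: only the library's own package is rebuilt
    (assuming the update is binary compatible)."""
    return [changed_lib]


print(rebuilds_if_vendored("libblas"))
print(rebuilds_if_shared("libblas"))
```

In the vendored case the work grows with the number of packages that copied the library, which is the "redo the whole wheel" problem described above.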
link |
So TensorFlow, as you know,
link |
you should not pip install TensorFlow.
link |
It's a terrible idea.
link |
People do it because of the popularity of pip,
link |
many people think, oh, of course,
link |
that's how I install everything in Python.
link |
Yeah, this is one of the big challenges.
link |
You take a GitHub repository or just a basic blog post.
link |
The number of times pip is mentioned over Conda
link |
is like 100 X to one.
link |
So it just has to do with the.
link |
And that was increasing.
link |
It wasn't true early because pip didn't exist.
link |
Like Conda came first.
link |
So but that's the problem.
link |
Like Conda came first, but that's like the long tail
link |
of the internet documentation user generated.
link |
So that like you think, how do I install Google?
link |
How do I install TensorFlow?
link |
You're just not gonna see Conda in that first page.
link |
Not today, you would have in 2016, 2017.
link |
And it's sad because Conda solves
link |
a lot of usability issues.
link |
Like for especially super challenging thing.
link |
One of the big pain points for me was
link |
just on the computer vision side, OpenCV installation.
link |
I think Conda, I don't know if Conda solved that one.
link |
Conda has an OpenCV package.
link |
I certainly know pip has not solved it.
link |
I mean, there's complexities there because.
link |
I actually don't know.
link |
I should probably know a good answer for this,
link |
but if you compile OpenCV with certain dependencies,
link |
you'll be able to do certain things.
link |
So there's this kind of flexibility of what you,
link |
like what options you compile with.
link |
And I don't think it's trivial to do that with Conda or.
link |
So Conda has a notion of variants of a package.
link |
You can actually have different compilation versions
link |
So not just the version is different,
link |
but oh, this is compiled with these optimizations on.
link |
So Conda does have an answer.
link |
Has those flavors.
link |
Has flavors, basically.
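One place these variants show up is the package file name, which carries a build string in addition to the name and version. The sketch below parses that three-part layout. The example file names are hypothetical and the helper `parse_conda_filename` is written for this note, not part of conda.

```python
def parse_conda_filename(filename):
    """Split 'name-version-build.ext' into its three parts.

    Package names may contain hyphens, so split from the right.
    """
    for ext in (".tar.bz2", ".conda"):
        if filename.endswith(ext):
            stem = filename[: -len(ext)]
            break
    else:
        raise ValueError(f"unrecognized package file: {filename}")
    name, version, build = stem.rsplit("-", 2)
    return {"name": name, "version": version, "build": build}


# Hypothetical file names: same name and version, different build variant.
cpu = parse_conda_filename("opencv-4.5.0-py39_cpu_0.tar.bz2")
gpu = parse_conda_filename("opencv-4.5.0-py39_gpu_0.tar.bz2")
print(cpu["version"] == gpu["version"], cpu["build"], gpu["build"])
```

Two files can therefore share a version while differing in how they were compiled, which is the "not just the version is different" point.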
link |
Well, pip, as far as I know, does not have flavors.
link |
pip generally hasn't thought deeply
link |
about the binary dependency problem, right?
link |
And that's why fundamentally it doesn't work
link |
for the SciPy ecosystem.
link |
It barely, you can sort of paper over it and duct tape
link |
and it kind of works until it doesn't
link |
and it falls apart entirely.
link |
So it's been a mixed bag.
link |
Like, and I've been having lots of conversations
link |
with people over the years because again,
link |
it's an area where if you understand some things,
link |
but not all the things,
link |
but they've done a great job of community appeal.
link |
This is an area where I think Anaconda as a company
link |
needed to do some things
link |
in order to make Conda more community centric, right?
link |
And this is a, I talk about this all the time.
link |
There's a balance between you have every project starts
link |
with what I called company backed open source.
link |
Even if the company is yourself, it's just one person,
link |
just doing business as.
link |
But ultimately for products to succeed virally
link |
and become massive influencers,
link |
they have to create,
link |
they have to get community people on board.
link |
They have to get other people on board.
link |
So it has to become community driven.
link |
And a big part of that is engagement with those people.
link |
Empowering people, governance around it.
link |
And what happened with Conda in the early days,
link |
pip emerged and we did do some good things.
link |
Conda Forge, Conda Forge community
link |
is sort of the community recipe creation community.
link |
But Conda itself, I still believe,
link |
and Peter is CEO of Anaconda, he's my co founder.
link |
I ran Anaconda until 2017, 2018.
link |
Is Peter still Anaconda?
link |
Peter's still Anaconda, right?
link |
And we're still great friends.
link |
We talk all the time.
link |
I love him to death.
link |
There's a long story there about like why and how
link |
and we can cover in some other podcast perhaps.
link |
It's sort of a more, maybe a more business focused one.
link |
But this is one area where I think Conda
link |
should be more community driven.
link |
Like he should be pushing more
link |
to get more community contributors to Conda
link |
and let the, Anaconda shouldn't be fighting this battle.
link |
It's actually, it's really a developers.
link |
Like you said, like help the developers
link |
and then they'll actually move us the right direction.
link |
Well, that was the problem I have is many
link |
of the cool kids I know don't use Conda.
link |
And that to me is confusing.
link |
It's really a matter of, Conda has some challenges.
link |
First of all, Conda still needs to be improved.
link |
There's lots of improvements to be made.
link |
And it's that aspect of wait, who's doing this?
link |
And the fact that then the PyPA really stepped up.
link |
Like they were not solving the problem at all.
link |
And now they kind of got to where they're solving it
link |
for the most part.
link |
And then effectively you could get,
link |
like Conda solved a problem that was there.
link |
And it still does.
link |
It's still, you know, there's still great things it can do.
link |
But, and we still use it all the time at Quansight
link |
and with other clients, but with,
link |
but you can kind of do similar things with pip and Docker.
link |
So especially with the web development community,
link |
that part of it, again, is this is the,
link |
there's a lot of different kinds of developers
link |
in the Python ecosystem.
link |
And there's still a lack of some clear understanding.
link |
I go to the Python conference all the time
link |
and then there's only a few people in the PyPA who get it.
link |
And then others who are just massively trumpeting
link |
the power of pip, but just do not understand the problem.
link |
So one of the obvious things to me from a mom,
link |
from a non programmer perspective,
link |
is the across operating system usability.
link |
That's much more natural.
link |
So there's people that use Windows and just,
link |
it seems much easier to recommend Conda there,
link |
but then it, you should also recommend it across the board.
link |
So I'll definitely sort of.
link |
But what I recommend now is a hybrid.
link |
I mean, I have no problem.
link |
Is it possible to use?
link |
But like build the environment with pip, with Conda,
link |
build an environment with Conda
link |
and then pip install on top of that.
link |
Be careful about pip installing OpenCV or TensorFlow
link |
or because if somebody's allowed that,
link |
it's gonna be most surely done in a way
link |
that can't be updated that easily.
link |
So install like the big packages,
link |
the infrastructure with Conda and then the weirdos.
link |
That like the weird like implementation for some.
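The hybrid workflow described here can be written down as a conda environment file, where conda packages are listed first and a nested `pip:` list holds whatever is only available through pip. The sketch below generates such a file as text. The helper `render_environment` and the package name `example-sun-position` are hypothetical, chosen only for illustration.

```python
def render_environment(name, conda_deps, pip_deps):
    """Build the text of an environment.yml: conda packages first,
    then pip-only packages in a nested 'pip:' section."""
    lines = [f"name: {name}", "dependencies:"]
    lines += [f"  - {dep}" for dep in conda_deps]
    if pip_deps:
        lines.append("  - pip")  # pip itself is installed by conda
        lines.append("  - pip:")
        lines += [f"    - {dep}" for dep in pip_deps]
    return "\n".join(lines) + "\n"


text = render_environment(
    "demo",
    conda_deps=["python", "numpy", "scipy", "opencv"],  # the big compiled pieces
    pip_deps=["example-sun-position"],  # hypothetical small pure-Python library
)
print(text)
```

Saved as `environment.yml`, a file like this is what `conda env create -f environment.yml` consumes.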
link |
I had a, there's a cool library I used
link |
that based on your location and time of day and date
link |
tells you the exact position of the sun
link |
relative to the earth.
link |
And it's just like a simple library,
link |
but it's very precise.
link |
And I was like, all right.
link |
But that was, that was, and it's like pip.
link |
Well, the thing they did really well is Python developers
link |
who wanna get their stuff published,
link |
you have to have a pip recipe.
link |
I mean, even if it's, you know, the challenge is,
link |
and there's a key thing that needs to be added to pip,
link |
just simply add to pip the ability to defer
link |
to a system package manager.
link |
Like, cause it's, you know,
link |
recognize you're not gonna solve all the dependency problem.
link |
So let like give up and allow the system package to work.
link |
That way Anaconda is installed and it has pip.
link |
It would default to Conda to install stuff,
link |
but Red Hat RPM would default to RPM
link |
to install some more things.
link |
Like that's the, that's a key, not difficult,
link |
but somewhat work, some work feature needs to be added.
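The wished-for behavior is easy to state as a decision rule, sketched below. This is purely hypothetical: the function `pick_installer` and its arguments are invented for this note and do not correspond to any existing pip option or API.

```python
def pick_installer(package, system_manager=None, system_provides=()):
    """Decide which tool should install `package`.

    Hypothetical rule: if the environment declares a system package
    manager and that manager can supply the package, defer to it;
    otherwise fall back to pip.
    """
    if system_manager and package in system_provides:
        return system_manager
    return "pip"


# Inside an Anaconda install, compiled packages would be handed to conda.
print(pick_installer("scipy", "conda", {"scipy", "numpy"}))
# A small library the system manager does not carry still goes through pip.
print(pick_installer("tiny-lib", "conda", {"scipy", "numpy"}))
# On a system Python managed by RPM, the same rule would defer to rpm.
print(pick_installer("numpy", "rpm", {"numpy"}))
```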
link |
That's an example of something like,
link |
I've known we need to do it.
link |
I mean, it's where I wish I had more money.
link |
I wish I was more successful in the business side,
link |
trying to get there, but I wish my, you know,
link |
my family, friends and full community that I know.
link |
Was larger and had more money.
link |
Cause I know tons of things to do effectively
link |
with more resources, but you know,
link |
I have not yet been successful at channel.
link |
Tons of, you know, some, you know,
link |
I'm happy with what we've done.
link |
We created again at Quansight,
link |
what we created to get Anaconda started.
link |
We created community to get Anaconda started.
link |
Done it again with Quansight.
link |
Super excited by that.
link |
But it took three years to do it.
link |
What is Quansight?
link |
What is its mission?
link |
We've talked a few times about different fascinating
link |
aspects of it, but let's like big picture,
link |
what is Quansight?
link |
Big picture Quansight.
link |
Quansight is, its mission is to connect data
link |
to an open economy.
link |
So it's basically consulting for the PyData ecosystem.
link |
It's a consulting company.
link |
And what I've said when I started it was we're trying
link |
to create products, people, and technology.
link |
So it's divided into two groups.
link |
And a third one as well.
link |
The two groups are a consulting services company
link |
that just helps people do data science
link |
and data engineering and data management better
link |
and more efficiently.
link |
Like full stack, like full thing.
link |
Full stack data science, full thing.
link |
We'll help you build a infrastructure.
link |
If you're using Jupyter, we need,
link |
we do staff augmentation, need more pro programmers,
link |
help you use Dask more effectively,
link |
help you use GPUs more effectively.
link |
Just basically a lot of people need help.
link |
So we do training as well to help people, you know,
link |
both immediate help and then get, learn from somebody.
link |
We've added a bunch of stuff too.
link |
We've kind of separated some of these other things
link |
into another company called Open Teams
link |
that we currently started.
link |
One of the things I loved about what we did at Anaconda
link |
was creating a community innovation team.
link |
And so I wanted to replicate that.
link |
This time we did a lot of innovation at Anaconda.
link |
I wanted to do innovation,
link |
but also contribute to the projects that existed,
link |
like create a place where maintainers,
link |
so the SciPy and NumPy and Numba
link |
and all these projects we already started
link |
can pay people to work on them and keep them going.
link |
Quansight Labs is a separate organization.
link |
It's a nonprofit mission.
link |
The profits of Quansight help fund it.
link |
And in fact, every project that we have at Quansight,
link |
a portion of the money goes directly to Quansight Labs
link |
to help keep it funded.
link |
So we've gotten several mechanisms
link |
that we keep Quansight Labs funded.
link |
And currently, so I'm really excited about Labs
link |
because it's been a mission for a long time.
link |
What kind of projects are within Labs?
link |
So Labs is working to make the software better,
link |
like make NumPy better, make SciPy better.
link |
It only works on open source.
link |
So if somebody wants to, so companies do,
link |
we have a thing called a community work order, we call it.
link |
If a company says, I wanna make Spyder better.
link |
You can pay for a month of a developer of Spyder
link |
or a developer of NumPy or a developer of SciPy.
link |
You can't tell them what you want them to do.
link |
You can give them your priorities and things you wish existed
link |
and they'll work on those priorities with the community
link |
to get what the community wants
link |
and what emerges of what the community wants.
link |
Is there some aspect on the consulting side
link |
that is helping, as we were talking about morphology
link |
and so on, is there specific application
link |
that are particularly like driving,
link |
sort of inspiring the need for updates to SciPy?
link |
Correct, absolutely, absolutely.
link |
GPUs are absolutely one of them.
link |
And new hardware beyond GPUs.
link |
I mean, Tesla's Dojo chip, I'm hoping we'll have a chance
link |
to work on that perhaps.
link |
Things like that are definitely driving it.
link |
The other thing that's driving it is scalable,
link |
like speed and scale.
link |
How do I write NumPy code or NumPy-like code
link |
if I want it to run across a cluster?
link |
That's Dask or maybe it's Ray.
link |
I mean, there's sort of ways to do that now.
link |
Or there's Modin and there's, so Pandas code,
link |
NumPy code, SciPy code, scikit-learn code
link |
that I want to scale.
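The general pattern behind scaling array code is to split the data into chunks, compute a partial result per chunk in parallel, and combine the partials. The sketch below shows that pattern with only the standard library. It is not the API of Dask, Ray, or Modin, and the threads merely stand in for cluster workers.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def partial_sum(chunk):
    """Per-chunk work: return (sum, count) so results can be combined."""
    return sum(chunk), len(chunk)


def parallel_mean(values, chunk_size=1000, workers=4):
    """Mean computed chunk by chunk, then combined from the partials."""
    chunks = chunked(values, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


data = list(range(10_000))
print(parallel_mean(data))
```

Libraries in this space aim to hide the chunking and combining so that the user-facing code still looks like ordinary NumPy or Pandas code.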
link |
So that's one big area.
link |
Have you gotten a chance to chat with Andrej and Elon
link |
about particular, because like.
link |
No, I would love to, by the way.
link |
I have not, but I'd love to.
link |
I just saw their Tesla AI Day video.
link |
That's one of the, you know, I love great engineering,
link |
software engineering teams and engineering teams in general.
link |
And they're doing a lot of incredible stuff with Python.
link |
They're like revolutionary.
link |
So many aspects of the machine learning pipeline.
link |
That's operating in the real world.
link |
And so much of that is Python.
link |
Like you said, the guy running, you know, Andrej Karpathy,
link |
running Autopilot is tweeting about optimization
link |
I would love to talk to him.
link |
In fact, we have at Quansight, we've been fortunate enough
link |
to work with Facebook on PyTorch directly.
link |
So we have about 13 developers at Quansight.
link |
Some of them are in Labs working directly on PyTorch.
link |
On PyTorch, right.
link |
So I basically started Quansight.
link |
I went to both TensorFlow and PyTorch and said,
link |
hey, I want to help connect what you're doing
link |
to the broader SciPy ecosystem.
link |
Because I see what you're doing.
link |
We have this bigger mission that we want to make sure
link |
we don't, you know, lose energy here.
link |
So, and Facebook responded really positively
link |
and I didn't get the same reaction.
link |
So I really love the folks at TensorFlow, too.
link |
They're fantastic.
link |
I think it's the, just how it integrates
link |
with their business.
link |
I mean, like I said, there's a lot of reasons.
link |
Just the timing, the integration with their business,
link |
what they're looking for.
link |
They're probably looking for more users.
link |
And I was looking to kind of cut up some development effort
link |
and they couldn't receive that as easily, I think.
link |
So I'm hoping, I'm really hopeful
link |
and love the people there.
link |
What's the idea behind OpenTeams?
link |
So OpenTeams, I'm super excited about OpenTeams
link |
because it's one of the,
link |
I mentioned my idea for investing directly in open source.
link |
So that's a concept called FairOSS.
link |
But one of the things we, when we started Quansight,
link |
we knew we would do is we develop products and ideas
link |
and new companies might come out.
link |
At Anaconda, this was clear, right?
link |
Anaconda, we did so much innovation
link |
that like five or six companies could have come out of that.
link |
And we just didn't structure it so they could.
link |
But in fact, they have, you look at Dask,
link |
there's two companies coming out of Dask.
link |
You know, Bokeh could be a company.
link |
There's like lots of companies that could exist
link |
off the work we did there.
link |
And so I thought, oh, here's a recipe for an incubation,
link |
a concept that we could actually spawn new companies
link |
and new innovations.
link |
And then the idea has always been,
link |
well, money they earn should come back
link |
to fund the open source projects.
link |
So labs is, you know, I think there should be
link |
a lot of things like Quansight Labs.
link |
I think this concept is one that scales.
link |
You could have a lot of open source research labs.
link |
Along the way, so in 2018, when the bigger idea came,
link |
how to make open source investable, I said,
link |
oh, I need to write, I need to create a venture fund.
link |
So we created a venture fund called Quansight Initiate
link |
It's an angel fund, really.
link |
It's, you know, we started to learn that process.
link |
How do we actually do this?
link |
How do we get LPs?
link |
How do we actually go in this direction and build a fund?
link |
And I'm like, every venture fund should have
link |
an associated open source research lab,
link |
which is no reason.
link |
Like our venture fund, the carried interest,
link |
a portion of it goes to the lab.
link |
It directly will fund the lab.
link |
That's fascinating, brother.
link |
So you use the power of the organic formation of teams
link |
in the open source community, and then like naturally,
link |
that leads to a business that can make money.
link |
And then it always maintains and loops back
link |
to the open source.
link |
Loops back to open source, exactly.
link |
I mean, to me, it's a natural fit.
link |
There's something, there's absolutely
link |
a repeatable pattern there, and it's also beneficial
link |
because, oh, I have, I have natural connections
link |
to the open source if I have an open source research lab.
link |
Like, they'll always, they'll be out there
link |
talking to people, and so we've had a chance
link |
to talk to a lot of early stage companies.
link |
And we, and our fund focuses on the early stage.
link |
So Quansight has the services, the lab, the fund, right?
link |
In that process, a lot of stuff started to happen.
link |
They're like, oh, you know, we started to do recruiting
link |
and support and training, and I was starting
link |
to build a bigger sales team and marketing team
link |
and people besides just developers.
link |
And one of the challenges with that
link |
is you end up with different cultural aspects.
link |
You know, developers, you know, there's a,
link |
in any company you go to, you kind of go look,
link |
is this a business led company, a developer led company?
link |
Do they kind of coexist?
link |
Are they, what's the interface between them?
link |
There's always a bit of a tension there.
link |
Like we were talking about before.
link |
You know, what is the tension there?
link |
With OpenTeams, I thought, wait a minute,
link |
we can actually just create,
link |
like this concept of Quansight plus labs,
link |
it's, well, it's specific to the PyData ecosystem.
link |
The concept is general for all open source.
link |
So OpenTeams emerged as a, oh,
link |
we can create a business development company
link |
for many, many Quansights, like thousands of Quansights.
link |
And it can be a marketplace to connect,
link |
essentially be the enterprise software company
link |
If you look at what enterprise software wants
link |
from the customer side, and during this journey,
link |
I've had the chance to work and sell to lots of companies,
link |
Exxon and Shell and JP Morgan, Bank of America,
link |
like the Fortune 100,
link |
and talk to a lot of people in procurement
link |
and see what are they buying and why are they buying?
link |
So, you know, I don't know everything,
link |
but I've learned a lot about,
link |
oh, what are they really looking for?
link |
And they're looking for solutions.
link |
They're constantly given products
link |
from enterprise software.
link |
Here's open source, leave the enterprise software,
link |
And then they have to stitch it together into a solution.
link |
Open source is fantastic for gluing
link |
those solutions together.
link |
So, whereas they keep getting new platforms
link |
they're trying to buy,
link |
but most open source, what most enterprises want
link |
is tools that they can customize
link |
that are as inexpensive as they can.
link |
Yeah, and so you always want to maintain
link |
the connection to the open source
link |
because that's going to be the tools.
link |
Yes, so OpenTeams is about solving
link |
enterprise software problems.
link |
Brilliant, brilliant idea, by the way.
link |
With a connect, but we do it honoring the topology.
link |
We don't hire all the people.
link |
We are a network connecting the sales energy
link |
and the procurement energy,
link |
and we work on the business side,
link |
get the deals closed,
link |
and then have a network of partners
link |
like Quansight and others who we hand the deals to,
link |
to actually do the work.
link |
And then we have to maintain,
link |
I feel like we have to maintain
link |
some level of quality control
link |
so that the client can rely on OpenTeams
link |
to ensure the delivery.
link |
It's not just, here's a lead, go figure that out.
link |
But no, we're going to make sure you get what you need.
link |
By the way, it's such a skill,
link |
and I don't know if I have the patience.
link |
I will have the patience to talk to the business people
link |
or more specific, I mean,
link |
there's all kinds of flavors of business people
link |
or like marketing people.
link |
There's a challenge.
link |
I hear what you're saying
link |
because I've had the same challenge.
link |
There's sometimes you think, okay, this is way overwrought.
link |
Yeah, but you have to become an adult
link |
and you have to, because the companies have needs.
link |
They have ways to make money
link |
and they also want to learn and grow,
link |
and it's your job to kind of educate them on the best way,
link |
like the value of open source, for example.
link |
Right, and I'm really grateful for all my experiences
link |
over the past 14 years, understanding that side of it
link |
and still learning for sure,
link |
but not just understanding from companies,
link |
but also dealing with marketing professionals
link |
and sales professionals
link |
and people that make a career out of that
link |
and understanding what they're thinking about
link |
and also understanding, well, let's make this better.
link |
We can really make a place.
link |
OpenTeams I see as the transmission layer
link |
between companies and open source communities
link |
producing enterprise software solutions.
link |
Eventually we want to,
link |
today we're taking on SAS and MATLAB
link |
and tools that we know we can replace for folks.
link |
Really, anytime you have a software tool at an organization
link |
where you have to do a lot of customization
link |
to make it work for you.
link |
It's not that you're just buying this thing off the shelf.
link |
It's like, okay, you buy this system
link |
and then you customize it a lot,
link |
usually with expensive consultants
link |
to actually make it work for you.
link |
All of those should be replaced by open source foundations
link |
with the same customization.
link |
You're doing such important work,
link |
such important work in these giant organizations
link |
that do exactly that,
link |
taking some proprietary software
link |
and hiring a huge team of consultants
link |
that customize it and then that whole thing
link |
gets outdated quick.
link |
And so, I mean, that's brilliant.
link |
So the one solution to that
link |
is kind of what Tesla's doing a little bit of,
link |
which is basically build up a software engineering team.
link |
Like build a team from scratch.
link |
Build a team from scratch.
link |
And companies are doing it well,
link |
that's what they're doing right now.
link |
And you're creating a topology for some of that.
link |
You just don't have to do it.
link |
That's not the only answer, right?
link |
And so other companies can access this,
link |
be more accessible.
link |
OpenTeams is the future of enterprise software.
link |
We're still early.
link |
Like this idea just percolated over the past year
link |
as we've kind of grown Quansight
link |
and realized the extensibility of it.
link |
We just finished our seed round
link |
to help get more sales people
link |
and then push the messaging correctly.
link |
And there's lots of tools we're building
link |
to make this easier.
link |
Like we wanna automate the processes.
link |
We feel like a lot of the power
link |
is the efficiency of the sales process.
link |
There's a lot of wasted energy in small teams
link |
and the sales energy to get into large companies.
link |
There's a lot of money spent on that process.
link |
Creating the tools and processes for that sales.
link |
So make that super seamless.
link |
So a single company can go,
link |
oh, I've got my contract with OpenTeams.
link |
We've got a subscription they can get.
link |
They can make that procurement seamless.
link |
And then the fact they have access
link |
to the entire open source ecosystem.
link |
And we have a part of our work
link |
that's embracing open source ecosystems
link |
and making sure we're doing things useful for them.
link |
And then companies making sure
link |
they're getting solutions they care about.
link |
And then figuring out which targets we have.
link |
We're not taking on all of open source,
link |
all of enterprise software yet.
link |
But we're step by step.
link |
Well this feels like the future.
link |
The idea and the vision is brilliant.
link |
Can I ask you, why do you think Microsoft bought GitHub
link |
and what do you think is the future of GitHub?
link |
I thought it was a brilliant move.
link |
I think they did because Microsoft has always
link |
had a developer centric culture.
link |
Like they always have.
link |
Like one of the things Microsoft's always done well
link |
is understand that their power is the developers.
link |
It's been, Ballmer didn't necessarily make a good meme
link |
about how he approached that.
link |
But they're broadening that.
link |
I think that's why.
link |
Because they recognize GitHub is where developers are at.
link |
But do they have a vision like an OpenTeams
link |
type of situation, right?
link |
I don't think so yet.
link |
Are they just basically throwing money at developers
link |
to show their support?
link |
Without a topology like you put it.
link |
Like a way to leverage that.
link |
Like to give developers actual money.
link |
They're still, it's an enterprise software company.
link |
And they make a bunch of money.
link |
They make a bunch of games.
link |
They're a big company.
link |
They sell products.
link |
I think part of it is they know there's opportunity
link |
to make money from GitHub.
link |
There's definitely a business there.
link |
You know, to sell to developers.
link |
Or to sell to people using development.
link |
I think there's part of that.
link |
I think part of it is also there's,
link |
they had definitely wanted to recognize
link |
that you need to value open source
link |
to get great developers.
link |
Which is an important concept that was emerging
link |
over the past 10 years.
link |
That, you know, at PyData.
link |
We were able to convince J.P. Morgan
link |
to support PyData because of that fact.
link |
That was where the money for them putting
link |
a couple hundred thousand into supporting PyData
link |
for several conferences was they want developers.
link |
And they realized that developers want
link |
to participate in open source.
link |
So enterprise software folks don't always understand
link |
how their software gets used.
link |
Having spent a lot of time on the floors
link |
at J.P. Morgan, at Shell, at ExxonMobil,
link |
you see, oh, these companies have large development teams.
link |
And then they're kind of dealing with
link |
what's being delivered to them.
link |
So I really feel kind of a privilege
link |
that I had a chance to learn from some of these people
link |
and see what they're doing.
link |
And even work alongside them, you know,
link |
as a consultant, using open source and trying to figure,
link |
how do we make this work inside of our large organization?
link |
Some of it is actually, for a large organization,
link |
some of it is messaging to the world
link |
that you care about developers
link |
and you're cool, you care.
link |
Like, for example, like if Ford,
link |
cause I talked to them, like car companies, right?
link |
They want to attract, you know,
link |
you want to take on Tesla and Autopilot.
link |
You want to take on, right?
link |
And so what do you do there?
link |
You show that you're cool.
link |
Like you try to show off that you care about developers
link |
and they have a lot of trouble doing that.
link |
And like one way, I think like Ford should have bought GitHub.
link |
They just to show off, like these old school companies
link |
and it's in a lot of different industries.
link |
There's probably different ways.
link |
It's probably an art to show that you care about developers.
link |
And the developers, it's exactly what you, like,
link |
for example, just spit balling here,
link |
but like Ford or somebody like that
link |
could give a hundred million dollars
link |
to the development of NumPy.
link |
And like literally look at like the top most popular projects
link |
in Python and just say, we're just going to give money.
link |
Like that's going to immediately make you cool.
link |
They could actually, yeah.
link |
And in fact, we set up NumFOCUS to make it easy.
link |
But the challenge was,
link |
is also you have to have some business development.
link |
Like it's a bit of a seeding problem, right?
link |
And you look at how,
link |
I've talked to the folks at Linux Foundation,
link |
know how they're doing it.
link |
I know how, and starting NumFOCUS,
link |
because we had two babies in 2012.
link |
One was Anaconda, one was NumFOCUS, right?
link |
And they were both important efforts.
link |
They had distinct journeys
link |
and super grateful that both existed
link |
and still grateful both exist.
link |
But there's different energies in getting donations
link |
as there is getting, this is important to my business.
link |
Like I'm selling you something that this is a,
link |
I'm going to make money this way.
link |
Like if you can tie it,
link |
if you can tie the message to an ROI for the company,
link |
it becomes a no-brainer.
link |
That's more effective.
link |
It's much more effective, right?
link |
So, and there are rational arguments to make.
link |
I've tried to have conversations with marketing,
link |
especially marketing departments.
link |
Like very early on, it was clear to me that,
link |
oh, you could just take a fraction of your marketing budget
link |
and just spend it on open source development.
link |
And you get better results from your marketing.
link |
How did those, can I, sorry,
link |
I'm going to try not to go on rants here.
link |
What have you learned from the interaction
link |
with the marketing folks on that kind of,
link |
because you gave a great example
link |
of something that will obviously be a much better investment
link |
in terms of marketing is supporting open source projects.
link |
The challenge is not dissimilar
link |
from the challenge you have in academia
link |
or the different colleges, right?
link |
Knowledge gets very specific and very channeled, right?
link |
And so people get,
link |
they get a lot of learning in the thing they know about.
link |
And it's hard then to bridge that
link |
and to get them to think differently enough
link |
to have a sense that you might have something to offer
link |
because it's different.
link |
It's like, well, how do I implement that?
link |
How do I, what do I do with that?
link |
Like, do I, which budget do I take from?
link |
Do I slow down my spend on Google ads
link |
or my spend on Facebook ads?
link |
Or do I not hire a content creator and say like,
link |
there's an operational aspect to that,
link |
that you have to be the CMO, right?
link |
Or the CEO, you have to get the right level.
link |
So you'll have to hire at a high position level
link |
where they care about this and this.
link |
Right, or they won't know how, right?
link |
And because you can also do it very clumsily, right?
link |
And I've seen it, cause you can,
link |
you absolutely have to honor and recognize
link |
the people you're going to and the fact
link |
that if you just throw money at them,
link |
it could actually create more problems.
link |
Can I just say, this is not you saying, can I just,
link |
cause I just need, I need to say this.
link |
I've been very surprised how often marketing people
link |
are terrible at marketing.
link |
I feel like the best marketing is doing something novel
link |
and unique that anticipates the future.
link |
It feels like so much of the marketing practice
link |
is like what they took in school,
link |
or maybe they're studying for what was the best thing
link |
that was done in the past decade,
link |
and they're just repeating that over and over,
link |
as opposed to innovating, like taking the risk.
link |
That's a great point.
link |
Is taking the big risk.
link |
That's a great point.
link |
And being the first one to risk.
link |
Yeah, there's an aspect of data observation
link |
from that risk, right?
link |
That's, I think, shared what they're doing already.
link |
But it absolutely, it's about, I think it's content.
link |
Like there's this whole world on content marketing
link |
that you could almost say, well, yeah, it can get over,
link |
you can get inundated with stuff
link |
that's not relevant to you.
link |
Whereas what you're saying would be highly relevant
link |
and highly useful and highly beneficial.
link |
Yeah, but it's risk.
link |
I mean, that's why I sort of,
link |
there's a lot of innovative ways of doing that.
link |
Tesla's an example of people
link |
that basically don't do marketing.
link |
They do marketing in a very, like,
link |
let's say Elon hired a person who's just good at Twitter
link |
for running Tesla's Twitter account.
link |
I mean, that's exactly what you wanna be doing.
link |
You want it to be constantly innovating in the.
link |
Right, there's an aspect of telling.
link |
I mean, I've definitely seen people doing great work
link |
where you're not talking about it.
link |
Like, I would say that's actually a problem
link |
I have right now with Quansight Labs.
link |
Quansight Labs has been doing amazing work,
link |
really excited about it,
link |
but we have not been talking about it enough.
link |
And there's different ways to talk about it.
link |
There's different ways to,
link |
there's different channels through which to communicate.
link |
There's also, like, I'll just throw some shade
link |
at companies I love.
link |
So for example, iRobot,
link |
I just had a conversation with them.
link |
They make Roombas.
link |
And I think I love, they're incredible robots,
link |
but like every time they do like advertisement,
link |
not advertisement, but like marketing type stuff,
link |
it just looks so corporate.
link |
And to me, the incredible,
link |
I may be wrong in the case of iRobot, I don't know.
link |
But to me, when you're talking about engineering systems,
link |
it's really nice to show off the magic of the engineering
link |
and the software and all the geniuses behind this product
link |
and the tinkering and like the raw authenticity
link |
of what it takes to build that system
link |
versus the marketing people who want to have like
link |
pretty people, like standing there all pretty
link |
with the robots, like moving perfectly.
link |
So to me, there's some aspect,
link |
it's like speaking to the hackers,
link |
you have to throw some bones,
link |
some care towards the engineers, the developers,
link |
because there's some aspect, one, for the hiring,
link |
but two, there's an authenticity to that,
link |
authenticity to that kind of communication
link |
that's really inspiring to the end user as well.
link |
Like if they know that brilliant people,
link |
the best in the world are working at your company,
link |
they start to believe that that product
link |
that you're creating is really good.
link |
It's interesting, because your initial reaction would be,
link |
wait, there's different users here.
link |
Why would you do that to, you know,
link |
my wife bought a Roomba, and she loves developers,
link |
she loves me, but she doesn't care about that culture.
link |
So essentially what you said is actually the authenticity,
link |
because everyone has a friend, everyone knows people,
link |
there's word of mouth, I mean, if you.
link |
Word of mouth is so, so powerful.
link |
Yeah, exactly, that's interesting.
link |
Because I think it's the lack of that realization,
link |
there's this halo effect that influences
link |
your general marketing, interesting.
link |
For some stupid reason, I do have a platform,
link |
and it seems that the reason I have a platform,
link |
many others like me, millions of others,
link |
is like the authenticity,
link |
and like we get excited naturally about stuff.
link |
And like, I don't want to get excited
link |
about that iRobot video,
link |
because it's boring, it's marketing, it's corporate,
link |
as opposed to, I wanted to do some fun,
link |
this is me, like a shout out to iRobot,
link |
is they're not letting me get into the robot.
link |
Yeah, well there's an aspect of,
link |
that could be benefiting from a culture of modularity,
link |
like add ons, and that could actually dramatically help.
link |
You've seen that over history,
link |
I mean, Apple is an example of a company like that,
link |
or the, like, I can see what your point is,
link |
is that you have something that needs to be,
link |
it needs to be adopted broadly,
link |
the concept needs to be adopted broadly.
link |
And if you want to go beyond this one device,
link |
you need to engage this community.
link |
Yeah, and connecting to the open source that you said.
link |
You're a programmer,
link |
one of the most impactful programmers ever.
link |
You've led many programmers, you lead many programmers.
link |
What are some, from a programmer perspective,
link |
what makes a good programmer?
link |
What makes a productive programmer?
link |
Is there advice you can give
link |
to be a great programmer in this world?
link |
That's a great, great question.
link |
And there are times in my life
link |
I'd probably answer this even better
link |
than the answer I'll hopefully give today.
link |
Because I thought about this numerous times,
link |
like right now I've spent so much time
link |
recently hiring salespeople that,
link |
That your mind is a little bit on something else.
link |
On something else.
link |
But I reflected on the past,
link |
and also, you know, I have some really,
link |
the only way I can do this,
link |
is I have some really great programmers that I work with,
link |
who lead the teams that they lead.
link |
And my goal is to inspire them and hopefully help them,
link |
encourage them, and be,
link |
help them encourage their teams.
link |
I would say there's a number of things, couple things.
link |
Like you, I think a programmer without curiosity
link |
Like you'll lose interest, you won't do your best work.
link |
So it's sort of, it's an affect.
link |
It's sort of, are you,
link |
you have some curiosity about things.
link |
I think two, don't try to do everything at once.
link |
Recognize that you're, you know, we're limited as humans.
link |
You're limited as a human.
link |
And each one of us are limited in different ways.
link |
You know, we all have our different strengths and skills.
link |
So it's adapting the art of programming to your skills.
link |
One of the things that always works,
link |
is to limit what you're trying to solve.
link |
Right, so, if you're part of a team,
link |
usually maybe somebody else has put the architecture together
link |
and they've given a portion to you, if you're young.
link |
If you're not part of a team,
link |
it's sort of breaking down the problem into smaller parts,
link |
is essential for you to make progress.
link |
It's very easy to take on a big project
link |
and try to do it all at once, and you get lost.
link |
And then you do it badly.
link |
And so thinking about, you know,
link |
very concretely what you're doing,
link |
defining the inputs and outputs,
link |
defining what you want to get done.
link |
Even just talking about that and like writing down
link |
before you write code, just what are you trying to accomplish?
link |
I mean, being very specific about it really, really helps.
link |
I think using other people's work, right?
link |
Don't be afraid that somehow you're,
link |
like you should do it all.
link |
Like, nobody does.
link |
Stand on the shoulders of giants.
link |
And copy and paste from Stack Overflow.
link |
Copy and paste from Stack Overflow.
link |
But don't just copy and paste,
link |
this is particularly relevant in the era of Codex
link |
and the auto generated code, which is essentially,
link |
I see as an indexing of Stack Overflow.
link |
Secondly, it's like.
link |
It's a search engine.
link |
It's a search engine over Stack Overflow, basically.
link |
So it's not, I mean, we've had this for a while.
link |
But really, you want to cut and paste, but not blindly.
link |
Like, absolutely I've cut and paste to understand,
link |
but then you understand.
link |
Oh, this is what this means.
link |
Oh, this is what it's doing.
link |
And understand as much as you can.
link |
So it's critical, that's where the curiosity comes in.
link |
If you're just blindly cutting and pasting,
link |
you're not gonna understand.
link |
So understand, and then be sensitive to hype cycles.
link |
Right, every so often there's always a,
link |
oh, test driven development is the answer.
link |
Oh, object oriented is the answer.
link |
Oh, there's always an answer.
link |
Agile is the answer.
link |
Be cautious of jumping onto a hype cycle.
link |
Like, likely there's signal.
link |
Like, there's a thing there
link |
that's actually valuable, you can learn from.
link |
But it's almost certainly not the answer
link |
to everything you need.
link |
What lessons do you draw
link |
from you having created NumPy and SciPy?
link |
Like, in service of sort of answering the question
link |
of what it takes to be a great programmer
link |
and giving advice to people.
link |
How can you be the next person to create a SciPy?
link |
Yeah, so one is listen.
link |
To people that have a problem, right?
link |
Which is everybody, right?
link |
But listen, and listen to many.
link |
And then try to, and then do.
link |
Like, you're gonna have to do an experiment, you know?
link |
Do, fall down, don't be afraid to fall down.
link |
Don't be afraid, the first thing you do
link |
is probably gonna suck, and that's okay, right?
link |
It's honestly, I think iteration is the key to innovation.
link |
And it's almost that psychological hesitation we have
link |
Like, yeah, we know it's not great,
link |
but next time it'll be better.
link |
I mean, just keep learning and keep improving.
link |
So it's an attitude.
link |
And then it does take intense concentration, right?
link |
Good things don't happen just,
link |
it's not quite like TikTok or like Facebook, you know?
link |
You can't scroll your way to good programming, right?
link |
There are sincere hours of deep,
link |
don't be afraid of the deep problem.
link |
Like, often people will run away from something
link |
because, oh, I can't solve this.
link |
And you might be right, but give it an hour.
link |
Give it a couple of hours and see.
link |
And just five minutes, not gonna give you that.
link |
Was it lonely when you were building SciPy and NumPy?
link |
Hugely, yeah, absolutely lonely,
link |
in the sense of you had to have an inner drive,
link |
and that inner drive for me always comes from,
link |
I have to see that this is right in some angle.
link |
I have to believe it, that this is the right approach,
link |
the right thing to do.
link |
With SciPy, it was like, oh yeah,
link |
the world needs libraries in Python.
link |
Clearly Python's popular enough
link |
with enough influential people to start,
link |
and it needs more libraries.
link |
So that is a good in and of itself.
link |
So I'm gonna go do that good.
link |
So find a good, find a thing that you know is good
link |
and just work on it.
link |
So that has to happen, and it is.
link |
And you kind of have to have enough realization
link |
of your mission to be okay with the naysayer
link |
or the fact that not everybody joins you up front.
link |
In fact, one thing I've talked to people a lot,
link |
I've seen a lot of projects come, and some fail.
link |
Not everything I've done has actually worked perfectly.
link |
I've tried a bunch of stuff that, okay,
link |
that didn't really work, or this isn't working, and why.
link |
But you see the patterns, and one of the key things is
link |
you can't even know for six months.
link |
I say 18 months right now.
link |
If you're starting a new project,
link |
you gotta give it a good 18 month run
link |
before you even know if the feedback's there.
link |
You're not gonna know in six months.
link |
You might have the perfect thing,
link |
but six months from now, it's still kind of still emerging.
link |
So give it time, because you're dealing with humans,
link |
and humans have an inertial energy
link |
that just doesn't change that quickly, so.
link |
Let me ask a silly question, but like you said,
link |
you're focused on the sales side of things currently,
link |
but back when you were actively programming,
link |
maybe in the 90s, you talked about IDEs.
link |
What's a setup that you have that brings you joy?
link |
Keyboard, number of screens, Linux.
link |
I do still like to program some.
link |
It's not as much as I used to.
link |
I have two projects I'm super interested in,
link |
trying to find funding for them,
link |
trying to figure out teams for them,
link |
but I could talk about those.
link |
But what I, yeah, I'm an Emacs guy.
link |
Great, thank the superior editor, everybody.
link |
I've got, I don't often delete tweets,
link |
but one of the tweets I deleted
link |
was when I said Emacs was better than Vim,
link |
and then the hate I got from it.
link |
I was like, I'm walking away from this.
link |
I do too, I don't push it.
link |
I'm just joking, of course.
link |
Yeah, exactly, it's kind of like,
link |
but people do take the editor seriously, right?
link |
I did it as a joke.
link |
It is, but there's something beautiful to me about Emacs,
link |
but for people that love Vim,
link |
there's something beautiful to them about that.
link |
I mean, I do use Vim for quick editing.
link |
Like at the command line, if I need quick editing,
link |
I will still sometimes use it, but not much.
link |
Like it's a simple correction, a single-character edit.
link |
So when you were developing SciPy, you were using Emacs?
link |
SciPy and NumPy were all written in Emacs on a Linux box.
link |
And CVS and then SVN, version control.
link |
Like Git has, I love distributed branch stuff.
link |
I think Git is pretty complicated, but I love the concept.
link |
And also, of course, GitHub and then GitLab
link |
make Git definitely consumable, but that came later.
link |
Did you ever touch Lisp at all?
link |
Like what were your emotional feelings
link |
about all the parentheses?
link |
Yeah, so great question.
link |
So I find myself appreciating Lisp today
link |
much more than I did early.
link |
Because when I came to programming, I knew programming,
link |
but I was a domain expert, right?
link |
And to me, the parentheses were in the way.
link |
It's like, wow, there's just all this,
link |
like it just gets in the way of my thinking
link |
about what I'm doing.
link |
So why would I have all these, right?
link |
That was my initial reaction to it.
link |
And now as I appreciate kind of the structure
link |
that kind of naturally maps to a logical thinking
link |
about a program, I can appreciate them, right?
link |
And why it's actually, you could create editors
link |
that make it not so problematic, right, honestly.
link |
So I actually have a much more appreciation of Lisp
link |
and things like Clojure, and there's Hy,
link |
which is a Python Lisp that compiles to Python bytecode.
link |
I think it's challenging.
link |
Like typically these languages are,
link |
I even saw the whole data science programming system
link |
in Lisp that somebody created, which is cool.
link |
But again, I think it's the lack of recognition
link |
of the fact that there exists
link |
what I call occasional programmers.
link |
People that are never gonna be programmers for a living.
link |
They don't want to have all this cuteness in their head.
link |
They want just, it's why BASIC, you know,
link |
Microsoft had the right idea with BASIC
link |
in terms of having that be the language of Visual Basic,
link |
the language of Excel and SQL Server.
link |
They should have converted that to Python 10 years ago.
link |
Like the world would be a better place if they had, but.
link |
There's also, there's a beauty and a magic
link |
to the history behind a language in Lisp.
link |
You know, some of the most interesting people
link |
in the history of computer science
link |
and artificial intelligence have used Lisp.
link |
Well, especially that language,
link |
when you have a language, you can think in it.
link |
And it helps you think better.
link |
And it attracts certain kinds of people
link |
that think in a certain kind of way.
link |
And then that's there.
link |
Okay, so what about like small laptop with a tiny keyboard,
link |
or is there like three screens?
link |
You know, good question.
link |
I've never gotten into the big, many screens to be honest.
link |
I mean, and maybe it's because in my head,
link |
I kind of just, I just swap between windows.
link |
Like, partly because I guess I really can't process
link |
three screens at once anyway.
link |
Like, I just am looking at one and I just flip.
link |
You know, I flip an application open.
link |
So where it's really helpful is actually
link |
when I'm trying to do, you know,
link |
here's data and I want to input it from here.
link |
Like this is the only time I really need another screen.
link |
So now, because you're both a developer and you lead developers,
link |
but then there's also these businesses
link |
and there's salespeople and you're working
link |
with large companies.
link |
Operations people, hiring people, yeah.
link |
Which operating system is your favorite at this point?
link |
So Linux was the early days.
link |
So yeah, I love Linux as a server side.
link |
And it was early days I had my own Linux desktop.
link |
I've been on Mac laptops for 10 years now.
link |
Yeah, this is what leadership looks like.
link |
As you switch to Mac.
link |
Pretty much, I mean, just the fact that I had
link |
to do PowerPoints, I had to do presentations
link |
and you know, plug in, I just couldn't mess
link |
with plugging in laptops, it wouldn't project and yeah.
link |
So you mentioned also Quansight Labs and things like that.
link |
Can you give advice on how to hire great programmers?
link |
Yeah, I would say, produce an open source project,
link |
get people contributing to it and hire those people.
link |
Yeah, I mean, you're doing it sort of,
link |
you may be perhaps a little biased,
link |
but that's probably 100% really good advice.
link |
I find it hard to hire.
link |
I still find it hard to hire, like in terms of,
link |
I think that it's not hard to hire
link |
if I've worked with somebody for a couple of weeks,
link |
but an hour or two of interviews, I have no idea.
link |
So that instinct, that radar of knowing if you're good
link |
or not, that you've found that you're still not able to.
link |
It's really hard, I mean, the resume can help,
link |
but again, the resume is like a presentation
link |
of the things they want you to see, not the reality of,
link |
and there's also, you have to understand
link |
what you're hiring for.
link |
There are different stages and different kinds of skills.
link |
And so it isn't just, one of the things I talk a lot about
link |
internally at my company is just that the whole idea
link |
of measuring ourselves against a single axis is flawed
link |
because we're not, it's a multidimensional space
link |
and how do you order a multidimensional space?
link |
There isn't one ordering.
link |
So this whole idea, you immediately get projected
link |
into a thing when you're talking about hiring
link |
or best or worst or better or not better.
link |
So what is the thing you're actually needing?
link |
And you can hire for that.
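The point about ordering a multidimensional space can be made concrete with a small sketch. The candidate names, the two axes, and the scores below are made up purely for illustration; nothing here comes from how any company actually evaluates people. It shows two things: some pairs are simply incomparable until you pick an axis, and any single ranking depends on the weights used to project onto one number.

```python
from itertools import combinations

# Hypothetical candidates scored on two made-up axes (0-10).
candidates = {
    "A": {"open_source": 9, "product": 3},
    "B": {"open_source": 4, "product": 8},
    "C": {"open_source": 3, "product": 2},
}


def dominates(x, y):
    """True if x is at least as good as y on every axis and strictly better on one."""
    return all(x[k] >= y[k] for k in x) and any(x[k] > y[k] for k in x)


def rank(weights):
    """Project every candidate onto one number using the weights, then sort best-first."""
    def score(name):
        return sum(weights[k] * v for k, v in candidates[name].items())
    return sorted(candidates, key=score, reverse=True)


# Pairs where neither candidate dominates the other: no axis-free ordering exists.
incomparable = [
    (a, b)
    for a, b in combinations(candidates, 2)
    if not dominates(candidates[a], candidates[b])
    and not dominates(candidates[b], candidates[a])
]

open_source_heavy = rank({"open_source": 0.8, "product": 0.2})
product_heavy = rank({"open_source": 0.2, "product": 0.8})

print("incomparable pairs:", incomparable)
print("open-source-heavy ranking:", open_source_heavy)
print("product-heavy ranking:", product_heavy)
```

Changing the weights swaps who comes first, which is the sense in which "best" only means something once you have said what you actually need.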
link |
There is such a thing, generally, I really value people
link |
who have the affect, that care about open source.
link |
Like so in some cases, their affinity to open source
link |
is simply kind of a filter of an affect.
link |
However, I have found this interesting dichotomy
link |
between open source contributors and product creation.
link |
There's, I don't know if it's fully true,
link |
but there does seem to be the more experienced,
link |
the more affect somebody has in an open source community,
link |
the less ability to actually produce product that they have.
link |
And the opposite is kind of true too.
link |
The more product focused they are, I find a lot of people,
link |
I've talked to a lot of people who produce
link |
really great products and they have a,
link |
they're looking over the open source communities,
link |
kind of wanting to participate and play,
link |
but they've played here and they do a great job here
link |
and then they don't necessarily have some of the same.
link |
Now I don't think that's entirely necessary.
link |
I think part of it is cultural, how they've emerged.
link |
Because one of the things that open source communities
link |
often lack is great product management,
link |
like some product management energy.
link |
That's brilliant, but you want both of those energies
link |
in the same place together.
link |
Yes, you really do.
link |
And so a lot of it's creating these teams of people
link |
that have these needed skills and attributes.
And so one of the big things I look for is somebody
that fundamentally recognizes their need to learn.
Like one of the values that we have
in all of the things we do is learning.
Like if somebody thinks they know it all,
they're gonna struggle.
And some of that is just, there's more basic things
like humility, just being humble in the face
of all the things you don't know.
And that's step one of learning.
That's step one of learning, right?
And I've spent a lot of time learning, right?
Other people spend a lot more time,
but I've spent a lot of time learning.
My whole goal was to get a PhD because I love school
and I wanted to be a scientist.
And then what I found is what's been written about
elsewhere as well is the more I learned,
the more I didn't know.
The more I realized, man, I know about this,
but this is such a tiny thing in the global scope
of what I might wanna know about.
So I need to be listening a whole lot better
than I am just talking.
That's changed a little bit actually.
My wife says that I used to be a better listener.
Now that I'm so full of all these ideas I wanna do,
she kind of says, you gotta give people time to talk.
So you've succeeded on multiple dimensions.
So one is the tenure track faculty.
The other is just creating all these products
and building up the businesses,
then working with businesses.
Do you have advice for young people today
in high school and college of how to live a life
as nonlinear and as successful as yours,
a life that they could be proud of?
Well, that's a super compliment.
I'm humbled by that actually.
I would say a life they can be proud of.
Honestly, one thing that I've said to people is first,
find people you love and care about them.
Like family matters to me a lot.
And family means people you love and have committed to.
So it can be whatever you mean by that,
but you need to have a foundation.
So find people you love and wanna commit to and do that.
'Cause it anchors you in a way that nothing else can.
And then you find other things.
And then kind of from out there,
you find other kinds of things you can commit to,
whether it's ideas or people or groups of people.
So, especially in high school,
I would say don't settle on what you think you know.
Like give yourself 10 years to think about the world.
Like I see a lot of high school students
who seem to know everything already.
I think I did too.
I think it's maybe natural,
but recognize that the things you care about,
you might change your perspective over time.
I certainly have over time.
I was really passionate about one specific thing
and I was kind of softened.
I was a big, I didn't like the Federal Reserve, right?
And there's still, we could have a longer conversation
about monetary policy and finances,
but I'm a little more nuanced in my perspective.
But that's one area where you learn about something,
go, ah, I wanna attack it.
Build, don't destroy.
Build, like so often the tendency is to not like something
and wanna go attack it.
Build something, build something to replace it.
Build up, attract people to your new thing.
You'll be far better, right?
You don't need to destroy something to build something else.
So that's, I guess, generally.
And then definitely like curiosity,
follow your curiosity and let it,
don't just follow the money.
And all of that, like you said,
is grounded in family, friendship, and ultimately love.
Which is a great way to end it.
Travis, you're one of the most impactful people
in the engineering and the computer science
in the human world.
So I truly appreciate everything you've done.
And I really appreciate that you would spend
your valuable time with me.
It was a real pleasure for me.
I appreciate that.
Thanks for listening to this conversation
with Travis Oliphant.
To support this podcast,
please check out our sponsors in the description.
And now, let me leave you with something
that in the programming world is called Hodgson's Law.
Every sufficiently advanced Lisp application
will eventually be reimplemented in Python.
Thank you for listening and hope to see you next time.