Jeffrey Shainline: Neuromorphic Computing and Optoelectronic Intelligence | Lex Fridman Podcast #225
The following is a conversation with Jeff Shainline, a scientist at NIST interested in optoelectronic intelligence. We have a deep technical dive into computing hardware that will make Jim Keller proud. I urge you to hop onto this rollercoaster ride through neuromorphic computing and superconducting electronics and hold on for dear life. Jeff is a great communicator of technical information, and so it was truly a pleasure to talk to him about some physics and engineering. To support this podcast, please check out our sponsors in the description. This is the Lex Fridman Podcast, and here is my conversation with Jeff Shainline.
I got a chance to read a fascinating paper you authored called Optoelectronic Intelligence. So maybe we can start by talking about this paper and start with the basic questions. What is optoelectronic intelligence?
Yeah, so in that paper, the concept I was trying to describe is sort of an architecture for building brain-inspired computing that leverages light for communication in conjunction with electronic circuits for computation. In that particular paper, and in a lot of the work we're doing right now in our project at NIST, the focus is on superconducting electronics for computation. I won't go into why that is just yet, but it might make a little more sense in context if we first describe what that stands in contrast to, which is semiconducting electronics. So is it worth taking a couple minutes to describe semiconducting electronics? It might even be worthwhile to step back and talk about electricity and circuits and how circuits work before we talk about superconductivity.
How does a computer work, Jeff?
Well, I won't go into everything that makes a computer work, but let's talk about the basic building blocks: a transistor, and even more basic than that, a semiconductor material, silicon, say. Silicon is a semiconductor, and what that means is that at low temperature there are no free charges, no free electrons that can move around. When you talk about electricity, you're talking predominantly about electrons moving to establish electrical currents, and they move under the influence of voltages. So you apply voltages, electrons move around, those can be measured as currents, and you can represent information that way.
Semiconductors are special in the sense that they are really malleable. If you have a semiconductor material, you can change the number of free electrons that can move around by putting different elements, different atoms, in lattice sites. So what is a lattice site? Well, a semiconductor is a crystal, which means all the atoms that comprise the material are at exact locations that are perfectly periodic in space. If you start at any one atom and you go along what are called the lattice vectors, you get to another atom, and another atom, and another atom. For high-quality devices it's important that it's a perfect crystal with very few defects, but you can intentionally replace a silicon atom with, say, a phosphorus atom, and then you can change the number of free electrons in a region of space that has that excess of what are called dopants.
So picture a device that has a left terminal and a right terminal, and if you apply a voltage between those two, you can cause electrical current to flow between them. Now we add a third terminal up on top, and depending on the voltage between the left and right terminals and that third voltage, you can change that current. What's commonly done in digital electronic circuits is to leave a fixed voltage from left to right and then change the voltage applied at what's called the gate, the gate of the transistor. What you do is make it so that there's an excess of electrons on the left, an excess of electrons on the right, and very few electrons in the middle, and you do this by changing the concentration of different dopants in the lattice spatially. Then, when you apply a voltage to that gate, you can either cause current to flow or turn it off, and that's your zero and one. If you apply a voltage, current can flow, and that current represents a digital one. From that basic element, you can build up all the complexity of digital electronic circuits that have had a really profound influence on our society.
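The on/off behavior described here is, at the logic level, just a voltage-controlled switch. A minimal sketch of how such switches compose into a logic gate (an idealized model with illustrative threshold and supply values, not real device physics):

```python
def nmos_conducts(gate_v, threshold_v=0.7):
    """Idealized n-channel transistor: the channel conducts when the
    gate voltage exceeds a threshold (values are illustrative)."""
    return gate_v > threshold_v

def nand_gate(a_v, b_v, vdd=1.0):
    """NAND from two series switches: the output is pulled low only
    when both transistors conduct; otherwise it sits at the supply."""
    pulled_low = nmos_conducts(a_v) and nmos_conducts(b_v)
    return 0.0 if pulled_low else vdd

# Truth table: the output is low only when both inputs are high.
for a, b in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]:
    print(a, b, "->", nand_gate(a, b))
```

Since NAND is a universal gate, this one idealized switch model is in principle enough to build up all the complexity of digital logic Jeff describes.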
Now, you're talking about electrons. Can you give a sense of the scale we're talking about when silicon is used to mass-manufacture these kinds of gates?
Yeah, so scale in a number of different senses. At the scale of the silicon lattice, the distance between two atoms is about half a nanometer. People often like to compare these things to the width of a human hair; I think it's some six orders of magnitude smaller than the width of a human hair, something on that order. So remarkably small; we're talking about individual atoms here, and electrons are of that length scale when they're in that environment. But there's another sense in which scale matters in digital electronics, and this is perhaps the more important sense, although they're related.
Scale refers to a number of things. It refers to the size of that transistor. For example, I said you have a left contact, a right contact, and some space between them where the gate electrode sits; that's called the channel length. And what has enabled what we think of as Moore's law, the continued increase in performance of silicon microelectronic circuits, is the ability to make that feature size ever smaller, at a really remarkable pace. That feature size has decreased consistently every couple of years since the 1960s. That's what Moore predicted in the 1960s; he thought it would continue for at least two more decades, and it's been much longer than that. And that is why we've been able to fit ever more devices, ever more transistors, ever more computational power on essentially the same size of chip.
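The compounding behind that claim is easy to check. A back-of-the-envelope sketch of Moore's law, doubling the transistor count roughly every two years (the starting point is a rough figure for an early-1970s microprocessor, used here only for illustration):

```python
# Doubling every ~2 years, starting from a rough early-1970s count.
count_1971 = 2_300            # approximate transistor count, Intel 4004 era
years = 2021 - 1971
doublings = years / 2         # 25 doublings over 50 years
count_2021 = count_1971 * 2 ** doublings
print(f"predicted 2021-era count: {count_2021:.1e}")
```

That lands in the tens of billions of transistors, which is indeed the scale of large chips today.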
So a user sits back and does essentially nothing. You're running the same computer program, but those devices are getting smaller, so they get faster, they get more energy efficient, and all of our computing performance just continues to improve. We don't have to think too hard about what we're doing as, say, a software designer or something like that. I absolutely don't mean to say that there's no innovation in software or on the user side of things, of course there is. But from the hardware perspective, we've just been given this gift of continued performance improvement through this scaling, that is, ever smaller feature sizes with very similar power consumption. That power consumption has not continued to scale in the most recent decades, but nevertheless, we had a really good run there for a while.
And now we're down to gates that are seven nanometers, which is state of the art right now; maybe GlobalFoundries is trying to push it even lower than that. I can't keep up with the predictions of where it's gonna end. But a seven-nanometer transistor has just a few tens of atoms along the length of the conduction pathway. So a naive semiconductor device physicist would think you can't go much further than that without some kind of revolution in the way we think about the physics of our devices.
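That "few tens of atoms" figure is a simple division: channel length over interatomic distance. A quick check, using the silicon nearest-neighbor distance of about 0.235 nm (the exact count depends on crystal orientation):

```python
channel_length_nm = 7.0     # the state-of-the-art feature size discussed
si_neighbor_nm = 0.235      # Si nearest-neighbor distance
atoms_along_channel = channel_length_nm / si_neighbor_nm
print(round(atoms_along_channel), "atoms, roughly")
```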
Is there something to be said about the mass manufacture of these devices?
Right, right, so that's another thing. How have we been able to make those transistors smaller and smaller? Well, companies like Intel and GlobalFoundries invest a lot of money in the lithography. So how are these chips actually made? One of the most important steps is what's called ion implantation. You start with a pristine silicon crystal, and then, using photolithography, which is a technique where you can pattern different shapes using light, you define which regions of space you're going to implant with different species of ions that will change the local electrical properties right there. By using ever shorter wavelengths of light, and different kinds of optical and lithographic techniques, things that go far beyond my knowledge base, you can simply shrink that feature size down. And you say you're at seven nanometers; well, the wavelength of light being used is over a hundred nanometers, already deep in the UV. So how are those minute features patterned? There's an extraordinary amount of innovation that has gone into that, but nevertheless, it has stayed very consistent with this ever-shrinking feature size.
And now the question is, can you make it smaller? And even if you do, do you still continue to get performance improvements? But there's another kind of scaling where these companies have excelled. So, okay, picture a chip that has a processor on it. That chip is not made as a chip; it's made on a wafer. Using photolithography, you basically print the same pattern on different dies all across the wafer, in multiple layers, tens, probably a hundred-some layers in a mature foundry process. And you do this on ever bigger wafers, too; that's another aspect of scaling that's occurred over the last several decades. So now you have this 300-millimeter wafer; it's as big as a pizza, and it has maybe a thousand processors on it. Then you dice that up using a saw, and now you can sell these things cheap, because the manufacturing process was so streamlined.
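The "maybe a thousand processors" figure is consistent with a simple area estimate. A sketch, assuming a 70 mm² die (an illustrative size; real layouts also lose partial dies at the wafer edge):

```python
import math

wafer_diameter_mm = 300.0
die_area_mm2 = 70.0  # assumed die size, for illustration only

wafer_area_mm2 = math.pi * (wafer_diameter_mm / 2) ** 2
dies_per_wafer = int(wafer_area_mm2 / die_area_mm2)  # ignores edge loss
print(dies_per_wafer, "dies per wafer, roughly")
```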
I think a technology as revolutionary as silicon microelectronics has to have that kind of manufacturing scalability, which, I will just emphasize, I believe is enabled by physics. Of course there's human ingenuity that goes into it, but at least from where I sit, it sure looks like the physics of our universe allows us to produce that. We've discovered it more so than we've invented it; although of course humans have invented it, it's almost as if it was there waiting for us to discover.
You mean the entirety of it, or are you specifically talking about the techniques of photolithography, like the optics involved?

I mean the entirety of the scaling down to seven nanometers: you're able to have electrons not interfere with each other in such a way that you can still have gates. That's enabled. To achieve that scale, spatial and temporal, seems to be very special, and it's enabled by the physics of our world.

All of the things you just said.
So, starting with the silicon material itself: silicon is a unique semiconductor. It has essentially ideal properties for making a specific kind of transistor that's extraordinarily useful. When you make a transistor, you have this gate contact that sits on top of the conduction channel, and depending on the voltage you apply there, you pull more carriers into the conduction channel or push them away, so it becomes more or less conductive. In order to have that work without just sucking those carriers right into that contact, you need a very thin insulator. And part of scaling has been to gradually decrease the thickness of that gate insulator so that you can use a roughly similar voltage and still have the same current-voltage characteristics.
The material that was initially used to do that is silicon dioxide, which just naturally grows on the silicon surface. You expose silicon to the atmosphere that we breathe, and, well, if you're manufacturing you're gonna purify these gases, but what's called a native oxide will grow there. There are essentially no other materials on the entire periodic table that make as good a gate insulator as that silicon dioxide, and that has to do with nothing but the physics of the interaction between silicon and oxygen. If it wasn't that way, transistors could not perform with nearly the capability that they have. That has to do with the way the oxide grows, the reduced density of defects there, and its insulation, meaning essentially its energy gap: you can apply a very large voltage there without having current leak through it. So that's physics right there.
There are other things, too. Silicon is a semiconductor in an elemental sense: you only need silicon atoms. For a lot of other semiconductors you need two different kinds of atoms, like an element from group three and an element from group five, which form a compound. That opens you up to lots of defects that can occur where one atom's not sitting quite at its lattice site, or it's switched with another one, and that degrades performance.
But then also, on the side you mentioned with the manufacturing: we have access to light sources that can produce these very short wavelengths of light. How does photolithography occur? Well, you actually put this polymer on top of your wafer and you expose it to light, and then you use aqueous chemical processing to dissolve away the regions that were exposed to light and leave the regions that were not. And we are blessed with polymers that have the right property, where they can undergo scission events, where the polymer splits where a photon hits. Maybe that's not too surprising, but I don't know, it all comes together to form this really complex, manufacturable ecosystem where very sophisticated technologies can be devised, and it works quite well.
And amazingly, like you said, with a wavelength at like a hundred nanometers or something like that, you're still able to achieve on this polymer a precision of, whatever we said, seven nanometers. I think I've heard like four nanometers being talked about, something like that.
If we could just pause on this, and we'll return to superconductivity, but in this whole journey, from a history perspective, what do you think is most beautiful at the intersection of engineering and physics in this whole process we talked about, with silicon and photolithography, the things people were able to achieve in order to push Moore's law forward? Is it the early days, the invention of the transistor itself? Is it some particular cool little thing that maybe not many people know about? What do you think is most beautiful in this whole process, this journey?
The most beautiful is a little difficult to answer. Let me try and sidestep it a little bit and just say what strikes me about looking at the history of silicon microelectronics. When quantum mechanics was developed, people quickly began applying it to semiconductors, and it was broadly understood that these are fascinating systems; people cared about them for their basic physics, but also for their utility as devices. Then the transistor was invented in the late forties in a relatively crude experimental setup, where you just crammed a metal electrode into the semiconductor, and that was ingenious; these people were able to make it work. But what I wanna get to, what really strikes me, is that in those early days there were a number of different semiconductors being considered. They had different properties, different strengths, different weaknesses. Most people thought germanium was the way to go; it had some nice properties related to how the electrons move inside the lattice. But other people thought that compound semiconductors, with group three and group five, also had really extraordinary properties that might be conducive to making the best devices. So there were different groups exploring each of these, and that's great; that's how science works. You have to cast a broad net.
But then what I find striking is: why is it that silicon won? Because it's not that germanium is a useless material absent from technology, and the same goes for compound semiconductors; they're both doing exciting and important things, in slightly more niche applications, whereas silicon is the semiconductor material for microelectronics, which is the platform for digital computing, which has transformed our world. Why did silicon win? It's because of a remarkable assemblage of qualities. No one of them was the clear winner, but it struck a compromise between a number of different influences. It had that really excellent gate oxide that allowed us to make MOSFETs, these high-performance transistors, quickly and cheaply and easily, without having to do a lot of materials development.
And then there's the band gap of silicon. In a semiconductor there's an important parameter called the band gap. Electrons fill states up to one level in the energy diagram, then there's a gap, a range of energies electrons aren't allowed to have, and then there's another level above that. The difference between the lower, filled level and the unoccupied level tells you how much voltage you have to apply in order to induce a current to flow. With germanium, that's about 0.67 electron volts, meaning you have to apply roughly 0.67 volts to get a current moving. And it turns out that if you compare that to the thermal excitations induced just by the temperature of our environment, that gap's not quite big enough. You start to use it to perform computations, it gets a little hot, and you get all these accidental carriers excited into the conduction band, and that causes errors in your computation. Silicon's band gap is a little higher, 1.1 electron volts, and the number of thermally excited carriers that can induce those errors depends exponentially on the gap; it decays exponentially as the gap grows. So just that slight extra energy in the band gap puts silicon in an ideal position to be operated in the conditions of our ambient environment.
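The exponential advantage Jeff describes can be made concrete with a Boltzmann factor: the density of thermally excited carriers scales roughly as exp(-Eg / 2kT). A rough sketch comparing germanium and silicon at room temperature (prefactors that differ between the two materials are ignored):

```python
import math

K_B_EV_PER_K = 8.617e-5   # Boltzmann constant, eV/K
T_ROOM_K = 300.0

def thermal_excitation(gap_ev):
    """Relative density of thermally excited carriers ~ exp(-Eg/2kT)."""
    return math.exp(-gap_ev / (2 * K_B_EV_PER_K * T_ROOM_K))

ge = thermal_excitation(0.67)   # germanium band gap, eV
si = thermal_excitation(1.12)   # silicon band gap, eV
print(f"Ge has ~{ge / si:.0e}x more thermal carriers than Si")
```

The ratio comes out in the thousands, which is why germanium devices misbehave as they warm up while silicon devices keep working.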
It's kind of fascinating that, like you mentioned, errors decrease exponentially with that gap. It's funny, because this error thing comes up when you start talking about quantum computing. And it's kind of amazing that everything we've been talking about, the errors, as we scale down, seems to stay extremely low. All of our computation is based on the assumption that the error rate is extremely low.
Yes, well, it's digital computation.

Digital, sorry, digital computation. As opposed to the biological computation in our brain, where the assumption is that stuff is gonna fail all over the place and we somehow have to still be robust to that.

That's exactly right.
So this is gonna be the most controversial part of our conversation, where you're gonna make some enemies, because we've been talking about physics and engineering. Which group of people is smarter and more important for this one? Let me ask the question in a better way. Some of the big innovations, some of the beautiful things we've been talking about: how much of it is physics? How much of it is engineering? My dad is a physicist, and he talks down to all the amazing engineering we're doing in artificial intelligence and computer science and robotics and all that space. So we argue about this all the time. So what do you think? Who gets more credit?
I'm genuinely not trying to just be politically correct here. I don't see how you would have any of what we consider the great accomplishments of society without both; you absolutely need both of those things. Physics tends to play a key role earlier in the development, and then engineering and optimization take over. I mean, the invention of the transistor, or actually, even before that, the understanding of semiconductor physics that allowed the invention of the transistor, that's all physics. If you didn't have that physics, you don't even get on the field. But once you have understood and demonstrated that something is in principle possible, it's more so engineering. Why we have computers more powerful than old supercomputers in each of our phones, that's all engineering, and I think I would be quite foolish to say that that's not valuable, that it's not a great contribution.
It's a beautiful dance. Would you put silicon, the understanding of the material properties, in the space of engineering? How does that whole process work, coming to understand that it has all these nice properties? Or even the development of photolithography, would you put that in the category of engineering?

No, I would say that it is basic physics, it is applied physics, it's materials science, it's X-ray crystallography, it's polymer chemistry.

Chemistry even is thrown in there?

Absolutely, yes, absolutely. We can get to biology.

Or the biology is in the humans that are engineering the system, so it's all deeply integrated.
Okay, so let's return. You mentioned this word, superconductivity. What does that have to do with what we're talking about?
Right, okay. So in a semiconductor, as I tried to describe a second ago, you can induce currents by applying voltages, and those currents have the typical properties you would expect from some kind of conductor. The electrons don't just flow perfectly without dissipation: if an electron collides with an imperfection in the lattice, or with another electron, it's gonna slow down, it's gonna lose its momentum. So you have to keep applying that voltage in order to keep the current flowing. In a superconductor, something different happens. If you get a current to start flowing, it will continue to flow indefinitely. There's no dissipation.
How does that happen?

Well, it happens at low temperature, and this is crucial: it has to be quite low temperature. And for essentially all of our conversation, I'm gonna be talking about conventional superconductors, sometimes called low-Tc superconductors, low-critical-temperature superconductors. Those materials have to be at a temperature around, say, four Kelvin. Their critical temperature might be ten Kelvin, something like that, but you wanna operate them at around four Kelvin, four degrees above absolute zero. What happens at very low temperatures in certain materials is that the noise of atoms moving around, the lattice vibrating, electrons colliding with each other, becomes sufficiently low that the electrons can settle into a very special state. It's sometimes referred to as a macroscopic quantum state, because if I had a piece of superconducting material here, let's say niobium, a very typical superconductor, and we cooled it below its critical temperature, all of the electrons in that superconducting state would be in one coherent quantum state. The wave function of that state is described in terms of all of the particles simultaneously, but it extends across macroscopic dimensions, the size of whatever block of the material I have sitting here.
The way this occurs, trying to be a little light on the technical details, is that the electrons coordinate with each other. In this macroscopic quantum state, one can quickly take the place of another; you can't tell electrons apart, they're what's known as identical particles. So if this electron runs into a defect that would otherwise cause it to scatter, it can almost miraculously avoid that defect, because it's not really in that location; it's part of a macroscopic quantum state, and the entire quantum state is not scattered by that defect. So you can get a current that flows without dissipation, and that's called a supercurrent. That's very much just scratching the surface of superconductivity; there's very deep and rich physics there, but it's probably not the main subject we need to go into right now.
But it turns out that when you have this material, you can do the usual things, like make wires out of it, so you can get current to flow in a straight line on a chip, but you can also make other devices that perform different kinds of operations, some of them kind of like the logic operations you'd get in a transistor. The most common component, or I would say the most diverse in its utility, is the Josephson junction. It's not analogous to a transistor in the sense that applying a voltage here changes how much current flows from left to right, but it is analogous in the sense that it's the go-to component a circuit engineer is going to use to start building up more complexity.
So these junctions serve as gates?

They can serve as gates. I'm not sure how concerned to be with semantics, but let me just briefly say what a Josephson junction is, and then we can talk about different ways they can be used. Basically, if you have a superconducting wire, then a small gap of a different material that's not superconducting, an insulator or a normal metal, and then another superconducting wire on the other side, that's a Josephson junction. It's sometimes referred to as a superconducting weak link. You have this superconducting state on one side and on the other side, and the superconducting wave function actually tunnels across that gap. When you create such a physical entity, it has very unusual current-voltage characteristics.

In that gap, like, weird stuff happens?

Through the entire circuit.
So you can imagine, suppose you had a loop set up with one of those weak links in it. Current would flow in that loop even if you hadn't applied a voltage to it, and that's called the Josephson effect. The fact that there's this phase difference in the quantum wave function from one side of the tunneling barrier to the other induces current to flow.
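What Jeff is describing is the DC Josephson effect: a supercurrent set purely by the phase difference of the wave function across the weak link, I = Ic sin(Δφ), with no applied voltage. A minimal numerical sketch (the critical current value is an assumed, illustrative number):

```python
import math

I_C_AMPS = 100e-6  # junction critical current, assumed 100 microamps

def josephson_current(delta_phi):
    """DC Josephson effect: supercurrent driven by the phase
    difference across the weak link, with zero applied voltage."""
    return I_C_AMPS * math.sin(delta_phi)

print(josephson_current(0.0))          # no phase difference, no current
print(josephson_current(math.pi / 2))  # maximum: the critical current
```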
So how do you change state?
Now picture: I have a current bias coming down this line of my circuit, and there's a Josephson junction right in the middle of it. Now I make another wire that goes around the Josephson junction, so I have a loop here, a superconducting loop. I can add current to that loop by exceeding the critical current of that Josephson junction. Like any superconducting material, it can carry this supercurrent that I've described, this current that can propagate without dissipation, up to a certain level. If you try to pass more current than that through the material, it's going to become resistive, a normal material. In the Josephson junction, the same thing happens: I can bias it above its critical current, and then it's going to add a quantized amount of current to the loop. What I mean by quantized is that it comes in discrete packets with a well-defined value of current. So in the vernacular of some people working in this community, you would say you pop a fluxon.
You pop a fluxon into the loop.

Yeah, a fluxon.

Sounds like skateboarder talk, I love it. Okay, sorry, go ahead.

A fluxon is one of these quantized amounts of current that you can add to a loop. This is a cartoon picture, but I think it's sufficient for our purposes.
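The quantization comes from the magnetic flux quantum: flux through a superconducting loop comes in units of Φ0 = h/2e, so each fluxon adds a well-defined current step of Φ0 divided by the loop inductance. A sketch with an assumed loop inductance:

```python
PLANCK_H = 6.62607015e-34      # Planck constant, J*s
ELEM_CHARGE = 1.602176634e-19  # elementary charge, C

flux_quantum_wb = PLANCK_H / (2 * ELEM_CHARGE)  # ~2.07e-15 Wb

loop_inductance_h = 100e-12    # 100 pH, an assumed loop inductance
current_step_a = flux_quantum_wb / loop_inductance_h
print(f"{current_step_a * 1e6:.1f} microamps per fluxon")
```

For that assumed inductance the step is on the order of twenty microamps, a comfortably measurable, well-defined packet of current.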
Maybe it's useful to say: what is the speed at which these discrete packets of current travel? Because we'll be talking about light a little bit, and it seems like the speed is important.

The speed is important; that's an excellent question. Sometimes I wonder how you became so astute.

Matrix 4 is coming out, so maybe that's related; I'm dressed for the job. I was trying to become an extra on Matrix 4. Anyway, so what's the speed of these packets?

You'll have to find another gig.

I know, I'm sorry.
So the speed of the packets, actually these fluxons,
link |
these sort of pulses of current
link |
that are generated by Josephson junctions,
link |
they can actually propagate very close
link |
to the speed of light,
link |
maybe something like a third of the speed of light.
link |
That's quite fast.
link |
So one of the reasons why Josephson junctions are appealing
link |
is because their signals can propagate quite fast
link |
and they can also switch very fast.
link |
What I mean by switch is perform that operation
link |
that I described where you add current to the loop.
link |
That can happen within a few tens of picoseconds.
link |
So you can get devices that operate
link |
in the hundreds of gigahertz range.
link |
And by comparison, most processors
link |
in our conventional computers operate closer
link |
to the one gigahertz range, maybe three gigahertz
link |
seems to be kind of where those speeds have leveled out.
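The rough numbers above can be sanity-checked with a little arithmetic (the chip size and switching time are illustrative ballpark values from the conversation, not measurements):

```python
# Fluxon pulses propagate at roughly a third of the speed of light, and a
# junction switches in roughly tens of picoseconds, which puts device
# rates in the hundreds-of-GHz range versus ~1-3 GHz for CPUs.
C_LIGHT = 3.0e8  # m/s

propagation_speed = C_LIGHT / 3     # ~1e8 m/s
chip_distance = 0.01                # 1 cm of on-chip travel (illustrative)
transit_time = chip_distance / propagation_speed
print(transit_time)                 # ~100 ps to cross a 1 cm chip

switching_time = 10e-12             # ~tens of picoseconds (illustrative)
max_rate_ghz = 1 / switching_time / 1e9
print(max_rate_ghz)                 # ~100 GHz device rate
```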
link |
The gamers listening to this are getting really excited
link |
to overclock their system to like, what is it?
link |
Like four gigahertz or something,
link |
a hundred sounds incredible.
link |
Can I just as a tiny tangent,
link |
is the physics of this understood well
link |
how to do this stably?
link |
Oh yes, the physics is understood well.
link |
The physics of Josephson junctions is understood well.
link |
The technology is understood quite well too.
link |
The reasons why it hasn't displaced
link |
silicon microelectronics in conventional digital computing
link |
I think are more related to what I was alluding to before
link |
about the myriad practical, almost mundane aspects
link |
of silicon that make it so useful.
link |
You can make a transistor ever smaller and smaller
link |
and it will still perform its digital function quite well.
link |
The same is not true of a Josephson junction.
link |
They really don't,
link |
it's not the same, there isn't this feature
link |
that you can keep making smaller and smaller
link |
and it'll keep performing the same operations.
link |
This loop I described, any Josephson circuit,
link |
well, I wanna be careful, I shouldn't say
link |
any Josephson circuit, but many Josephson circuits,
link |
the way they process information
link |
or the way they perform whatever function it is
link |
they're trying to do,
link |
maybe it's sensing a weak magnetic field,
link |
it depends on an interplay between the junction and that loop.
link |
And you can't make that loop much smaller.
link |
And it's not for practical reasons
link |
that have to do with lithography.
link |
It's for fundamental physical reasons
link |
about the way the magnetic field interacts
link |
with that superconducting material.
link |
There are physical limits that no matter how good
link |
our technology got, those circuits would,
link |
I think would never be able to be scaled down
link |
to the densities that silicon microelectronics can.
link |
I don't know if we mentioned,
link |
is there something interesting
link |
about the various superconducting materials involved
link |
There's a lot of stuff that's interesting.
link |
And it's not silicon.
link |
It's not silicon, no.
link |
So like it's some materials that are also required
link |
to be super cold, four Kelvin and so on.
link |
So let's dissect a couple of those different things.
link |
The super cold part,
link |
let me just mention for your gamers out there
link |
that are trying to clock it at four gigahertz
link |
and would love to go to 400.
link |
What kind of cooling system can achieve four Kelvin?
link |
Four Kelvin, you need liquid helium.
link |
And so liquid helium is expensive.
link |
It's inconvenient.
link |
You need a cryostat that sits there
link |
and the energy consumption of that cryostat
link |
is impractical, it's not going in your cell phone.
link |
So you can picture holding your cell phone like this
link |
and then something the size of a keg of beer or something
link |
on your back to cool it.
link |
Like that makes no sense.
link |
So if you're trying to make this in consumer devices,
link |
electronics that are ubiquitous across society,
link |
superconductors are not in the race for that.
link |
For now, but you're saying,
link |
so just to frame the conversation,
link |
maybe the thing we're focused on
link |
is computing systems that serve as servers, like large.
link |
Yes, large systems.
link |
So then you can contrast what's going on in your cell phone
link |
with what's going on at one of the supercomputers.
link |
My colleague Katie Schuman invited us out to Oak Ridge
link |
a few years ago, so we got to see Titan
link |
and that was when they were building Summit.
link |
So these are some high performance supercomputers
link |
out in Tennessee and those are filling entire rooms
link |
the size of warehouses.
link |
So once you're at that level, okay,
link |
there you're already putting a lot of power into cooling.
link |
Cooling is part of your engineering task
link |
that you have to deal with.
link |
So there it's not entirely obvious
link |
that cooling to four Kelvin is out of the question.
link |
It has not happened yet and I can speak to why that is
link |
in the digital domain if you're interested.
link |
I think it's not going to happen.
link |
I don't think superconductors are gonna replace
link |
semiconductors for digital computation.
link |
There are a lot of reasons for that,
link |
but I think ultimately what it comes down to
link |
is, all things considered, cooling, errors,
link |
scaling down to feature sizes, all that stuff,
link |
semiconductors work better at the system level.
link |
Is there some aspect of just curious
link |
about the historical momentum of this?
link |
Is there some power to the momentum of an industry
link |
that's mass manufacturing using a certain material?
link |
Is this like a Titanic shifting?
link |
Like what's your sense when a good idea comes along,
link |
how good does that idea need to be
link |
for the Titanic to start shifting?
link |
That's an excellent question.
link |
That's an excellent way to frame it.
link |
And you know, I don't know the answer to that,
link |
but what I think is, okay,
link |
so the history of the superconducting logic
link |
goes back to the 70s.
link |
IBM made a big push to do
link |
superconducting digital computing in the 70s.
link |
And they made some choices about their devices
link |
and their architectures and things that in hindsight,
link |
were kind of doomed to fail.
link |
And I don't mean any disrespect for the people that did it,
link |
it was hard to see at the time.
link |
But then another generation of superconducting logic
link |
was introduced, I wanna say the 90s,
link |
by Likharev and Semenov,
link |
they proposed an entire family of circuits
link |
based on Joseph's injunctions
link |
that are doing digital computing based on logic gates,
link |
AND, OR, NOT, these kinds of things.
link |
And they showed how it could go hundreds of times faster
link |
than silicon microelectronics.
link |
And it's extremely exciting.
link |
I wasn't working in the field at that time,
link |
but later when I went back and read the literature,
link |
I was just like, wow, this is so awesome.
link |
And so you might think, well,
link |
the reason why it didn't displace silicon
link |
is because silicon already had so much momentum
link |
But that was the 90s.
link |
Silicon kept that momentum
link |
because it had the simple way to keep getting better.
link |
You just make features smaller and smaller.
link |
So it would have to be,
link |
I don't think it would have to be that much better
link |
than silicon to displace it.
link |
But the problem is it's just not better than silicon.
link |
It might be better than silicon in one metric,
link |
speed of a switching operation
link |
or power consumption of a switching operation.
link |
But building a digital computer is a lot more
link |
than just that elemental operation.
link |
It's everything that goes into it,
link |
including the manufacturing, including the packaging,
link |
including the various materials aspects of things.
link |
So the reason why,
link |
and even in some of those early papers,
link |
I can't remember which one it was,
link |
Likharev said something along the lines of,
link |
you can see how we could build an entire family
link |
of digital electronic circuits based on these components.
link |
They could go a hundred or more times faster
link |
than semiconductor logic gates.
link |
But I don't think that's the right way
link |
to use superconducting electronic circuits.
link |
He didn't say what the right way was,
link |
but he basically said digital logic,
link |
trying to steal the show from silicon
link |
is probably not what these circuits
link |
are most suited to accomplish.
link |
So if we can just linger and use the word computation.
link |
When you talk about computation, how do you think about it?
link |
Do you think purely on just the switching,
link |
or do you think something a little bit larger scale,
link |
a circuit taken together,
link |
performing the basic arithmetic operations
link |
that are then required to do the kind of computation
link |
that makes up a computer?
link |
Because when we talk about the speed of computation,
link |
is it boiled down to the basic switching,
link |
or is there some bigger picture
link |
that you're thinking about?
link |
Well, all right, so maybe we should disambiguate.
link |
There are a variety of different kinds of computation.
link |
I don't pretend to be an expert
link |
in the theory of computation or anything like that.
link |
I guess it's important to differentiate though
link |
between digital logic,
link |
which represents information as a series of bits,
link |
binary digits, which you can think of them
link |
as zeros and ones or whatever.
link |
Usually they correspond to a physical system
link |
that has two very well separated states.
link |
And then other kinds of computation,
link |
like we'll get into more the way your brain works,
link |
which it is, I think,
link |
indisputably processing information,
link |
but where the computation begins and ends
link |
is not anywhere near as well defined.
link |
It doesn't depend on these two levels.
link |
Here's a zero, here's a one.
link |
There's a lot of gray area
link |
that's usually referred to as analog computing.
link |
Also in conventional digital computers
link |
or digital computers in general,
link |
you have a concept of what's called arithmetic depth,
link |
which is jargon that basically means
link |
how many sequential operations are performed
link |
to turn an input into an output.
link |
And those kinds of computations in digital systems
link |
are highly serial, meaning that data streams,
link |
they don't branch off too far to the side.
link |
You do, you have to pull some information over there
link |
and access memory from here and stuff like that.
link |
But by and large, the computation proceeds
link |
in a serial manner.
link |
It's not that way in the brain.
link |
In the brain, you're always drawing information
link |
from different places.
link |
It's much more network based computing.
link |
Neurons don't wait for their turn.
link |
They fire when they're ready to fire.
link |
And so it's asynchronous.
link |
So one of the other things about a digital system
link |
is you're performing these operations on a clock.
link |
And that's a crucial aspect of it.
link |
Get rid of a clock in a digital system,
link |
nothing makes sense anymore.
link |
The brain has no clock.
link |
It builds its own timescales based on its internal activity.
link |
So you can think of the brain as kind of like this,
link |
like network computation,
link |
where it's actually really trivial, simple computers,
link |
just a huge number of them and they're networked.
link |
I would say it is complex, sophisticated little processors
link |
and there's a huge number of them.
link |
Neurons are not simple.
link |
I don't mean to offend neurons.
link |
They're very complicated and beautiful and yeah,
link |
but we often oversimplify them.
link |
Yes, there's actually, like, computation happening inside them.
link |
Right, so I would say to think of a transistor
link |
as the building block of a digital computer is accurate.
link |
You use a few transistors to make your logic gates.
link |
You build up more, you build up processors
link |
from logic gates and things like that.
link |
So you can think of a transistor
link |
as a fundamental building block,
link |
or you can think of,
link |
as we get into more highly parallelized architectures,
link |
you can think of a processor
link |
as a fundamental building block.
link |
To make the analogy to the neuro side of things,
link |
a neuron is not a transistor.
link |
A neuron is a processor.
link |
It has synapses, even synapses are not transistors,
link |
but they are more,
link |
they're lower on the information processing hierarchy
link |
They do a bulk of the computation,
link |
but neurons are entire processors in and of themselves
link |
that can take in many different kinds of inputs
link |
on many different spatial and temporal scales
link |
and produce many different kinds of outputs
link |
so that they can perform different computations
link |
in different contexts.
link |
So this is where enters this distinction
link |
between computation and communication.
link |
So you can think of neurons performing computation
link |
and the inter, the networking,
link |
the interconnectivity of neurons
link |
is communication between neurons.
link |
And you see this with very large server systems.
link |
I've been, I mentioned offline,
link |
we've been talking to Jim Keller,
link |
whose dream is to build giant computers
link |
that, you know, the bottleneck there
link |
is often the communication
link |
between the different pieces of computing.
link |
So in this paper that we mentioned,
link |
Optoelectronic Intelligence,
link |
you say electrons excel at computation
link |
while light is excellent for communication.
link |
Maybe you can linger and say in this context,
link |
what do you mean by computation and communication?
link |
What are electrons, what is light
link |
and why do they excel at those two tasks?
link |
Yeah, just to first speak to computation
link |
versus communication,
link |
I would say computation is essentially taking in
link |
some information, performing operations
link |
on that information and producing new,
link |
hopefully more useful information.
link |
So for example, imagine you have a picture in front of you
link |
and there is a key in it
link |
and that's what you're looking for,
link |
for whatever reason, you wanna find the key,
link |
we all wanna find the key.
link |
So the input is that entire picture
link |
and the output might be the coordinates where the key is.
link |
So you've reduced the total amount of information you have
link |
but you found the useful information
link |
for you in that present moment,
link |
that's the useful information.
link |
And you think about this computation
link |
as the controlled synchronous sequential?
link |
Not necessarily, it could be,
link |
that could be how your system is performing the computation
link |
or it could be asynchronous,
link |
there are lots of ways to find the key.
link |
It depends on the nature of the data,
link |
it depends on, that's a very simplified example,
link |
a picture with a key in it,
link |
what about if you're in the world
link |
and you're trying to decide the best way
link |
to live your life?
link |
It might be interactive,
link |
it might be there might be some recurrence
link |
or some weird asynchrony, I got it.
link |
But there's an input and there's an output
link |
and you do some stuff in the middle
link |
that actually goes from the input to the output.
link |
You've taken in information
link |
and output different information,
link |
hopefully reducing the total amount of information
link |
and extracting what's useful.
link |
Communication is then getting that information
link |
from the location at which it's stored
link |
because information is physical as Landauer emphasized
link |
and so it is in one place
link |
and you need to get that information to another place
link |
so that something else can use it
link |
for whatever computation it's working on.
link |
Maybe it's part of the same network
link |
and you're all trying to solve the same problem
link |
but neuron A over here just deduced something
link |
based on its inputs
link |
and it's now sending that information across the network
link |
to another location
link |
so that would be the act of communication.
link |
Can you linger on Landauer
link |
and saying information is physical?
link |
Rolf Landauer, not to be confused with Lev Landau.
link |
Yeah, and he made huge contributions
link |
to our understanding of the reversibility of information
link |
and this concept that energy has to be dissipated
link |
in computing when the computation is irreversible
link |
but if you can manage to make it reversible
link |
then you don't need to expend energy
link |
but if you do expend energy to perform a computation
link |
there's sort of a minimal amount that you have to do
link |
and it's kT ln 2.
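That bound, the Landauer limit, is easy to evaluate numerically (a sketch using the standard value of Boltzmann's constant):

```python
import math

# Landauer's bound: irreversibly erasing one bit dissipates at least
# k_B * T * ln(2) of energy at temperature T.
K_B = 1.380649e-23  # Boltzmann constant, J/K

def landauer_limit_j(temperature_k):
    """Minimum energy to erase one bit at the given temperature."""
    return K_B * temperature_k * math.log(2)

print(landauer_limit_j(300))  # ~2.9e-21 J per bit at room temperature
print(landauer_limit_j(4))    # 75x smaller at the 4 K superconductors need
```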
link |
And it's all somehow related
link |
to the second law of thermodynamics
link |
and that the universe is an information process
link |
and then we're living in a simulation.
link |
So okay, sorry, sorry for that tangent.
link |
So that's the defining the distinction
link |
between computation and communication.
link |
Let me say one more thing just to clarify.
link |
Communication ideally does not change the information.
link |
It moves it from one place to another
link |
but it is preserved.
link |
All right, that's beautiful.
link |
So then the electron versus light distinction
link |
and why are electrons good at computation
link |
and light good at communication?
link |
Yes, there's a lot that goes into it I guess
link |
but just try to speak to the simplest part of it.
link |
Electrons interact strongly with one another.
link |
They're charged particles.
link |
So if I pile a bunch of them over here
link |
they're feeling a certain amount of force
link |
and they wanna move somewhere else.
link |
They're strongly interactive.
link |
You can also get them to sit still.
link |
You can, an electron has a mass
link |
so you can cause it to be spatially localized.
link |
So for computation that's useful
link |
because now I can make these little devices
link |
that put a bunch of electrons over here
link |
and then I change the state of a gate
link |
like I've been describing,
link |
put a different voltage on this gate
link |
and now I move the electrons over here.
link |
Now they're sitting somewhere else.
link |
I have a physical mechanism
link |
with which I can represent information.
link |
It's spatially localized and I have knobs
link |
that I can adjust to change where those electrons are
link |
or what they're doing.
link |
Light by contrast, photons of light
link |
which are the discrete packets of energy
link |
that were identified by Einstein,
link |
they do not interact with each other
link |
especially at low light levels.
link |
If you're in a medium and you have a bright high light level
link |
you can get them to interact with each other
link |
through the interaction with that medium that they're in
link |
but that's a little bit more exotic.
link |
And for the purposes of this conversation
link |
we can assume that photons don't interact with each other.
link |
So if you have a bunch of them
link |
all propagating in the same direction
link |
they don't interfere with each other.
link |
If I wanna send, if I have a communication channel
link |
and I put one more photon on it,
link |
it doesn't screw up those other ones.
link |
It doesn't change what those other ones were doing at all.
link |
So that's really useful for communication
link |
because that means you can sort of allow
link |
a lot of these photons to flow
link |
without disruption of each other
link |
and they can branch really easily and things like that.
link |
But it's not good for computation
link |
because it's very hard for this packet of light
link |
to change what this packet of light is doing.
link |
They pass right through each other.
link |
So in computation you want to change information
link |
and if photons don't interact with each other
link |
it's difficult to get them to change the information
link |
represented by the others.
link |
So that's the fundamental difference.
link |
Is there also something about the way they travel
link |
through different materials
link |
or is that just a particular engineering?
link |
No, it's not, that's deep physics I think.
link |
So this gets back to electrons interact with each other
link |
and photons don't.
link |
So say I'm trying to get a packet of information
link |
from me to you and we have a wire going between us.
link |
In order for me to send electrons across that wire
link |
I first have to raise the voltage on my end of the wire
link |
and that means putting a bunch of charges on it
link |
and then that charge packet has to propagate along the wire
link |
and it has to get all the way over to you.
link |
That wire is gonna have something that's called capacitance
link |
which basically tells you how much charge
link |
you need to put on the wire
link |
in order to raise the voltage on it
link |
and the capacitance is gonna be proportional
link |
to the length of the wire.
link |
So the longer the length of the wire is
link |
the more charge I have to put on it
link |
and the energy required to charge up that line
link |
and move those electrons to you
link |
is also proportional to the capacitance
link |
and goes as the voltage squared.
link |
So you get this huge penalty if you wanna send electrons
link |
across a wire over appreciable distances.
link |
So distance is an important thing here
link |
when you're doing communication.
link |
Distance is an important thing.
link |
So is the number of connections I'm trying to make.
link |
Me to you, okay one, that's not so bad.
link |
If I want to now send it to 10,000 other friends
link |
then all of those wires are adding tons
link |
of extra capacitance.
link |
Now not only does it take forever
link |
to put the charge on that wire
link |
and raise the voltage on all those lines
link |
but it takes a ton of power
link |
and the number 10,000 is not randomly chosen.
link |
That's roughly how many connections
link |
each neuron in your brain makes.
link |
So a neuron in your brain needs to send 10,000 messages
link |
every time it has something to say.
link |
You can't do that if you're trying to drive electrons
link |
from here to 10,000 different places.
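The scaling just described can be sketched numerically; the per-length capacitance and voltage below are illustrative ballpark figures for an on-chip wire, not values from the conversation:

```python
# Energy to signal over a wire scales with its capacitance (proportional
# to length) times voltage squared, and fan-out multiplies the cost.
CAP_PER_METER = 2e-10  # F/m, ballpark for an on-chip wire (illustrative)
VOLTAGE = 1.0          # V, illustrative logic swing

def wire_signal_energy_j(length_m, fan_out=1):
    """Energy to charge the wire(s) for one signaling event, ~C*V^2."""
    capacitance = CAP_PER_METER * length_m * fan_out
    return capacitance * VOLTAGE ** 2

print(wire_signal_energy_j(1e-3))                 # one 1 mm wire: ~0.2 pJ
print(wire_signal_energy_j(1e-3, fan_out=10000))  # 10,000 targets: ~2 nJ
```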
link |
The brain does it in a slightly different way
link |
which we can discuss.
link |
How can light achieve the 10,000 connections
link |
and why is it better?
link |
In terms of like the energy use required
link |
to use light for the communication of the 10,000 connections.
link |
So now instead of trying to send electrons
link |
from me to you, I'm trying to send photons.
link |
So I can make what's called a wave guide
link |
which is just a simple piece of a material.
link |
It could be glass like an optical fiber
link |
or silicon on a chip.
link |
And I just have to inject photons into that wave guide
link |
and independent of how long it is,
link |
independent of how many different connections I'm making,
link |
it doesn't change the voltage or anything like that
link |
that I have to raise up on the wire.
link |
So if I have one more connection,
link |
if I add additional connections,
link |
I need to add more light to the wave guide
link |
because those photons need to split
link |
and go to different paths.
link |
That makes sense but I don't have a capacitive penalty.
link |
Sometimes these are called wiring parasitics.
link |
There are no parasitics associated with light
link |
in that same sense.
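For contrast, a sketch of the photonic case: each connection needs photons, not charge, so the cost scales with the fan-out but not with the distance (the wavelength and photons-per-connection here are illustrative assumptions):

```python
# With light, adding a connection just means adding photons to the
# waveguide; there is no length-dependent capacitive charging cost.
PLANCK_H = 6.62607015e-34  # J*s
C_LIGHT = 2.99792458e8     # m/s

def photonic_fanout_energy_j(n_connections, wavelength_m=1.55e-6,
                             photons_per_connection=1):
    """Optical energy to reach n connections, independent of waveguide length."""
    photon_energy = PLANCK_H * C_LIGHT / wavelength_m  # E = h*c/lambda
    return n_connections * photons_per_connection * photon_energy

# One photon to each of 10,000 synapses costs only ~1.3 femtojoules:
print(photonic_fanout_energy_j(10000))
```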
link |
So this might be a dumb question
link |
but how do I catch a photon on the other end?
link |
Is it the polymer stuff you were talking about
link |
for a different application for photolithography?
link |
Like how do you catch a photon?
link |
There's a lot of ways to catch a photon.
link |
It's not a dumb question.
link |
It's a deep and important question
link |
that basically defines a lot of the work
link |
that goes on in our group at NIST.
link |
One of my group leaders, Sae Woo Nam,
link |
has built his career around
link |
these superconducting single photon detectors.
link |
So if you're going to try to sort of reach a lower limit
link |
and detect just one particle of light,
link |
superconductors come back into our conversation
link |
and just picture a simple device
link |
where you have current flowing
link |
through a superconducting wire and...
link |
A loop again or no?
link |
Let's say yes, you have a loop.
link |
So you have a superconducting wire
link |
that goes straight down like this
link |
and on your loop branch, you have a little ammeter,
link |
something that measures current.
link |
There's a resistor up there too.
link |
So you're current biasing this,
link |
so there's current flowing
link |
through that superconducting branch.
link |
Since there's a resistor over here,
link |
all the current goes through the superconducting branch.
link |
Now a photon comes in, strikes that superconductor.
link |
We talked about this superconducting
link |
macroscopic quantum state.
link |
That's going to be destroyed by the energy of that photon.
link |
So now that branch of the circuit is resistive too.
link |
And you've properly designed your circuit
link |
so that the resistance on that superconducting branch
link |
is much greater than the other resistance.
link |
Now all of your current's going to go that way.
link |
Your ammeter says, oh, I just got a pulse of current.
link |
That must mean I detected a photon.
link |
Then where you broke that superconductivity
link |
in a matter of a few nanoseconds,
link |
it cools back off, dissipates that energy
link |
and the current flows back
link |
through that superconducting branch.
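A toy model of the current-diversion readout just described (the resistance and bias values are illustrative, and the nanosecond reset dynamics are omitted):

```python
# Two-branch circuit: a superconducting branch in parallel with a readout
# resistor. A photon absorption drives the superconducting branch normal
# (resistive), diverting the bias current into the readout branch.
def readout_current(bias_current_a, photon_absorbed,
                    hotspot_resistance_ohm=1000.0,
                    readout_resistance_ohm=50.0):
    """Current diverted into the readout (ammeter) branch."""
    if not photon_absorbed:
        return 0.0  # superconducting branch shorts out the readout
    # Current divider once the superconducting branch goes resistive:
    divider = hotspot_resistance_ohm / (hotspot_resistance_ohm
                                        + readout_resistance_ohm)
    return bias_current_a * divider

print(readout_current(10e-6, photon_absorbed=False))  # no photon -> 0.0
print(readout_current(10e-6, photon_absorbed=True))   # click: most of 10 uA
```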
link |
This is a very powerful superconducting device
link |
that allows us to understand quantum states of light.
link |
I didn't realize a loop like that
link |
could be sensitive to a single photon.
link |
I mean, that seems strange to me because,
link |
I mean, so what happens when you just barrage it with photons?
link |
If you put a bunch of photons in there,
link |
essentially the same thing happens.
link |
You just drive it into the normal state,
link |
it becomes resistive and it's not particularly interesting.
link |
So you have to be careful how many photons you send.
link |
Like you have to be very precise with your communication.
link |
So I would say that that's actually in the application
link |
that we're trying to use these detectors for.
link |
That's a feature because what we want is for,
link |
if a neuron sends one photon to a synaptic connection
link |
and one of these superconducting detectors is sitting there,
link |
you get this pulse of current.
link |
And that synapse says event,
link |
then I'm gonna do what I do when there's a synapse event,
link |
I'm gonna perform computations, that kind of thing.
link |
But if accidentally you send two there or three or five,
link |
it does the exact same.
link |
And so this is how in the system that we're devising here,
link |
communication is entirely binary.
link |
And that's what I tried to emphasize a second ago.
link |
Communication should not change the information.
link |
You're not saying, oh, I got this kind of communication
link |
event for photons.
link |
No, we're not keeping track of that.
link |
This neuron fired, this synapse says that neuron fired,
link |
So that's a noise filtering property of those detectors.
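That binary, thresholding behavior can be stated in one line (a hypothetical sketch of the synapse logic, not the actual circuit):

```python
# The synapse treats a detector click as a binary event: one photon, two,
# or five all produce exactly the same downstream response.
def synapse_event(n_photons):
    """True if the detector clicked at all; the count is deliberately ignored."""
    return n_photons >= 1

print(synapse_event(0), synapse_event(1), synapse_event(5))
```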
link |
However, there are other applications
link |
where you'd rather know the exact number of photons
link |
that can be very useful in quantum computing with light.
link |
And our group does a lot of work
link |
around another kind of superconducting sensor
link |
called a transition edge sensor that Adriana Lita
link |
in our group does a lot of work on.
link |
And that can tell you based on the amplitude
link |
of the current pulse you divert exactly how many photons
link |
were in that pulse.
link |
What's that useful for?
link |
One way that you can encode information
link |
in quantum states of light is in the number of photons.
link |
You can have what are called number states
link |
and a number state will have a well defined number
link |
of photons and maybe the output of your quantum computation
link |
encodes its information in the number of photons
link |
that are generated.
link |
So if you have a detector that is sensitive to that,
link |
it's extremely useful.
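A sketch of that readout idea, inferring the photon number from the pulse amplitude (the single-photon amplitude here is a hypothetical calibration constant, not a real device parameter):

```python
# A transition edge sensor's output pulse amplitude scales with the number
# of absorbed photons, so the photon number can be read off the amplitude.
SINGLE_PHOTON_AMPLITUDE = 0.4e-6  # A per photon (hypothetical calibration)

def photon_number(pulse_amplitude_a):
    """Infer the photon number from a measured pulse amplitude."""
    return round(pulse_amplitude_a / SINGLE_PHOTON_AMPLITUDE)

print(photon_number(0.41e-6))  # -> 1
print(photon_number(1.22e-6))  # -> 3
```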
link |
Can you achieve like a clock with photons
link |
or is that not important?
link |
Is there a synchronicity here?
link |
In general, it can be important.
link |
Clock distribution is a big challenge
link |
in especially large computational systems.
link |
And so yes, optical clocks, optical clock distribution
link |
is a very powerful technology.
link |
I don't know the state of that field right now,
link |
but I imagine that if you're trying to distribute a clock
link |
across any appreciable size computational system,
link |
you wanna use light.
link |
Yeah, I wonder how these giant systems work,
link |
especially like supercomputers.
link |
Do they need to do clock distribution
link |
or are they doing more ad hoc parallel
link |
like concurrent programming?
link |
Like there's some kind of locking mechanisms or something.
link |
That's a fascinating question,
link |
but let's zoom in on this very particular question
link |
of computation on a processor
link |
and communication between processors.
link |
So what does this system look like
link |
that you're envisioning?
link |
One of the places you're envisioning it
link |
is in the paper on optoelectronic intelligence.
link |
So what are we talking about?
link |
Are we talking about something
link |
that starts to look a lot like the human brain
link |
or does it still look a lot like a computer?
link |
What are the size of this thing?
link |
Is it going inside a smartphone or as you said,
link |
does it go inside something that's more like a house?
link |
Like what should we be imagining?
link |
What are you thinking about
link |
when you're thinking about these fundamental systems?
link |
Let me introduce the word neuromorphic.
link |
There's this concept of neuromorphic computing
link |
where what that broadly refers to
link |
is computing based on the information processing principles of the brain.
link |
And as digital computing seems to be pushing
link |
towards some fundamental performance limits,
link |
people are considering architectural advances,
link |
drawing inspiration from the brain,
link |
more distributed parallel network kind of architectures
link |
And so there's this continuum of neuromorphic
link |
from things that are pretty similar to digital computers,
link |
but maybe there are more cores
link |
and the way they send messages is a little bit more
link |
like the way brain neurons send spikes.
link |
But for the most part, it's still digital electronics.
link |
And then you have some things in between
link |
where maybe you're using transistors,
link |
but now you're starting to use them
link |
in an analog way instead of a digital way.
link |
And so you're trying to get those circuits
link |
to behave more like neurons.
link |
And then that's a little bit,
link |
quite a bit more on the neuromorphic side of things.
link |
You're trying to get your circuits,
link |
although they're still based on silicon,
link |
you're trying to get them to perform operations
link |
that are highly analogous to the operations in the brain.
link |
And that's where a great deal of work is
link |
in neuromorphic computing,
link |
people like Giacomo Indiveri and Gert Cauwenberghs,
link |
Jennifer Hasler, countless others.
link |
It's a rich and exciting field going back to Carver Mead
link |
in the late 1980s.
link |
And then all the way on the other extreme of the continuum
link |
is where you say, I'll give up anything related
link |
to transistors or semiconductors or anything like that.
link |
I'm not starting with the assumption
link |
that I'm gonna use any kind
link |
of conventional computing hardware.
link |
And instead, what I wanna do is try and understand
link |
what makes the brain powerful
link |
at the kind of information processing it does.
link |
And I wanna think from first principles
link |
about what hardware is best going to enable us
link |
to capture those information processing principles
link |
in an artificial system.
link |
And that's where I live.
link |
That's where I'm doing my exploration these days.
link |
So what are the first principles
link |
of brain like computation communication?
link |
Right, yeah, this is so important
link |
and I'm glad we booked 14 hours for this because.
link |
I only have 13, I'm sorry.
link |
Okay, so the brain is notoriously complicated.
link |
And I think that's an important part
link |
of why it can do what it does.
link |
But okay, let me try to break it down.
link |
Starting with the devices, neurons, as I said before,
link |
they're sophisticated devices in and of themselves
link |
and synapses are too.
link |
They can change their state based on the activity.
link |
So they adapt over time.
link |
That's crucial to the way the brain works.
link |
They don't just adapt on one timescale,
link |
they can adapt on myriad timescales
link |
from the spacing between pulses,
link |
the spacing between spikes that come from neurons
link |
all the way to the age of the organism.
link |
Also relevant, perhaps I think the most important thing
link |
that's guided my thinking is the network structure
link |
Which can also be adjusted on different scales.
link |
Absolutely, yes, so you're making new,
link |
you're changing the strength of contacts,
link |
you're changing the spatial distribution of them,
link |
although spatial distribution doesn't change that much
link |
once you're a mature organism.
link |
But that network structure is really crucial.
link |
So let me dwell on that for a second.
link |
You can't talk about the brain without emphasizing
link |
that most of the neurons in the neocortex
link |
or the prefrontal cortex, the part of the brain
link |
that we think is most responsible for high level reasoning
link |
and things like that,
link |
those neurons make thousands of connections.
link |
So you have this network that is highly interconnected.
link |
And I think it's safe to say that one of the primary reasons
link |
that they make so many different connections
link |
is that allows information to be communicated very rapidly
link |
from any spot in the network
link |
to any other spot in the network.
link |
So that's a sort of spatial aspect of it.
link |
You can quantify this in terms of concepts
link |
that are related to fractals and scale invariance,
link |
which I think is a very beautiful concept.
link |
So what I mean by that is kind of,
link |
no matter what spatial scale you're looking at in the brain
link |
within certain bounds, you see the same
link |
general statistical pattern.
link |
So if I draw a box around some region of my cortex,
link |
most of the connections that those neurons
link |
within that box make are gonna be within the box
link |
to each other in their local neighborhood.
link |
And that's sort of called clustering, loosely speaking.
link |
But a non negligible fraction
link |
is gonna go outside of that box.
link |
And then if I draw a bigger box,
link |
the pattern is gonna be exactly the same.
link |
So you have this scale invariance,
link |
and you also have a non vanishing probability
link |
of a neuron making connection very far away.
link |
So suppose you wanna plot the probability
link |
of a neuron making a connection as a function of distance.
link |
If that were an exponential function,
link |
it would go as e to the minus r
link |
over some characteristic radius R zero,
link |
and up to that characteristic radius,
link |
the probability would be reasonably close to one,
link |
and then beyond that characteristic length R zero,
link |
it would drop off sharply.
link |
And so that would mean that the neurons in your brain
link |
are really localized, and that's not what we observe.
link |
Instead, what you see is that the probability
link |
of making a longer distance connection, it does drop off,
link |
but it drops off as a power law.
link |
So the probability that you're gonna have a connection
link |
at some radius R goes as R to the minus some power.
link |
And that's more, that's what we see with forces in nature,
link |
like the electromagnetic force
link |
between two particles or gravity
link |
goes as one over the radius squared.
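The contrast between exponential and power-law drop-off described here can be sketched numerically. This is an illustrative toy, not a model from the conversation; the characteristic radius and exponent are assumed values chosen only to show the heavy tail of the power law.

```python
import math

R0 = 1.0      # assumed characteristic radius for the exponential law
GAMMA = 2.0   # assumed power-law exponent (cf. inverse-square forces)

def p_exponential(r, r0=R0):
    # Drops off sharply beyond the characteristic length r0:
    # connections would be almost entirely local.
    return math.exp(-r / r0)

def p_power_law(r, gamma=GAMMA):
    # Heavy tail: a non-vanishing probability of very
    # long-range connections, with no defined length scale.
    return r ** (-gamma)

for r in [1, 2, 4, 8, 16]:
    print(r, p_exponential(r), p_power_law(r))
```

At large distances the power law dominates the exponential by many orders of magnitude, which is the "non-vanishing probability of a connection very far away" described above.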
link |
So you can see this in fractals.
link |
I love that there's like a fractal dynamics of the brain
link |
that if you zoom out, you draw the box
link |
and you increase that box by certain step sizes,
link |
you're gonna see the same statistics.
link |
I think that's probably very important
link |
to the way the brain processes information.
link |
It's not just in the spatial domain,
link |
it's also in the temporal domain.
link |
And what I mean by that is...
link |
That's incredible that this emerged
link |
through the evolutionary process
link |
that potentially somehow connected
link |
to the way the physics of the universe works.
link |
Yeah, I couldn't agree more that it's a deep
link |
and fascinating subject that I hope to be able
link |
to spend the rest of my life studying.
link |
You think you need to solve, understand this,
link |
this fractal nature in order to understand intelligence
link |
and communication? I do think so.
link |
I think they're deeply intertwined.
link |
Yes, I think power laws are right at the heart of it.
link |
So just to push that one through,
link |
the same thing happens in the temporal domain.
link |
So suppose your neurons in your brain
link |
were always oscillating at the same frequency,
link |
then the probability of finding a neuron oscillating
link |
as a function of frequency
link |
would be this narrowly peaked function
link |
around that certain characteristic frequency.
link |
That's not at all what we see.
link |
The probability of finding neurons oscillating
link |
or producing spikes at a certain frequency
link |
is again a power law,
link |
which means there's no defined scale
link |
of the temporal activity in the brain.
link |
At what speed do your thoughts occur?
link |
Well, there's a fastest speed they can occur
link |
and that is limited by communication and other things,
link |
but there's not a characteristic scale.
link |
We have thoughts on all temporal scales
link |
from a few tens of milliseconds,
link |
which is physiologically limited by our devices,
link |
compare that to tens of picoseconds
link |
that I talked about in superconductors,
link |
all the way up to the lifetime of the organism.
link |
You can still think about things
link |
that happened to you when you were a kid.
link |
Or if you wanna be really trippy
link |
then across multiple organisms
link |
in the entirety of human civilization,
link |
you have thoughts that span organisms, right?
link |
Yes, taking it to that level, yes.
link |
If you're willing to see the entirety of the human species
link |
as a single organism with a collective intelligence
link |
and that too on a spatial and temporal scale,
link |
there's thoughts occurring.
link |
And then if you look at not just the human species,
link |
but the entirety of life on earth
link |
as an organism with thoughts that are occurring,
link |
that are greater and greater sophisticated thoughts,
link |
there's a different spatial and temporal scale there.
link |
This is getting very suspicious.
link |
Well, hold on though, before we're done,
link |
I just wanna just tie the bow
link |
and say that the spatial and temporal aspects
link |
are intimately interrelated with each other.
link |
So activity between neurons that are very close to each other
link |
is more likely to happen on this faster timescale
link |
and information is gonna propagate
link |
and encompass more of the brain,
link |
more of your cortices, different modules in the brain
link |
are gonna be engaged in information processing
link |
on longer timescales.
link |
So there's this concept of information integration
link |
where neurons are specialized.
link |
Any given neuron or any cluster of neurons
link |
has its specific purpose,
link |
but they're also very much integrated.
link |
So you have neurons that specialize,
link |
but share their information.
link |
And so that happens through these fractal nested oscillations
link |
that occur across spatial and temporal scales.
link |
I think capturing those dynamics in hardware,
link |
to me, that's the goal of neuromorphic computing.
link |
So does it need to look,
link |
so first of all, that's fascinating.
link |
We stated some clear principles here.
link |
Now, does it have to look like the brain
link |
outside of those principles as well?
link |
Like what other characteristics
link |
have to look like the human brain?
link |
Or can it be something very different?
link |
Well, it depends on what you're trying to use it for.
link |
And so I think a lot of the community
link |
asks that question a lot.
link |
What are you gonna do with it?
link |
And I completely get it.
link |
I think that's a very important question.
link |
And it's also sometimes not the most helpful question.
link |
What if what you wanna do with it is study it?
link |
What if you just wanna see,
link |
what do you have to build into your hardware
link |
in order to observe these dynamical principles?
link |
And also, I ask myself that question every day
link |
and I'm not sure I'm able to answer that.
link |
So like, what are you gonna do
link |
with this particular neuromorphic machine?
link |
So suppose what we're trying to do with it
link |
is build something that thinks.
link |
We're not trying to get it to make us any money
link |
Maybe we'll be able to do that, but that's not our goal.
link |
Our goal is to see if we can get the same types of behaviors
link |
that we observe in our own brain.
link |
And by behaviors in this sense,
link |
what I mean is the behaviors of the components,
link |
the neurons, the network, that kind of stuff.
link |
I think there's another element that I didn't really hit on
link |
that you also have to build into this.
link |
And those are architectural principles.
link |
They have to do with the hierarchical modular construction of the brain.
link |
And without getting too lost in jargon,
link |
the main point that I think is relevant there,
link |
let me try and illustrate it with a cartoon picture
link |
of the architecture of the brain.
link |
So in the brain, you have the cortex,
link |
which is sort of this outer sheet.
link |
It's actually, it's a layered structure.
link |
You can, if you could take it out of your brain,
link |
you could unroll it on the table
link |
and it would be about the size of a pizza sitting there.
link |
And that's a module.
link |
It does certain things.
link |
As György Buzsáki would say,
link |
it processes the what of what's going on around you.
link |
But you have another really crucial module
link |
that's called the hippocampus.
link |
And that network is structured entirely differently.
link |
First of all, this cortex that I described
link |
has about 10 billion neurons in there.
link |
So numbers matter here.
link |
And they're organized in that sort of power law distribution
link |
where the probability of making a connection drops off
link |
as a power law in space.
link |
The hippocampus is another module that's important
link |
for understanding where you are and when you are,
link |
keeping track of your position
link |
in space and time.
link |
And that network is very much random.
link |
So the probability of making a connection,
link |
it almost doesn't even drop off as a function of distance.
link |
It's the same probability that you'll make it here
link |
to over there, but there are only about 100 million neurons
link |
there, so you can have that huge densely connected module
link |
because it's not so big.
link |
And the neocortex or the cortex and the hippocampus,
link |
they talk to each other constantly.
link |
And that communication is largely facilitated
link |
by what's called the thalamus.
link |
I'm not a neuroscientist here.
link |
I'm trying to do my best to recite things.
link |
Cartoon picture of the brain, I gotcha.
link |
Yeah, something like that.
link |
So this thalamus is coordinating the activity
link |
between the neocortex and the hippocampus
link |
and making sure that they talk to each other
link |
at the right time and send messages
link |
that will be useful to one another.
link |
So this all taken together is called
link |
the thalamocortical complex.
link |
And it seems like building something like that
link |
is going to be crucial to capturing the types of activity
link |
we're looking for because those responsibilities,
link |
those separate modules, they do different things,
link |
that's gotta be central to achieving these states
link |
of efficient information integration across space and time.
link |
By the way, I am able to achieve this state
link |
by watching simulations, visualizations
link |
of the thalamocortical complex.
link |
There's a few people I forget from where.
link |
They've created these incredible visual illustrations
link |
of visual stimulation from the eye or something like that.
link |
And this image flowing through the brain.
link |
Wow, I haven't seen that, I gotta check that out.
link |
So it's one of those things,
link |
you find this stuff in the world,
link |
and you see on YouTube, it has 1,000 views,
link |
these visualizations of the human brain
link |
processing information.
link |
And because there's chemistry there,
link |
because this is from actual human brains,
link |
I don't know how they're doing the coloring,
link |
but they're able to actually trace
link |
the different, the chemical and the electrical signals
link |
throughout the brain, and the visual thing,
link |
it's like, whoa, because it looks kinda like the universe,
link |
I mean, the whole thing is just incredible.
link |
I recommend it highly, I'll probably post a link to it.
link |
But you can just look for, one of the things they simulate
link |
is the thalamocortical complex and just visualization.
link |
You can find that yourself on YouTube, but it's beautiful.
link |
The other question I have for you is,
link |
how does memory play into all of this?
link |
Because all the signals sending back and forth,
link |
that's computation and communication,
link |
but that's kinda like processing of inputs and outputs,
link |
to produce outputs in the system,
link |
that's kinda like maybe reasoning,
link |
maybe there's some kind of recurrence.
link |
But is there a storage mechanism that you think about
link |
in the context of neuromorphic computing?
link |
Yeah, absolutely, so that's gotta be central.
link |
You have to have a way that you can store memories.
link |
And there are a lot of different kinds
link |
of memory in the brain.
link |
That's yet another example of how it's not a simple system.
link |
So there's one kind of memory,
link |
one way of talking about memory,
link |
usually starts in the context of Hopfield networks.
link |
You were lucky to talk to John Hopfield on this program.
link |
But the basic idea there is working memory
link |
is stored in the dynamical patterns
link |
of activity between neurons.
link |
And you can think of a certain pattern of activity
link |
as an attractor, meaning if you put in some signal
link |
that's similar enough to other
link |
previously experienced signals like that,
link |
then you're going to converge to the same network dynamics
link |
and you will see these neurons
link |
participate in the same network patterns of activity
link |
that they have in the past.
link |
So you can talk about the probability
link |
that different inputs will allow you to converge
link |
to different basins of attraction
link |
and you might think of that as,
link |
oh, I saw this face and then I excited
link |
this network pattern of activity
link |
because last time I saw that face,
link |
I was at some movie and that's a famous person
link |
that's on the screen or something like that.
link |
So that's one memory storage mechanism.
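The attractor idea described here can be sketched with a minimal Hopfield-style toy network: patterns are stored in symmetric weights via a Hebbian outer-product rule, and a corrupted input relaxes back to the stored pattern. This is a textbook sketch, not the hardware under discussion; the network size and noise level are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def store(patterns):
    # patterns: list of +/-1 vectors; weights are summed outer products.
    n = len(patterns[0])
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)  # no self-connections
    return W / len(patterns)

def recall(W, state, steps=10):
    # Repeatedly threshold the weighted input until the state settles.
    for _ in range(steps):
        state = np.sign(W @ state)
        state[state == 0] = 1
    return state

n = 64
memory = np.where(rng.random(n) > 0.5, 1, -1)  # one stored pattern
W = store([memory])

noisy = memory.copy()
flip = rng.choice(n, size=8, replace=False)    # corrupt 8 of 64 bits
noisy[flip] *= -1

recovered = recall(W, noisy)
print(np.array_equal(recovered, memory))  # the noisy input falls into the basin of attraction
```

The "basin of attraction" language above maps directly onto this dynamics: any input close enough to the stored pattern converges to the same network state.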
link |
But crucial to the ability to imprint those memories
link |
in your brain is the ability to change
link |
the strength of connection between one neuron and another,
link |
that synaptic connection between them.
link |
So synaptic weight update is a massive field of neuroscience
link |
and neuromorphic computing as well.
link |
So there are two poles on that spectrum.
link |
Okay, so more in the language of machine learning,
link |
we would talk about supervised and unsupervised learning.
link |
And when I'm trying to tie that down
link |
to neuromorphic computing,
link |
I will use a definition of supervised learning,
link |
which basically means the external user,
link |
the person who's controlling this hardware
link |
has some knob that they can tune
link |
to change each of the synaptic weights,
link |
depending on whether or not the network
link |
is doing what you want it to do.
link |
Whereas what I mean in this conversation
link |
when I say unsupervised learning
link |
is that those synaptic weights
link |
are dynamically changing in your network
link |
based on nothing that the user is doing;
link |
there's no wire from the outside
link |
going into any of those synapses.
link |
The network itself is reconfiguring those synaptic weights
link |
based on physical properties
link |
that you've built into the devices.
link |
So if the synapse receives a pulse from here
link |
and that causes the neuron to spike,
link |
some circuit built in there with no help from me
link |
or anybody else adjust the weight
link |
in a way that makes it more likely
link |
to store the useful information
link |
and excite the useful network patterns
link |
and makes it less likely that random noise,
link |
useless communication events
link |
will have an important effect on the network activity.
link |
So there's memory encoded in the weights,
link |
the synaptic weights.
link |
What about the formation of something
link |
that's not often done in machine learning,
link |
the formation of new synaptic connections?
link |
Right, well, that seems to,
link |
so again, not a neuroscientist here,
link |
but my reading of the literature
link |
is that that's particularly crucial
link |
in early stages of brain development
link |
where a newborn is born
link |
with tons of extra synaptic connections
link |
and they're actually pruned over time.
link |
So the number of synapses decreases
link |
as opposed to growing new long distance connections.
link |
It is possible in the brain to grow new neurons
link |
and assign new synaptic connections
link |
but it doesn't seem to be the primary mechanism
link |
by which the brain is learning.
link |
So for example, like right now,
link |
sitting here talking to you,
link |
you say lots of interesting things
link |
and I learn from you
link |
and I can remember things that you just said
link |
and I didn't grow new axonal connections
link |
down to new synapses to enable those.
link |
It's plasticity mechanisms
link |
in the synaptic connections between neurons
link |
that enable me to learn on that timescale.
link |
So at the very least,
link |
you can sufficiently approximate that
link |
with just weight updates.
link |
You don't need to form new connections.
link |
I would say weight updates are a big part of it.
link |
I also think there's more
link |
because broadly speaking,
link |
when we're doing machine learning,
link |
our networks, say we're talking about feed forward,
link |
deep neural networks,
link |
the temporal domain is not really part of it.
link |
Okay, you're gonna put in an image
link |
and you're gonna get out a classification
link |
and you're gonna do that as fast as possible.
link |
So you care about time
link |
but time is not part of the essence of this thing really.
link |
Whereas in spiking neural networks,
link |
what we see in the brain,
link |
time is as crucial as space
link |
and they're intimately intertwined
link |
as I've tried to say.
link |
And so adaptation on different timescales
link |
is important not just in memory formation,
link |
although it plays a key role there,
link |
but also in just keeping the activity
link |
in a useful dynamic range.
link |
So you have other plasticity mechanisms,
link |
not just weight update,
link |
or at least not on the timescale
link |
of many action potentials,
link |
but even on the shorter timescale.
link |
So a synapse can become much less efficacious.
link |
It can transmit a weaker signal
link |
after the second, third, or fourth
link |
action potential
link |
in a sequence.
link |
So that's what's called short term synaptic plasticity,
link |
which is a form of learning.
link |
You're learning that I'm getting too much stimulus
link |
from looking at something bright right now.
link |
So I need to tone that down.
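Short-term depression of the kind described here is often modeled as a depletable pool of synaptic resources: each transmitted spike uses some up, and they partially recover between spikes. This is a simplified sketch with assumed constants, not a claim about the biological parameters.

```python
DEPLETE = 0.5    # assumed fraction of resources used per spike
RECOVER = 0.1    # assumed fraction recovered between spikes

resources = 1.0
amplitudes = []
for t in range(5):
    amplitudes.append(resources)              # signal strength of this spike
    resources -= DEPLETE * resources          # depletion from transmitting
    resources += RECOVER * (1.0 - resources)  # partial recovery before next

print([round(a, 3) for a in amplitudes])  # successive closely spaced spikes weaken
```

The second, third, and fourth spikes in the train come out progressively weaker, matching the efficacy drop described above.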
link |
There's also another really important mechanism
link |
in learning that's called metaplasticity.
link |
What that seems to be is a way
link |
that you change not the weights themselves,
link |
but the rate at which the weights change.
link |
So when I am in say a lecture hall and my,
link |
this is a potentially terrible cartoon example,
link |
but let's say I'm in a lecture hall
link |
and it's time to learn, right?
link |
So my brain will release more,
link |
perhaps dopamine or some neuromodulator
link |
that's gonna change the rate
link |
at which synaptic plasticity occurs.
link |
So that can make me more sensitive
link |
to learning at certain times,
link |
more sensitive to overriding previous information
link |
and less sensitive at other times.
link |
And finally, as long as I'm rattling off the list,
link |
I think another concept that falls in the category
link |
of learning or memory adaptation is homeostasis
link |
or homeostatic adaptation,
link |
where neurons have the ability
link |
to control their firing rate.
link |
So if one neuron is just like blasting way too much,
link |
it will naturally tone itself down.
link |
Its threshold will adjust
link |
so that it stays in a useful dynamical range.
link |
And we see that that's captured in deep neural networks
link |
where you don't just change the synaptic weights,
link |
but you can also move the thresholds of simple neurons
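The homeostatic mechanism described here, a neuron adjusting its own threshold to keep its firing rate in a useful range, can be sketched as a simple feedback loop. The target rate and adaptation constant are illustrative assumptions.

```python
import random

TARGET_RATE = 0.1  # assumed desired fraction of time steps with a spike
ADAPT_RATE = 0.01  # assumed threshold adaptation rate

def step(drive, threshold):
    spiked = drive > threshold
    # If firing too often, raise the threshold; too rarely, lower it.
    threshold += ADAPT_RATE * ((1.0 if spiked else 0.0) - TARGET_RATE)
    return spiked, threshold

random.seed(1)
threshold = 0.0  # starts far too low: the neuron spikes on every input
rates = []
for _ in range(5000):
    spiked, threshold = step(random.random(), threshold)
    rates.append(spiked)

recent = sum(rates[-1000:]) / 1000
print(recent)  # the firing rate settles near TARGET_RATE
```

Even though the neuron begins by "blasting way too much," the threshold drifts up until the firing rate sits near the target, keeping the unit in a useful dynamical range with no outside supervision.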
link |
And so to achieve the spiking neural networks,
link |
you want to implement the first principles
link |
that you mentioned of the temporal
link |
and the spatial fractal dynamics here.
link |
So you can communicate locally,
link |
you can communicate across much greater distances
link |
and do the same thing in space
link |
and do the same thing in time.
link |
Now, you have like a chapter called
link |
Superconducting Hardware for Neuromorphic Computing.
link |
So what are some ideas that integrate
link |
some of the things we've been talking about
link |
in terms of the first principles of neuromorphic computing
link |
and the ideas that you outline
link |
in optoelectronic intelligence?
link |
Yeah, so let me start, I guess,
link |
on the communication side of things,
link |
because that's what led us down this track
link |
in the first place.
link |
By us, I'm talking about my team of colleagues at NIST,
link |
Saeed Khan, Bryce Primavera, Sonia Buckley,
link |
Jeff Chiles, Adam McCaughan,
link |
Alex Tait, to name a few,
link |
and our group leaders, Sae Woo Nam and Rich Mirin.
link |
We've all contributed to this.
link |
So this is not me saying necessarily
link |
just the things that I've proposed,
link |
but sort of where our team's thinking
link |
has evolved over the years.
link |
Can I quickly ask, what is NIST
link |
and where is this amazing group of people located?
link |
NIST is the National Institute of Standards and Technology.
link |
The larger facility is out in Gaithersburg, Maryland.
link |
Our team is located in Boulder, Colorado.
link |
NIST is a federal agency under the Department of Commerce.
link |
We do a lot with, by we, I mean other people at NIST,
link |
do a lot with standards,
link |
making sure that we understand the system of units,
link |
international system of units, precision measurements.
link |
There's a lot going on in electrical engineering,
link |
And it's historic.
link |
I mean, it's one of those, it's like MIT
link |
or something like that.
link |
It has a reputation over many decades
link |
of just being this really a place
link |
where there's a lot of brilliant people have done
link |
a lot of amazing things.
link |
But in terms of the people in your team,
link |
in this team of people involved
link |
in the concept we're talking about now,
link |
what kind of disciplines are we talking about?
link |
Mostly physicists and electrical engineers,
link |
some material scientists,
link |
yeah, I think physicists and electrical engineers,
link |
my background is in photonics,
link |
the use of light for technology.
link |
So coming from there, I tend to have found colleagues
link |
that are more from that background.
link |
Although Adam McCaughan comes from
link |
more of a superconducting electronics background,
link |
we need a diversity of folks.
link |
This project is sort of cross disciplinary.
link |
I would love to be working more
link |
with neuroscientists and things,
link |
but we haven't reached that scale yet.
link |
You're focused on the hardware side,
link |
which requires all the disciplines that you mentioned.
link |
And then of course,
link |
neuroscientists may be a source of inspiration
link |
for some of the longterm vision.
link |
I would actually call it more than inspiration.
link |
I would call it sort of a roadmap.
link |
We're not trying to build exactly the brain,
link |
but I don't think it's enough to just say,
link |
oh, neurons kind of work like that.
link |
Let's kind of do that thing.
link |
I mean, we're very much following the concepts
link |
that the cognitive sciences have laid out for us,
link |
which I believe is a really robust roadmap.
link |
I mean, just on a little bit of a tangent,
link |
it's often stated that we just don't understand the brain.
link |
And so it's really hard to replicate it
link |
because we just don't know what's going on there.
link |
And maybe five or seven years ago,
link |
I would have said that,
link |
but as I got more interested in the subject,
link |
I read more of the neuroscience literature
link |
and I was just taken by the exact opposite sense.
link |
I can't believe how much they know about this.
link |
I can't believe how mathematically rigorous
link |
and sort of theoretically complete
link |
a lot of the concepts are.
link |
That's not to say we understand consciousness
link |
or we understand the self or anything like that,
link |
but what is the brain doing
link |
and why is it doing those things?
link |
Neuroscientists have a lot of answers to those questions.
link |
So if you're a hardware designer
link |
that just wants to get going,
link |
whoa, it's pretty clear which direction to go in, I think.
link |
Okay, so I love the optimism behind that,
link |
but in the implementation of these systems
link |
that uses superconductivity, how do you make it happen?
link |
So to me, it starts with thinking
link |
about the communication network.
link |
You know for sure that the ability of each neuron
link |
to communicate to many thousands of colleagues
link |
across the network is indispensable.
link |
I take that as a core principle of my architecture,
link |
my thinking on the subject.
link |
So coming from a background in photonics,
link |
it was very natural to say,
link |
okay, we're gonna use light for communication.
link |
Just in case listeners may not know,
link |
light is often used in communication.
link |
I mean, if you think about radio, that's light,
link |
it's long wavelengths, but it's electromagnetic radiation.
link |
It's the same physical phenomenon
link |
obeying exactly the same Maxwell's equations.
link |
And then all the way down to fiber, fiber optics.
link |
Now you're using visible
link |
or near infrared wavelengths of light,
link |
but the way you send messages across the ocean
link |
is now over optical fibers.
link |
So using light for communication is not a stretch.
link |
It makes perfect sense.
link |
So you might ask, well, why don't you use light
link |
for communication in a conventional microchip?
link |
And the answer to that is, I believe, physical.
link |
If we had a light source on a silicon chip
link |
that was as simple as a transistor,
link |
there would not be a processor in the world
link |
that didn't use light for communication,
link |
at least above some distance.
link |
How many light sources are needed?
link |
Oh, you need a light source at every single point.
link |
A light source per neuron.
link |
Per neuron, per little,
link |
but then if you could have a really small
link |
and nice light source,
link |
your definition of neuron could be flexible.
link |
Could be, yes, yes.
link |
Sometimes it's helpful to me to say,
link |
in this hardware, a neuron is that entity
link |
which has a light source.
link |
That, and I can explain.
link |
And then there was light.
link |
I mean, I can explain more about that, but.
link |
Somehow this like rhymes with consciousness
link |
because people will often say the light of consciousness.
link |
So that consciousness is that which is conscious.
link |
That's not my quote.
link |
That's me, that's my quote.
link |
You see, that quote comes from my background.
link |
Yours is in optics, mine in light, mine's in darkness.
link |
So the point I was making there is that
link |
if it was easy to manufacture light sources
link |
along with transistors on a silicon chip,
link |
they would be everywhere.
link |
And it's not easy.
link |
People have been trying for decades
link |
and it's actually extremely difficult.
link |
I think an important part of our research
link |
is dwelling right at that spot there.
link |
Is it physics or engineering?
link |
So, okay, so it's physics, I think.
link |
So what I mean by that is, as we discussed,
link |
silicon is the material of choice for transistors
link |
and it's very difficult to imagine
link |
that that's gonna change anytime soon.
link |
Silicon is notoriously bad at emitting light.
link |
And that has to do with the immutable properties
link |
of silicon itself.
link |
The way that the energy bands are structured in silicon,
link |
you're never going to make silicon efficient
link |
as a light source at room temperature
link |
without doing very exotic things
link |
that degrade its ability to interface nicely
link |
with those transistors in the first place.
link |
So that's like one of these things where it's,
link |
why is nature dealing us that blow?
link |
You give us these beautiful transistors
link |
and you give us all the motivation
link |
to use light for communication,
link |
but then you don't give us a light source.
link |
So, well, okay, you do give us a light source.
link |
Compound semiconductors,
link |
like we talked about back at the beginning,
link |
an element from group three and an element from group five
link |
form an alloy where every other lattice site
link |
switches which element it is.
link |
Those have much better properties for generating light.
link |
You put electrons in, light comes out.
link |
Almost 100% of the electron-hole pairs recombine to emit light,
link |
it can be made efficient.
link |
I'll take your word for it, okay.
link |
However, I say it's physics, not engineering,
link |
because it's very difficult
link |
to get those compound semiconductor light sources
link |
situated with your silicon.
link |
In order to do that ion implantation
link |
that I talked about at the beginning,
link |
high temperatures are required.
link |
So you gotta make all of your transistors first
link |
and then put the compound semiconductors on top of there.
link |
You can't grow them afterwards
link |
because that requires high temperature.
link |
It screws up all your transistors.
link |
You try and stick them on there.
link |
They don't have the same lattice constant.
link |
The spacing between atoms is different enough
link |
that it just doesn't work.
link |
So nature does not seem to be telling us that,
link |
hey, go ahead and combine light sources
link |
with your digital switches
link |
for conventional digital computing.
link |
And conventional digital computing
link |
will often require smaller scale, I guess,
link |
in terms of, like, a smartphone.
link |
So in which kind of systems does nature hint
link |
that we can use light and photons for communication?
link |
Well, so let me just try and be clear.
link |
You can use light for communication in digital systems,
link |
just the light sources are not intimately integrated with the silicon.
link |
You manufacture all the silicon,
link |
you have your microchip, plunk it down.
link |
And then you manufacture your light sources,
link |
separate chip, completely different process
link |
made in a different foundry.
link |
And then you put those together at the package level.
link |
So now you have some,
link |
I would say a great deal of architectural limitations
link |
that are introduced by that sort of
link |
package level integration
link |
as opposed to monolithic on the same chip integration,
link |
but it's still a very useful thing to do.
link |
And that's where I had done some work previously
link |
before I came to NIST.
link |
There's a project led by Vladimir Stojanović
link |
that now spun out into a company called Ayar Labs
link |
led by Mark Wade and Chen Sun
link |
where they're doing exactly that.
link |
So you have your light source chip,
link |
your silicon chip, whatever it may be doing,
link |
maybe it's digital electronics,
link |
maybe it's some other control purpose, something.
link |
And the silicon chip drives the light source chip
link |
and modulates the intensity of the lights.
link |
You can get data out of the package on an optical fiber.
link |
And that still gives you tremendous advantages in bandwidth
link |
as opposed to sending those signals out
link |
over electrical lines.
link |
But it is somewhat peculiar to my eye
link |
that they have to be integrated at this package level.
link |
And those people, I mean, they're so smart.
link |
Those are my colleagues that I respect a great deal.
link |
So it's very clear that it's not just
link |
they're making a bad choice.
link |
This is what physics is telling us.
link |
It just wouldn't make any sense
link |
to try to stick them together.
link |
Yeah, so even if it's difficult,
link |
it's easier than the alternative, unfortunately.
link |
And again, I need to go back
link |
and make sure that I'm not taken the wrong way.
link |
I'm not saying that the pursuit
link |
of integrating compound semiconductors with silicon
link |
is fruitless and shouldn't be pursued.
link |
It should, and people are doing great work.
link |
Kei May Lau and John Bowers, and others,
link |
they're doing it and they're making progress.
link |
But to my eye, it doesn't look like that's ever going to be
link |
just the standard monolithic light source
link |
on silicon process.
link |
I just don't see it.
link |
Yeah, so nature kind of points the way usually.
link |
And if you resist nature,
link |
you're gonna have to do a lot more work.
link |
And it's gonna be expensive and not scalable.
link |
But okay, so let's go far into the future.
link |
Let's imagine this gigantic neuromorphic computing system
link |
that simulates all of our realities.
link |
Call it the Matrix 4.
link |
So this thing, this powerful computer,
link |
how does it operate?
link |
So what are the neurons?
link |
What is the communication?
link |
What's your sense?
link |
All right, so let me now,
link |
after spending 45 minutes trashing
link |
light source integration with silicon,
link |
let me now say why I'm basing my entire life,
link |
professional life, on integrating light sources with superconductors.
link |
I think the game is completely different
link |
when you're talking about superconducting electronics.
link |
For several reasons, let me try to go through them.
link |
One is that, as I mentioned,
link |
it's difficult to integrate
link |
those compound semiconductor light sources with silicon.
link |
Integration with silicon is a requirement that is introduced
link |
by the fact that you're using semiconducting electronics.
link |
In superconducting electronics,
link |
you're still gonna start with a silicon wafer,
link |
but it's just the bread for your sandwich in a lot of ways.
link |
You're not using that silicon
link |
in precisely the same way for the electronics.
link |
You're now depositing superconducting materials on top of that wafer.
link |
The prospects for integrating light sources
link |
with that kind of an electronic process
link |
are certainly less explored,
link |
but I think much more promising
link |
because you don't need those light sources
link |
to be intimately integrated with the transistors.
link |
That's where the problems come up.
link |
They don't need to be lattice matched to the silicon,
link |
all that kind of stuff.
link |
Instead, it seems possible
link |
that you can take those compound semiconductor light sources,
link |
stick them on the silicon wafer,
link |
and then grow your superconducting electronics
link |
on the top of that.
link |
It's at least not obviously going to fail.
link |
So the computation would be done
link |
on the superconductive material as well?
link |
Yes, the computation is done
link |
in the superconducting electronics,
link |
and the light sources receive signals
link |
that say, hey, a neuron reached threshold,
link |
produce a pulse of light,
link |
send it out to all your downstream synaptic connections.
link |
Those are, again, superconducting electronics.
link |
Perform your computation,
link |
and you're off to the races.
link |
Your network works.
link |
So then if we can rewind real quick,
link |
so what are the limitations of the challenges
link |
of superconducting electronics
link |
when we think about constructing these kinds of systems?
link |
So actually, let me say one other thing
link |
about the light sources,
link |
and then I'll move on, I promise,
link |
because this is probably tedious for some.
link |
This is super exciting.
link |
Okay, one other thing about the light sources.
link |
I said that silicon is terrible at emitting photons.
link |
It's just not what it's meant to do.
link |
However, the game is different
link |
when you're at low temperature.
link |
If you're working with superconductors,
link |
you have to be at low temperature
link |
because they don't work otherwise.
link |
When you're at four Kelvin,
link |
silicon is not obviously a terrible light source.
link |
It's still not as efficient as compound semiconductors,
link |
but it might be good enough for this application.
link |
The final thing that I'll mention about that is, again,
link |
leveraging superconductors, as I said,
link |
in a different context,
link |
superconducting detectors can receive one single photon.
link |
In that conversation, I failed to mention
link |
that semiconductors can also receive photons.
link |
That's the primary mechanism by which it's done.
link |
A camera in your phone that's receptive to visible light
link |
is receiving photons.
link |
It's based on silicon,
link |
or you can make it in different semiconductors
link |
for different wavelengths,
link |
but it requires on the order of a thousand,
link |
a few thousand photons to receive a pulse.
link |
Now, when you're using a superconducting detector,
link |
you need one photon, exactly one.
link |
I mean, one or more.
link |
So the fact that your synapses can now be based
link |
on superconducting detectors
link |
instead of semiconducting detectors
link |
brings the light levels that are required
link |
down by some three orders of magnitude.
link |
So now you don't need good light sources.
link |
You can have the world's worst light sources.
link |
As long as they spit out maybe a few thousand photons
link |
every time a neuron fires,
link |
you have the hardware principles in place
link |
that you might be able to perform
link |
this optoelectronic integration.
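To put rough numbers on that three-orders-of-magnitude point, here is a minimal sketch. The wavelength and photon counts below are illustrative assumptions consistent with the conversation ("on the order of a thousand" photons for a semiconducting receiver, one for a superconducting detector), not measured values.

```python
# Rough energy budget: semiconducting vs. superconducting photon detection.
# Wavelength and photon counts are illustrative assumptions, not exact figures.

h = 6.626e-34   # Planck constant, J*s
c = 3.0e8       # speed of light, m/s

wavelength = 1550e-9                  # assumed telecom-band photon, m
photon_energy = h * c / wavelength    # energy per photon, J

semiconductor_photons = 1000   # order-of-magnitude photons per received pulse
superconductor_photons = 1     # single-photon detector

ratio = semiconductor_photons / superconductor_photons
print(f"photon energy: {photon_energy:.2e} J")
print(f"pulse energy, semiconducting receiver: {semiconductor_photons * photon_energy:.2e} J")
print(f"pulse energy, superconducting receiver: {superconductor_photons * photon_energy:.2e} J")
print(f"required light level drops by a factor of {ratio:.0f}")
```

The point of the sketch is only the ratio: needing one photon instead of a thousand is what lets a weak, inefficient light source still drive a whole fan-out of synapses.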
link |
To me optoelectronic integration is, it's just so enticing.
link |
We want to be able to leverage electronics for computation,
link |
light for communication,
link |
working with silicon microelectronics at room temperature
link |
that has been exceedingly difficult.
link |
And I hope that when we move to the superconducting domain,
link |
target a different application space
link |
that is neuromorphic instead of digital
link |
and use superconducting detectors,
link |
maybe optoelectronic integration comes to us.
link |
Okay, so there's a bunch of questions.
link |
So one is temperature.
link |
So in these kinds of hybrid heterogeneous systems,
link |
what's the temperature?
link |
What are some of the constraints to the operation here?
link |
Does it all have to be at four Kelvin as well?
link |
Everything has to be at four Kelvin.
link |
Okay, so what are the other engineering challenges
link |
of making this kind of optoelectronic systems?
link |
Let me just dwell on that four Kelvin for a second
link |
because some people hear four Kelvin
link |
and they just get up and leave.
link |
They just say, I'm not doing it, you know?
link |
And to me, that's very earth centric, species centric.
link |
We live in 300 Kelvin.
link |
So we want our technologies to operate there too.
link |
Yeah, what's zero Celsius?
link |
Zero Celsius is 273 Kelvin.
link |
So we're talking very, very cold here.
link |
Not even Boston cold.
link |
This is real cold.
link |
Okay, so just for reference,
link |
the temperature of the cosmic microwave background
link |
is about 2.7 Kelvin.
link |
So we're still warmer than deep space.
link |
So that when the universe dies out,
link |
it'll be colder than four K.
link |
It's already colder than four K.
link |
In the expanses, you know,
link |
you don't have to get that far away from the earth
link |
in order to drop down to not far from four Kelvin.
link |
So what you're saying is the aliens that live at the edge
link |
of the observable universe
link |
are using superconductive material for their computation.
link |
They don't have to live at the edge of the universe.
link |
The aliens that are more advanced than us
link |
in their solar system are doing this
link |
in their asteroid belt.
link |
We can get to that.
link |
Oh, because they can get that
link |
to that temperature easier there?
link |
All you have to do is reflect the sunlight away
link |
and you have a huge headstart.
link |
Oh, so the sun is the problem here.
link |
Like it's warm here on earth.
link |
Okay, so can you...
link |
So how do we get to four K?
link |
Well, okay, so what I want to say about temperature...
link |
What I want to say about temperature is that
link |
if you can swallow that,
link |
if you can say, all right, I give up applications
link |
that have to do with my cell phone
link |
and the convenience of a laptop on a train
link |
and you instead...
link |
For me, I'm very much in the scientific head space.
link |
I'm not looking at products.
link |
I'm not looking at what this will be useful
link |
to sell to consumers.
link |
Instead, I'm thinking about scientific questions.
link |
Well, it's just not that bad to have to work at four Kelvin.
link |
We do it all the time in our labs at NIST.
link |
I mean, for reference,
link |
the entire quantum computing sector
link |
usually has to work at something like 100 millikelvin.
link |
So now you're talking another factor of roughly 40
link |
even colder than that, a fraction of a degree.
link |
And everybody seems to think quantum computing
link |
is going to take over the world.
link |
It's so much more expensive
link |
to have to get that extra factor of 10 or whatever colder.
link |
And yet it's not stopping people from investing in that area.
link |
And by investing, I mean putting their research into it
link |
as well as venture capital or whatever.
link |
Oh, so based on the energy of what you're commenting on,
link |
I'm getting a sense that one of the criticisms
link |
of this approach is that 4 Kelvin is a big negative.
link |
It is the showstopper for a lot of people.
link |
They just, I mean, and understandably,
link |
I'm not saying that that's not a consideration.
link |
Okay, so different motivations for different people.
link |
In the academic world,
link |
suppose you spent your whole life
link |
learning about silicon microelectronic circuits.
link |
You send a design to a foundry,
link |
they send you back a chip
link |
and you go test it at your tabletop.
link |
And now I'm saying,
link |
here now learn how to use all these cryogenics
link |
so you can do that at 4 Kelvin.
link |
I don't wanna do that.
link |
It's the old momentum, the turning of the Titanic.
link |
But you're saying that's not too much of a...
link |
When we're looking at large systems
link |
and the gain you can potentially get from them,
link |
that's not that much of a cost.
link |
And when you wanna answer the scientific question
link |
about what are the physical limits of cognition?
link |
Well, the physical limits,
link |
they don't care if you're at 4 Kelvin.
link |
If you can perform cognition at a scale
link |
orders of magnitude beyond any room temperature technology,
link |
but you gotta get cold to do it,
link |
you're gonna do it.
link |
And to me, that's the interesting application space.
link |
It's not even an application space,
link |
that's the interesting scientific paradigm.
link |
So I personally am not going to let low temperature
link |
stop me from realizing a technological domain or realm
link |
that is achieving in most ways everything else
link |
that I'm looking for in my hardware.
link |
So that, okay, that's a big one.
link |
Is there other kind of engineering challenges
link |
that you envision?
link |
So let me take a moment here
link |
because I haven't really described what I mean
link |
by a neuron or a network in this particular hardware.
link |
Yeah, do you wanna talk about loop neurons
link |
and there's so many fascinating...
link |
But you just have so many amazing papers
link |
that people should definitely check out
link |
and the titles alone are just killer.
link |
So anyway, go ahead.
link |
Right, so let me say big picture,
link |
based on optics, photonics for communication,
link |
superconducting electronics for computation,
link |
how does this all work?
link |
So a neuron in this hardware platform
link |
can be thought of as circuits
link |
that are based on Josephson junctions,
link |
like we talked about before,
link |
where every time a photon comes in...
link |
So let's start by talking about a synapse.
link |
A synapse receives a photon, one or more,
link |
from a different neuron
link |
and it converts that optical signal
link |
to an electrical signal.
link |
The amount of current that that adds to a loop
link |
is controlled by the synaptic weight.
link |
So as I said before,
link |
you're popping fluxons into a loop, right?
link |
So a photon comes in,
link |
it hits a superconducting single photon detector,
link |
one photon, the absolute physical minimum
link |
that you can communicate
link |
from one place to another with light.
link |
And that detector then converts that
link |
into an electrical signal
link |
and the amount of signal
link |
is correlated with some kind of weight.
link |
Yeah, so the synaptic weight will tell you
link |
how many fluxons you pop into the loop.
link |
It's an analog number.
link |
We're doing analog computation now.
link |
Well, can you just linger on that?
link |
What the heck is a fluxon?
link |
Are we supposed to know this?
link |
Or is this a funny,
link |
is this like the big bang?
link |
Is this a funny word for something deeply technical?
link |
No, let's try to avoid using the word fluxon
link |
because it's not actually necessary.
link |
It's fun to say though.
link |
So it's very necessary, I would say.
link |
When a photon hits
link |
that superconducting single photon detector,
link |
current is added to a superconducting loop.
link |
And the amount of current that you add
link |
is an analog value,
link |
can have eight bit equivalent resolution,
link |
something like that.
link |
That's amazing, by the way.
link |
This is starting to make a lot more sense.
link |
When you're using superconductors for this,
link |
the energy of that circulating current
link |
is less than the energy of that photon.
link |
So your energy budget is not destroyed
link |
by doing this analog computation.
link |
So now in the language of a neuroscientist,
link |
you would say that's your postsynaptic signal.
link |
You have this current being stored in a loop.
link |
You can decide what you wanna do with it.
link |
Most likely you're gonna have it decay exponentially.
link |
So every single synapse
link |
is gonna have some given time constant.
link |
And that's determined by putting some resistor
link |
in that superconducting loop.
link |
So a synapse event occurs when a photon strikes a detector,
link |
adds current to that loop, it decays over time.
link |
That's the postsynaptic signal.
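That synapse event can be sketched numerically: a photon strike adds current scaled by the synaptic weight, and the resistor in the superconducting loop makes it decay exponentially. The weight, time constant, and units below are placeholders for illustration, not hardware values.

```python
import math

# Sketch of a single synapse event: a photon hits the detector at t_spike,
# current proportional to the synaptic weight is added to the loop, and the
# resistor in the loop makes that current decay with time constant tau.
# All quantities are in arbitrary units.

def postsynaptic_current(t, t_spike, weight, tau):
    """Current circulating in the loop at time t, given one photon at t_spike."""
    if t < t_spike:
        return 0.0
    return weight * math.exp(-(t - t_spike) / tau)

# Example: photon at t=0, weight 1.0, time constant 50 (arbitrary units).
samples = [postsynaptic_current(t, 0.0, 1.0, 50.0) for t in (0.0, 50.0, 100.0)]
print(samples)  # drops by a factor of 1/e every time constant
```

Each synapse gets its own `tau` in this picture, which is the "some given time constant" set by the choice of resistor.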
link |
Then you can process that in a dendritic tree.
link |
Bryce Primavera and I have a paper
link |
that we've submitted about that.
link |
For the more neuroscience oriented people,
link |
there's a lot of dendritic processing,
link |
a lot of plasticity mechanisms you can implement
link |
with essentially exactly the same circuits.
link |
You have this one simple building block circuit
link |
that you can use for a synapse, for a dendrite,
link |
for the neuron cell body, for all the plasticity functions.
link |
It's all based on the same building block,
link |
just tweaking a couple parameters.
link |
So this basic building block
link |
has both an optical and an electrical component,
link |
and then you just build arbitrary large systems with that?
link |
Close, you're not at fault
link |
for thinking that that's what I meant.
link |
What I should say is that if you want it to be a synapse,
link |
you tack a superconducting detector onto the front of it.
link |
And if you want it to be anything else,
link |
there's no optical component.
link |
Got it, so at the front,
link |
optics in the front, electrical stuff in the back.
link |
Electrical, yeah, in the processing
link |
and in the output signal that it sends
link |
to the next stage of processing further.
link |
So the dendritic trees is electrical.
link |
It's all electrical.
link |
It's all electrical in the superconducting domain.
link |
For anybody who's up on their superconducting circuits,
link |
it's just based on a DC SQUID, the most ubiquitous,
link |
which is a circuit composed of two Josephson junctions.
link |
So it's a very bread and butter kind of thing.
link |
And then the only place where you go beyond that
link |
is the neuron cell body itself.
link |
It's receiving all these electrical inputs
link |
from the synapses or dendrites
link |
or however you've structured that particular unique neuron.
link |
And when it reaches its threshold,
link |
which occurs by driving a Josephson junction
link |
above its critical current,
link |
it produces a pulse of current,
link |
which starts an amplification sequence,
link |
voltage amplification,
link |
that produces light out of a transmitter.
link |
So one of our colleagues, Adam McCaughan,
link |
and Sonia Buckley as well,
link |
did a lot of work on the light sources
link |
and the amplifiers that drive the current
link |
and produce sufficient voltage to drive current
link |
through that now semiconducting part.
link |
So that light source is the semiconducting part of a neuron.
link |
And that, so the neuron has reached threshold.
link |
It produces a pulse of light.
link |
That light then fans out across a network of wave guides
link |
to reach all the downstream synaptic terminals
link |
that perform this process themselves.
link |
So it's probably worth explaining
link |
what a network of wave guides is,
link |
because a lot of listeners aren't gonna know that.
link |
Look up the papers by Jeff Chiles on this one.
link |
But basically, light can be guided in a simple,
link |
basically a wire of, usually, an insulating material.
link |
So silicon, silicon nitride,
link |
different kinds of glass,
link |
just like in a fiber optic, it's glass, silicon dioxide.
link |
That makes it a little bit big.
link |
We wanna bring these down.
link |
So we use different materials like silicon nitride,
link |
but basically just imagine a rectangle of some material
link |
that just goes and branches,
link |
forms different branch points
link |
that target different subregions of the network.
link |
You can transition between layers of these.
link |
So now we're talking about building in the third dimension,
link |
which is absolutely crucial.
link |
So that's what wave guides are.
link |
Yeah, that's great.
link |
Why the third dimension is crucial?
link |
Okay, so yes, you were talking about
link |
what are some of the technical limitations.
link |
One of the things that I believe we have to grapple with
link |
is that our brains are miraculously compact.
link |
For the number of neurons that are in our brain,
link |
it sure does fit in a small volume,
link |
as it would have to if we're gonna be biological organisms
link |
that are resource limited and things like that.
link |
Any kind of hardware neuron
link |
is almost certainly gonna be much bigger than that
link |
if it is of comparable complexity,
link |
whether it's based on silicon transistors.
link |
Okay, a transistor, seven nanometers,
link |
that doesn't mean a semiconductor based neuron
link |
is seven nanometers.
link |
They require many transistors,
link |
different other things like capacitors and things
link |
that store charge.
link |
They end up being on the order of 100 microns
link |
and it's difficult to get them down any smaller than that.
link |
The same is true for superconducting neurons,
link |
and the same is true
link |
if we're trying to use light for communication.
link |
Even if you're using electrons for communication,
link |
you have these wires where, okay,
link |
the size of an electron might be angstroms,
link |
but the size of a wire is not angstroms,
link |
and if you try and make it narrower,
link |
the resistance just goes up,
link |
so you don't actually win.
link |
To communicate over long distances,
link |
you need your wires to be microns wide,
link |
and it's the same thing for wave guides.
link |
Wave guides are essentially limited
link |
by the wavelength of light,
link |
and that's gonna be about a micron,
link |
so whereas compare that to an axon,
link |
the analogous component in the brain,
link |
which is on the order of a hundred nanometers in diameter, something like that,
link |
they're bigger when they need to communicate
link |
over long distances,
link |
but grappling with the size of these structures
link |
is inevitable and crucial,
link |
and so in order to make systems of comparable scale
link |
to the human brain, by scale here,
link |
I mean number of interconnected neurons,
link |
you absolutely have to be using
link |
the third spatial dimension,
link |
and that means on the wafer,
link |
you need multiple layers
link |
of both active and passive components.
link |
Active, I mean superconducting electronic circuits
link |
that are performing computations,
link |
and passive, I mean these wave guides
link |
that are routing the optical signals to different places,
link |
you have to be able to stack those.
link |
If you can get to something like 10 planes
link |
of each of those, or maybe not even 10,
link |
maybe five, six, something like that,
link |
then you're in business.
link |
Now you can get millions of neurons on a wafer,
link |
but that's not anywhere close to the brain scale.
link |
In order to get to the scale of the human brain,
link |
you're gonna have to also use the third dimension
link |
in the sense that entire wafers
link |
need to be stacked on top of each other
link |
with fiber optic communication between them,
link |
and we need to be able to fill a space
link |
the size of this table with stacked wafers,
link |
and that's when you can get to some 10 billion neurons
link |
like your human brain,
link |
and I don't think that's specific
link |
to the optoelectronic approach that we're taking.
link |
I think that applies to any hardware
link |
where you're trying to reach commensurate scale
link |
and complexity as the human brain.
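As a back-of-envelope check on those numbers, the arithmetic works out as follows. The per-wafer count and the number of stacked wafers here are illustrative assumptions chosen to match the figures mentioned ("millions of neurons on a wafer", "some 10 billion neurons"), not a design.

```python
# Rough scaling estimate: neurons per wafer, times wafers in the stack.
# Both counts are illustrative assumptions, not engineering figures.

neurons_per_wafer = 1_000_000   # "millions of neurons on a wafer"
wafers_in_stack = 10_000        # hypothetical: stacked wafers filling a table-sized volume

total_neurons = neurons_per_wafer * wafers_in_stack
print(f"total neurons: {total_neurons:.0e}")  # order of the human brain's ~1e10 neurons
```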
link |
So you need that fractal stacking,
link |
so stacking on the wafer,
link |
and stacking of the wafers,
link |
and then whatever the system that combines,
link |
this stacking of the tables with the wafers.
link |
And it has to be fractal all the way,
link |
you're exactly right,
link |
because that's the only way
link |
that you can efficiently get information
link |
from a small point to across that whole network.
link |
It has to have that power-law connectivity.
link |
And it's photons, optics, throughout.
link |
Once you're at this scale, to me it's just obvious.
link |
Of course you're using light for communication.
link |
You have fiber optics given to us from nature, so simple.
link |
The thought of even trying to do
link |
any kind of electrical communication
link |
just doesn't make sense to me.
link |
I'm not saying it's wrong, I don't know,
link |
but that's where I'm coming from.
link |
So let's return to loop neurons.
link |
Why are they called loop neurons?
link |
Yeah, the term loop neurons comes from the fact,
link |
like we've been talking about,
link |
that they rely heavily on these superconducting loops.
link |
So even in a lot of forms of digital computing
link |
with superconductors,
link |
storing a signal in a superconducting loop
link |
is a primary technique.
link |
In this particular case,
link |
it's just loops everywhere you look.
link |
So the strength of a synaptic weight
link |
is gonna be set by the amount of current circulating
link |
in a loop that is coupled to the synapse.
link |
So memory is implemented as current circulating
link |
in a superconducting loop.
link |
The coupling between, say, a synapse and a dendrite
link |
or a synapse in the neuron cell body
link |
occurs through loop coupling through transformers.
link |
So current circulating in a synapse
link |
is gonna induce current in a different loop,
link |
a receiving loop in the neuron cell body.
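That loop-to-loop coupling can be sketched with the usual mutual-inductance relation. The inductances and coupling coefficient below are placeholders for illustration, not circuit values from the project.

```python
import math

# Two superconducting loops coupled through a transformer: current circulating
# in the synapse loop couples flux into the receiving loop in the cell body.
# Flux coupled into the receiver: Phi = M * I_synapse, with M = k*sqrt(L1*L2).

def coupled_flux(i_synapse, l_synapse, l_receiver, k):
    """Flux (Wb) induced in the receiving loop by the synapse-loop current."""
    mutual = k * math.sqrt(l_synapse * l_receiver)  # mutual inductance, H
    return mutual * i_synapse

# Placeholder values: 10 uA circulating, 100 pH loops, coupling k = 0.5.
phi = coupled_flux(10e-6, 100e-12, 100e-12, 0.5)
print(f"coupled flux: {phi:.2e} Wb")
```

The circulating current in the synapse loop is the stored analog weight, so this transformer coupling is how that memory influences the downstream computation.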
link |
So since all of the computation is happening
link |
in these flux storage loops
link |
and they play such a central role
link |
in how the information is processed,
link |
how memories are formed, all that stuff,
link |
I didn't think too much about it,
link |
I just called them loop neurons
link |
because it rolls off the tongue a little bit better
link |
than superconducting optoelectronic neurons.
link |
Okay, so how do you design circuits for these loop neurons?
link |
That's a great question.
link |
There's a lot of different scales of design.
link |
So at the level of just one synapse,
link |
you can use conventional methods.
link |
They're not that complicated
link |
as far as superconducting electronics goes.
link |
It's just four Josephson junctions or something like that
link |
depending on how much complexity you wanna add.
link |
So you can just directly simulate each component in SPICE.
link |
SPICE is standard electrical circuit simulation software, basically.
link |
So you're just explicitly solving the differential equations
link |
that describe the circuit elements.
link |
And then you can stack these things together
link |
in that simulation software to then build circuits.
link |
You can, but that becomes computationally expensive.
link |
So one of the things when COVID hit,
link |
we knew we had to turn some attention
link |
to more things you can do at home in your basement
link |
or whatever, and one of them was computational modeling.
link |
So we started working on adapting,
link |
abstracting out the circuit performance
link |
so that you don't have to explicitly solve
link |
the circuit equations, which for Josephson junctions
link |
usually needs to be done on like a picosecond timescale
link |
and you have a lot of nodes in your circuit.
link |
So it results in a lot of differential equations
link |
that need to be solved simultaneously.
link |
We were looking for a way to simulate these circuits
link |
that is scalable up to networks of millions or so neurons
link |
which is sort of where we're targeting right now.
link |
So we were able to analyze the behavior of these circuits.
link |
And as I said, it's based on these simple building blocks.
link |
So you really only need to understand
link |
this one building block.
link |
And if you get a good model of that, boom, it tiles.
link |
And you can change the parameters in there
link |
to get different behaviors and stuff,
link |
but it's all based on now it's one differential equation
link |
that you need to solve.
link |
So one differential equation for every synapse,
link |
dendrite or neuron in your system.
link |
And for the neuroscientists out there,
link |
it's just a simple leaky integrate and fire model,
link |
leaky integrator, basically.
link |
A synapse is a leaky integrator,
link |
a dendrite is a leaky integrator.
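For concreteness, here is a minimal leaky integrate-and-fire sketch of the one-equation-per-element idea being described. The parameters, the constant drive, and the forward-Euler time step are illustrative assumptions, not the actual NIST simulator.

```python
# Leaky integrate-and-fire sketch: ds/dt = -s/tau + input, fire on threshold.
# One such equation per synapse, dendrite, or neuron; solved on a coarse grid
# rather than the picosecond grid full circuit simulation would require.
# All parameters are illustrative.

def simulate_lif(inputs, tau=20.0, dt=1.0, threshold=1.0):
    """Forward-Euler integration of one leaky integrator with spike-and-reset."""
    s, spikes = 0.0, []
    for step, drive in enumerate(inputs):
        s += dt * (-s / tau + drive)   # leaky integration of the input
        if s >= threshold:             # neuron reaches threshold...
            spikes.append(step)        # ...records a spike...
            s = 0.0                    # ...and resets
    return spikes

# A constant drive pushes the state over threshold at a regular rate.
spikes = simulate_lif([0.1] * 100)
print(spikes)
```

Tiling many copies of this one building block, with different parameters, is what makes the reduced model so much cheaper than solving the full circuit equations.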
link |
So I'm really fascinated by how this one simple component
link |
can be used to achieve lots of different types
link |
of dynamical activity.
link |
And to me, that's where scalability comes from.
link |
And also complexity as well.
link |
Complexity is often characterized
link |
by relatively simple building blocks
link |
connected in potentially simple
link |
or sometimes complicated ways,
link |
and then emergent new behavior that was hard to predict
link |
from those simple elements.
link |
And that's exactly what we're working with here.
link |
So it's a very exciting platform,
link |
both from a modeling perspective
link |
and from a hardware manifestation perspective
link |
where we can hopefully start to have this test bed
link |
where we can explore things,
link |
not just related to neuroscience,
link |
but also related to other things
link |
that connect to other physics, like critical phenomena,
link |
Ising models, things like that.
link |
So you were asking how we simulate these circuits.
link |
It's at different levels
link |
and we've got the simple spice circuit stuff.
link |
That's no problem.
link |
And now we're building these network models
link |
based on this more efficient leaky integrator.
link |
So we can actually reduce every element
link |
to one differential equation.
link |
And then we can also step through it
link |
on a much coarser time grid.
link |
So it ends up being something like a factor
link |
of a thousand to 10,000 speed improvement,
link |
which allows us to simulate,
link |
but hopefully up to millions of neurons.
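A rough sketch of why the one-equation-per-element reduction scales: the whole network becomes a single vectorized update per coarse time step. Everything here, the weights, the drive, and the time grid, is an illustrative stand-in, not the actual simulation code:

```python
import numpy as np

# Hypothetical sketch: a network of leaky integrators advanced
# as one vectorized update per coarse time step.

def simulate_network(W, drive, tau, dt, n_steps, threshold=1.0):
    """W[i, j]: synaptic weight from unit j to unit i.
    Each unit obeys ds/dt = -s/tau + W @ spikes + drive."""
    n = W.shape[0]
    s = np.zeros(n)
    fired = np.zeros(n)
    spike_counts = np.zeros(n, dtype=int)
    for _ in range(n_steps):
        s += dt * (-s / tau + W @ fired + drive)
        fired = (s >= threshold).astype(float)
        spike_counts += fired.astype(int)
        s[fired > 0] = 0.0  # reset units that fired
    return spike_counts
```

Because each coarse step is one matrix-vector update rather than a stiff circuit solve, stepping millions of units becomes tractable.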
link |
Whereas before we would have been limited to tens,
link |
a hundred, something like that.
link |
And just like simulating quantum mechanical systems
link |
with a quantum computer.
link |
So the goal here is to understand such systems.
link |
For me, the goal is to study this
link |
as a scientific physical system.
link |
I'm not drawn towards turning this
link |
into an enterprise at this point.
link |
I feel short term applications
link |
that obviously make a lot of money
link |
are not necessarily a curiosity driver for you at the moment.
link |
If you're interested in short term making money,
link |
go with deep learning, use silicon microelectronics.
link |
If you wanna understand things like the physics
link |
of a fascinating system,
link |
or if you wanna understand something more
link |
along the lines of the physical limits
link |
of what can be achieved,
link |
then I think single photon communication,
link |
superconducting electronics is extremely exciting.
link |
What if I wanna use superconducting hardware
link |
at four Kelvin to mine Bitcoin?
link |
That's my main interest.
link |
The reason I wanted to talk to you today,
link |
I wanna say, no, I don't know.
link |
Look it up on the internet.
link |
Somebody told me about it.
link |
I'm not sure exactly what it is.
link |
But let me ask nevertheless
link |
about applications to machine learning.
link |
Okay, so if you look at the scale of five, 10, 20 years,
link |
is it possible to, before we understand the nature
link |
of human intelligence and general intelligence,
link |
do you think we'll start falling out of this exploration
link |
of neuromorphic systems ability to solve some
link |
of the problems that the machine learning systems
link |
of today can't solve?
link |
Well, I'm really hesitant to over promise.
link |
So I really don't know.
link |
Also, I don't really understand machine learning
link |
in a lot of senses.
link |
I mean, machine learning from my perspective appears
link |
to require that you know precisely what your input is
link |
and also what your goal is.
link |
You usually have some objective function
link |
or something like that.
link |
And that's very limiting.
link |
I mean, of course, a lot of times that's the case.
link |
There's a picture and there's a horse in it, so you're done.
link |
But that's not a very interesting problem.
link |
I think when I think about intelligence,
link |
it's almost defined by the ability to handle problems
link |
where you don't know what your inputs are going to be
link |
and you don't even necessarily know
link |
what you're trying to accomplish.
link |
I mean, I'm not sure what I'm trying to accomplish
link |
Yeah, at all scales.
link |
Yeah, at all scales, right.
link |
I mean, so I'm more drawn to the underlying phenomena,
link |
the critical dynamics of this system,
link |
trying to understand how elements that you build
link |
into your hardware result in emergent fascinating activity
link |
that was very difficult to predict, things like that.
link |
So, but I gotta be really careful
link |
because I think a lot of other people who,
link |
if they found themselves working on this project
link |
in my shoes, they would say, all right,
link |
what are all the different ways we can use this
link |
for machine learning?
link |
Actually, let me just definitely mention colleague
link |
at NIST, Mike Schneider.
link |
He's also very much interested,
link |
particularly in the superconducting side of things,
link |
using the incredible speed, power efficiency,
link |
also Ken Seagal at Colgate,
link |
other people working on specifically
link |
the superconducting side of this for machine learning
link |
and deep feed forward neural networks.
link |
There, the advantages are obvious.
link |
It's extremely fast.
link |
Yeah, so that's less on the nature of intelligence
link |
and more on various characteristics of this hardware
link |
that you can use for the basic computation
link |
as we know it today and communication.
link |
One of the things that Mike Schneider's working on right now
link |
is an image classifier at a relatively small scale.
link |
I think he's targeting that nine pixel problem
link |
where you can have three different characters
link |
and you put in a nine pixel image
link |
and you classify it as one of these three categories.
link |
And that's gonna be really interesting
link |
to see what happens there,
link |
because if you can show that even at that scale,
link |
you just put these images in and you get it out
link |
and he thinks he can do it,
link |
I forgot if it's a nanosecond
link |
or some extremely fast classification time,
link |
it's probably less,
link |
it's probably a hundred picoseconds or something.
link |
There you have challenges though,
link |
because the Josephson junctions themselves,
link |
the electronic circuit is extremely power efficient.
link |
some orders of magnitude more efficient
link |
than a transistor doing the same thing,
link |
but when you have to cool it down to four Kelvin,
link |
you pay a huge overhead just for keeping it cold,
link |
even if it's not doing anything.
link |
So it has to work at large scale
link |
in order to overcome that power penalty,
link |
but that's possible.
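The scale argument can be put in back-of-the-envelope form. All numbers below are assumptions for illustration (the Josephson and CMOS energies per operation, the roughly thousandfold wall-plug cost of cooling to four Kelvin, a fixed cryostat load), not measured values:

```python
# Back-of-the-envelope sketch of the cooling-overhead argument.
# All numbers are illustrative assumptions, not measured values.

def superconducting_advantage(n_ops_per_s,
                              jj_energy=1e-19,        # J/op, assumed
                              cmos_energy=1e-15,      # J/op, assumed
                              cooling_overhead=1000,  # W at 300 K per W at 4 K
                              cryostat_idle_w=5e3):   # fixed cooling load, assumed
    """Wall-plug power ratio CMOS / superconducting.
    A value above 1 means the superconducting system wins
    despite the refrigerator."""
    sc_power = cooling_overhead * (n_ops_per_s * jj_energy) + cryostat_idle_w
    cmos_power = n_ops_per_s * cmos_energy
    return cmos_power / sc_power
```

With these made-up numbers, the refrigerator's fixed cost dominates at small scale and the per-operation advantage wins at large scale, which is exactly the trade-off being described.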
link |
It's just, it's gonna have to get that performance.
link |
And this is sort of what you were asking about before
link |
is like how much better than silicon would it need to be?
link |
And the answer is, I don't know.
link |
I think if it's just overall better than silicon
link |
at a problem that a lot of people care about,
link |
maybe it's image classification,
link |
maybe it's facial recognition,
link |
maybe it's monitoring credit transactions, I don't know,
link |
then I think it will have a place.
link |
It's not gonna be in your cell phone,
link |
but it could be in your data center.
link |
So what about in terms of the data center,
link |
I don't know if you're paying attention
link |
to the various systems,
link |
like Tesla recently announced DOJO,
link |
which is a large scale machine learning training system,
link |
that again, the bottleneck there
link |
is probably going to be communication
link |
between those systems.
link |
Is there something from your work
link |
on everything we've been talking about
link |
in terms of superconductive hardware
link |
that could be useful there?
link |
Oh, I mean, okay, tomorrow, no.
link |
In the long term, it could be the whole thing.
link |
It could be nothing.
link |
I don't know, but definitely, definitely.
link |
When you look at the,
link |
so I don't know that much about DOJO.
link |
My understanding is that that's new, right?
link |
That's just coming online.
link |
Well, I don't even know whether it has come online.
link |
And when you announce big, sexy,
link |
so let me explain to you the way things work
link |
in the world of business and marketing.
link |
It's not always clear where you are
link |
on the coming online part of that.
link |
So I don't know where they are exactly,
link |
but the vision is from a ground up
link |
to build a very, very large scale,
link |
modular machine learning, ASIC,
link |
basically hardware that's optimized
link |
for training neural networks.
link |
And of course, there's a lot of companies
link |
that are small and big working on this kind of problem.
link |
The question is how to do it in a modular way
link |
that has very fast communication.
link |
The interesting aspect of Tesla is you have a company
link |
that at least at this time is so singularly focused
link |
on solving a particular machine learning problem
link |
and is making obviously a lot of money doing so
link |
because the machine learning problem
link |
happens to be involved with autonomous driving.
link |
So you have a system that's driven by an application.
link |
And that's really interesting because you have maybe Google
link |
working on TPUs and so on.
link |
You have all these other companies with ASICs.
link |
They're usually kind of always thinking more generally.
link |
So I like it when it's driven by a particular application
link |
because then you can really get to the,
link |
it's somehow if you just talk broadly about intelligence,
link |
you may not always get to the right solutions.
link |
It's nice to couple that sometimes
link |
with specific clear illustration
link |
of something that requires general intelligence,
link |
which for me driving is one such case.
link |
I think you're exactly right.
link |
Sometimes just having that focus on that application
link |
brings a lot of people focuses their energy and attention.
link |
I think that, so one of the things that's appealing
link |
about what you're saying is not just
link |
that the application is specific,
link |
but also that the scale is big
link |
and that the benefit is also huge.
link |
Financial and to humanity.
link |
Right, right, right.
link |
Yeah, so I guess let me just try to understand
link |
is the point of this dojo system
link |
to figure out the parameters
link |
that then plug into neural networks
link |
and then you don't need to retrain,
link |
you just make copies of a certain chip
link |
that has all the other parameters established or?
link |
No, it's straight up retraining a large neural network
link |
over and over and over.
link |
So you have to do it once for every new car?
link |
No, no, you have to, so they do this interesting process,
link |
which I think is a process for machine learning,
link |
supervised machine learning systems
link |
you're going to have to do, which is you have a system,
link |
you train your network once, it takes a long time.
link |
I don't know how long, but maybe a week.
link |
And then you deploy it on, let's say about a million cars.
link |
I don't know what the number is.
link |
But that part, you just write software
link |
that updates some weights in a table and yeah, okay.
link |
But there's a loop back.
link |
Each of those cars run into trouble, rarely,
link |
but they catch the edge cases
link |
of the performance of that particular system
link |
and then send that data back
link |
and either automatically or by humans,
link |
that weird edge case data is annotated
link |
and then the network has to become smart enough
link |
to now be able to perform in those edge cases,
link |
so it has to get retrained.
link |
There's clever ways of retraining different parts
link |
of that network, but for the most part,
link |
I think they prefer to retrain the entire thing.
link |
So you have this giant monster
link |
that kind of has to be retrained regularly.
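The deploy, collect edge cases, annotate, retrain loop described here is sometimes called a data engine. Here's a toy, runnable sketch, where a "model" is just the set of scenarios it handles and every name is an illustrative placeholder, not Tesla's actual pipeline:

```python
# Toy sketch of the deploy/collect/retrain loop ("data engine").
# A "model" here is simply the set of scenarios it handles.

def train(handled, dataset):
    """Full retrain ~ relearning every labeled case from scratch."""
    return set(dataset)

def collect_failures(handled, fleet_scenarios):
    """Cars report the rare edge cases the current model gets wrong."""
    return [s for s in fleet_scenarios if s not in handled]

def data_engine(dataset, fleet_scenarios, iterations=3):
    handled = set()
    for _ in range(iterations):
        handled = train(handled, dataset)       # slow full retrain
        edge_cases = collect_failures(handled, fleet_scenarios)
        dataset = dataset + edge_cases          # annotate and add
    return handled
```

Each pass through the loop makes the model "a little less dumb" on exactly the cases the fleet surfaced.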
link |
I think the vision with Dojo is to have
link |
a very large machine learning focused,
link |
driving focused supercomputer
link |
that then is sufficiently modular
link |
that can be scaled to other machine learning applications.
link |
So they're not limiting themselves completely
link |
to this particular application,
link |
but this application is the way they kind of test
link |
this iterative process of machine learning
link |
is you make a system that's very dumb,
link |
deploy it, get the edge cases where it fails,
link |
make it a little smarter, it becomes a little less dumb
link |
and that iterative process achieves something
link |
that you can call intelligent or is smart enough
link |
to be able to solve this particular application.
link |
So it has to do with training neural networks fast
link |
and training neural networks that are large.
link |
But also based on an extraordinary amount of diverse input.
link |
And that's one of the things,
link |
so this does seem like one of those spaces
link |
where the scale of superconducting optoelectronics,
link |
the way that, so when you talk about the weaknesses,
link |
like I said, okay, well, you have to cool it down.
link |
At this scale, that's fine.
link |
Because that's not too much of an added cost.
link |
Most of your power is being dissipated
link |
by the circuits themselves, not the cooling.
link |
And also you have one centralized kind of cognitive hub,
link |
And so if we're talking about putting
link |
a superconducting system in a car, that's questionable.
link |
Do you really want a cryostat
link |
in the trunk of everyone's car?
link |
It'll fit, it's not that big of a deal,
link |
but hopefully there's a better way, right?
link |
But since this is sort of a central supreme intelligence
link |
or something like that,
link |
and it needs to really have this massive data acquisition,
link |
massive data integration,
link |
I would think that that's where large scale
link |
spiking neural networks with vast communication
link |
and all these things would have something
link |
pretty tremendous to offer.
link |
It's not gonna happen tomorrow.
link |
There's a lot of development that needs to be done.
link |
But we have to be patient with self driving cars
link |
for a lot of reasons.
link |
We were all optimistic that they would be here by now.
link |
And okay, they are to some extent,
link |
but if we're thinking five or 10 years down the line,
link |
it's not unreasonable.
link |
One other thing, let me just mention,
link |
getting into self driving cars and technologies
link |
that are using AI out in the world,
link |
this is something NIST cares a lot about.
link |
Elham Tabassi is leading up a much larger effort in AI
link |
at NIST than my little project.
link |
And really central to that mission
link |
is this concept of trustworthiness.
link |
So when you're going to deploy this neural network
link |
in every single automobile with so much on the line,
link |
you have to be able to trust that.
link |
So now how do we know that we can trust that?
link |
How do we know that we can trust the self driving car
link |
or the supercomputer that trained it?
link |
There's a lot of work there
link |
and there's a lot of that going on at NIST.
link |
And it's still early days.
link |
I mean, you're familiar with the problem and all that.
link |
But there's a fascinating dance in engineering
link |
with safety critical systems.
link |
There's a desire in computer science,
link |
just recently talked to Don Knuth,
link |
for algorithms and for systems,
link |
for them to be provably correct or provably safe.
link |
And this is one other difference
link |
between engineered systems and biological ones:
link |
we're not provably anything.
link |
And so there's some aspect of imperfection
link |
that we need to have built in,
link |
like robustness to imperfection be part of our systems,
link |
which is a difficult thing for engineers to contend with.
link |
They're very uncomfortable with the idea
link |
that you have to be okay with failure
link |
and almost engineer failure into the system.
link |
Mathematicians hate it too.
link |
But I think it was Turing who said something
link |
along the lines of,
link |
I can give you an intelligent system
link |
or I can give you a flawless system,
link |
but I can't give you both.
link |
And it's in sort of creativity and abstract thinking
link |
seem to rely somewhat on stochasticity
link |
and not having components
link |
that perform exactly the same way every time.
link |
This is where like the disagreement I have with,
link |
not disagreement, but a different view of the world.
link |
but when I talk to robotic, robot colleagues,
link |
that sounds like I'm talking to robots,
link |
colleagues that are roboticists,
link |
the goal is perfection.
link |
And to me it's like, no,
link |
I think the goal should be imperfection
link |
that's communicated.
link |
And through the interaction between humans and robots,
link |
that imperfection becomes a feature, not a bug.
link |
Like together, seen as a system,
link |
the human and the robot together
link |
are better than either of them individually,
link |
but the robot itself is not perfect in any way.
link |
Of course, there's a bunch of disagreements,
link |
including with Mr. Elon about,
link |
to me, autonomous driving is fundamentally
link |
a human robot interaction problem,
link |
not a robotics problem.
link |
To Elon, it's a robotics problem.
link |
That's actually an open and fascinating question,
link |
whether humans can be removed from the loop completely.
link |
We've talked about a lot of fascinating chemistry
link |
and physics and engineering,
link |
and we're always running up against this issue
link |
that nature seems to dictate what's easy and what's hard.
link |
So you have this cool little paper
link |
that I'd love to just ask you about.
link |
Does Cosmological Evolution Select for Technology?
link |
So in physics, there's parameters
link |
that seem to define the way our universe works,
link |
that physics works, that if it worked any differently,
link |
we would get a very different world.
link |
So it seems like the parameters are very fine tuned
link |
to the kind of physics that we see.
link |
All the beautiful E equals mc squared,
link |
all these nice, beautiful laws.
link |
It seems like very fine tuned for that.
link |
So what you argue in this article
link |
is it may be that the universe has also fine tuned
link |
its parameters that enable the kind of technological
link |
innovation that we see, the technology that we see.
link |
Can you explain this idea?
link |
Yeah, I think you've introduced it nicely.
link |
Let me just try to say a few things in my own language
link |
and lay out what this fine tuning problem is.
link |
So physicists have spent centuries trying to understand
link |
the system of equations that govern the way nature behaves,
link |
the way particles move and interact with each other.
link |
And as that understanding has become more clear over time,
link |
it became sort of evident that it's all well adjusted
link |
to allow a universe like we see, very complex,
link |
this large, long lived universe.
link |
And so one answer to that is, well, of course it is
link |
because we wouldn't be here otherwise.
link |
But I don't know, that's not very satisfying.
link |
That's sort of, that's what's known
link |
as the weak anthropic principle.
link |
It's a statement of selection bias.