There is an exponentially growing gap between our appetite for computation and what Moore’s Law can deliver. Our civilization is based on computation — we must find technologies that transcend the current limitations of architecture and imagination.
While not an actual law, Gordon Moore’s 1965 observation that the number of components on a single silicon chip doubles at a regular cadence (revised in 1975 to roughly every two years) has held for decades. All exponentials in the physical world must come to an end, and despite the incredible achievements of semiconductor manufacturing, Moore’s Law is no exception.
There was a time when single-core CPU performance doubled every year and a half. Now it doubles roughly every 20 years, and we have had to resort to multicore approaches to improve overall performance. That has worked for the last 15 years, but at some point you run into a different law, Amdahl’s Law, which says that the serial portion of a program caps the total speedup you can get, so each incremental core contributes less.
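To make that concrete, here is a minimal sketch of Amdahl’s Law: with serial fraction s, the best possible speedup on N cores is 1 / (s + (1 - s) / N). The 5% serial fraction below is an illustrative assumption, not a measurement.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Upper bound on speedup for a program with the given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# Even with only 5% serial work, 64 cores deliver nowhere near a 64x speedup.
for cores in (2, 8, 64, 1024):
    print(f"{cores:>5} cores -> {amdahl_speedup(0.05, cores):5.1f}x")
# The speedup approaches the asymptote 1 / 0.05 = 20x no matter how many
# cores we add; that is the diminishing return per incremental core.
```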
Clock speeds have barely increased in 15 years. Robert Dennard observed in 1974 that as transistors shrink, their operating voltage and current scale down with their linear dimensions, so power density stays roughly constant. That scaling held sway until about 2006, when leakage current from ever-smaller transistors created a power wall that has kept CPUs from practically operating much above 4 GHz.
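A back-of-the-envelope sketch, using the standard dynamic switching-power relation P ≈ C·V²·f with made-up unit values, shows why losing voltage scaling became a power wall:

```python
def dynamic_power(capacitance: float, voltage: float, frequency: float) -> float:
    """Classic switching-power approximation: P ≈ C * V^2 * f."""
    return capacitance * voltage**2 * frequency

# Dennard scaling: shrink linear dimensions by k, so C and V drop by 1/k
# while frequency can rise by k. Power per transistor falls by k^2.
k = 1.4  # roughly one process generation (~0.7x linear shrink)
p_old = dynamic_power(1.0, 1.0, 1.0)
p_dennard = dynamic_power(1.0 / k, 1.0 / k, k)
print(p_dennard / p_old)   # ~1/k^2: the free lunch Dennard described

# Post-2006: leakage keeps voltage roughly fixed, so per-transistor power
# no longer falls even as density keeps doubling. Power density climbs
# each generation, and the only escape valve is to stop raising the clock.
p_fixed_v = dynamic_power(1.0 / k, 1.0, k)
print(p_fixed_v / p_old)   # ~1x per transistor, at ~2x the transistor density
```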
We are also approaching the lower physical limits of feature sizes in silicon, nearing a regime where quantum effects such as tunneling dominate and classical transistors stop behaving reliably.
At the same time, the rise of deep learning is creating a hyperexponential demand for compute, with gigantic models growing at an eye-watering rate. Moore’s Law has traditionally been thought of as a doubling every two years. We now have a class of machine learning models whose compute demands are increasing by a factor of 15x every two years, and more recently, transformer networks are driving that demand at 750x every two years.
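Compounding the growth rates quoted above over a decade makes the mismatch obvious. This is a simple illustrative calculation with those figures, not a forecast:

```python
# Two-year growth factors cited above, compounded over ten years (5 periods).
growth = {"Moore's Law": 2, "ML models broadly": 15, "Transformer models": 750}
periods = 10 / 2
for name, factor in growth.items():
    print(f"{name:>20}: {factor ** periods:,.0f}x over ten years")
# Supply grows ~32x while demand grows by factors of hundreds of thousands
# to hundreds of trillions; the gap compounds brutally.
```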
So what are the paths forward? What new approaches and architectures will allow us to overcome these barriers?
You may have heard of the Von Neumann bottleneck, a limitation built into the way our current CPU architectures work; we have been doing it this way for roughly 80 years. These CPUs are breathing through a straw, serially pulling instructions and data one at a time out of memory. Most improvements in memory have been in capacity rather than speed, and consequently modern CPUs tend to sit logjammed, waiting to fetch the next instruction.
Many people have observed that the Von Neumann bottleneck is not only physical but intellectual: a failure of imagination where we’re stuck in that old paradigm of how computing is meant to work. There are other intriguing possibilities that take a different path.
A related problem is the “memory wall,” the growing disparity in speed between the compute elements and the memory that feeds them. Data movement is now the dominant source of power consumption in many devices and is nearly always the limiting factor in deep learning systems.
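A rough illustration of why data movement dominates, using commonly cited order-of-magnitude energy figures (roughly a picojoule for a 32-bit floating-point operation versus hundreds of picojoules to fetch a word from off-chip DRAM; the exact numbers vary by process node and memory technology):

```python
# Illustrative, order-of-magnitude energy costs; the point is the ratio,
# not the exact values, which vary widely across hardware.
PJ_PER_FP32_OP = 1.0      # ~1 pJ for a 32-bit floating-point operation
PJ_PER_DRAM_WORD = 640.0  # hundreds of pJ to fetch a 32-bit word from DRAM

# A dot product of two length-n vectors streams two words per multiply-add
# from off-chip memory, so nearly all of the energy goes into data movement.
n = 1_000_000
compute_energy = 2 * n * PJ_PER_FP32_OP       # n multiplies + n adds
movement_energy = 2 * n * PJ_PER_DRAM_WORD    # fetch both operands each step
print(f"data movement costs {movement_energy / compute_energy:.0f}x the arithmetic")
```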
The principal algorithm behind AI, backpropagation, is also now about 40 years old. Recent remarkable improvements in deep learning performance have come from brute force rather than the art of the algorithm. But this profligate growth of models is unsustainable: our largest models already cost tens of millions of dollars to train.
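For readers who have never seen it spelled out, here is a minimal sketch of that 40-year-old idea: backpropagation is just the chain rule applied layer by layer, pushing error gradients backwards through the network. This toy NumPy example learns XOR; it illustrates the algorithm, not how production systems implement it.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)    # tiny two-layer network
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 1.0

for _ in range(10_000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the output error back through each layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient-descent update
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(np.round(out.ravel(), 2))   # should approach [0, 1, 1, 0] for most seeds
```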
Using electrons to compute isn’t ideal, either. They generate a lot of heat when they move through wires. They’re physical things that have been convenient to engineer to this point, but they have real limitations in terms of scale and transmission over long distances.
All of this arguably comes from the nature of quantum physics, which is both a limitation and an opportunity. The reason we can’t make transistors much smaller is that once a transistor’s gate is only a handful of atoms across, you can no longer be sure where an electron is because of quantum uncertainty, and there is no way around that in a traditional computer architecture.
Not only can we use quantum physics to our advantage to build new kinds of computation, but we can use those machines to simulate the quantum nature of the world and, in turn, build entirely new technologies and materials.
The Paths Forward
NextSilicon has pioneered a post-Von Neumann technology that produces what they call “mill cores,” which you could think of as software automatically generating custom silicon to run existing software in an entirely new way. This approach allows them to get a 10x to 100x performance advantage over CPUs on some classes of computation.
For the memory wall, why not perform computation in the memory itself? D-Matrix has built a technology that blends the arithmetic of deep learning with storage in the same part of the silicon: operations happen in situ, largely avoiding data movement.
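As a purely conceptual illustration of why compute-in-memory helps (this is not a description of D-Matrix’s actual architecture, and the layer sizes are hypothetical), compare how many bytes must cross the memory interface for a matrix-vector multiply when the weights stay put versus when they are streamed in for every inference:

```python
# Hypothetical layer dimensions, chosen only for illustration.
ROWS, COLS, BYTES_PER_VALUE = 4096, 4096, 2    # 16-bit weights and activations

def bytes_moved(inferences: int, weights_stay_in_memory: bool) -> int:
    """Bytes crossing the memory interface for repeated matrix-vector products."""
    activations = inferences * (COLS + ROWS) * BYTES_PER_VALUE   # inputs + outputs
    if weights_stay_in_memory:
        return activations                                       # weights never move
    weights = inferences * ROWS * COLS * BYTES_PER_VALUE         # re-fetched each time
    return activations + weights

n = 1000
streamed = bytes_moved(n, weights_stay_in_memory=False)
in_situ = bytes_moved(n, weights_stay_in_memory=True)
print(f"~{streamed / in_situ:.0f}x less traffic when the math happens at the weights")
```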
MosaicML is bringing the art of the algorithm back to machine learning, applying state-of-the-art advances in AI to greatly reduce the power required to achieve a given level of fidelity, and then operationalizing those techniques so that the Fortune 50 can actually buy them and use them in a sensible, scalable way.
Ayar Labs uses photons instead of electrons, allowing high-performance photonic interconnects to be deeply embedded directly onto the silicon. This lets architectures scale in ways that have never been possible with copper.
PsiQuantum is building a machine that not only uses photons for computation, but can unlock technologies and materials that will inevitably accelerate the ways we build computation in the future.
Moving Beyond
This is by no means an exhaustive list of meaningful investments that will transform computation, but these companies will have a central role in moving us beyond Moore’s Law.