Saturday, May 18, 2024

Power and computing

The Wall Street Journal last week had an article (sorry about the paywall) titled "There’s Not Enough Power for America’s High-Tech Ambitions", about how there is enormous demand for more data centers (think Amazon Web Services and the like), and electricity production can't readily keep up.  I've written about this before, and this is part of the motivation for programs like FuSE (NSF's Future of Semiconductors call).  It seems that we are going to be faced with a choice: slow down the growth of computing demand (which seems unlikely, particularly with the rise of AI-related computing, to say nothing of cryptocurrencies); develop massive new electrical generating capacity (much as I like nuclear power, it's hard for me to believe that small modular reactors will really be installed at scale at data centers); develop approaches to computing that are far more energy efficient; or some combination of these.  

The standard computing architecture that's been employed since the 1940s is attributed to von Neumann.  Binary digits (1 and 0) are represented by two different voltage levels (say some \(V\) for a 1 and \(V \approx 0\) for a 0); memory and logic operations happen in two physically different places (e.g., your DRAM and your CPU), with information shuttled back and forth as needed.  The key ingredient in conventional computers is the field-effect transistor (FET), a voltage-activated switch in which a third (gate) electrode can switch the current flow between a source electrode and a drain electrode.  

The idea that we should try to lower the power consumption of computing hardware is far from new.  Indeed, NSF ran a science and technology center at Berkeley for a decade devoted to exploring more energy-efficient approaches.  The simplest approach, as Moore's Law cooked along in the 1970s, 80s, and 90s, was to steadily reduce the magnitude of the operating voltages on chips.  Very roughly speaking, power consumption goes as \(V^{2}\):  the losses in the wiring and transistors scale like \(I \cdot V\), and the losses in the capacitors that are part of the transistors scale like some fraction of the stored energy, which also goes like \(V^{2}\) (that is, \(\tfrac{1}{2}CV^{2}\) per charging cycle).  For FETs to still work, one wants to keep the same gated charge density when switching; since that charge density is (capacitance per area) \(\times V\), dropping \(V\) means increasing the capacitance per area, i.e., reducing the thickness of the gate dielectric layer.  This went on for a while with SiO\(_{2}\) as the insulator, and eventually in the 2000s the switch was made to higher dielectric constant ("high-\(k\)") materials because SiO\(_{2}\) could not be made any thinner without unacceptable tunneling leakage.  Since the 1970s, the operating voltage \(V\) has fallen from 5 V to around 1 V.  There are also clever schemes now to vary the voltage dynamically.  For example, one might be willing to live with higher error rates in the least significant bits of some calculations (like video or audio playback) if it means lower power consumption.  With conventional architectures, voltage scaling has been taken about as far as it can go.
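To put rough numbers on the \(V^{2}\) scaling and the gate-dielectric arithmetic, here is a minimal back-of-envelope sketch in Python; the per-gate capacitance, clock rate, activity factor, and gated charge density are all made-up illustrative values, not parameters of any real process.

```python
# Back-of-envelope numbers for the V^2 scaling argument above.
# Everything here is an illustrative assumption (per-gate capacitance,
# activity factor, clock rate, gated charge density), not data for any
# real process node.

EPS0 = 8.854e-12   # vacuum permittivity, F/m
K_SIO2 = 3.9       # relative dielectric constant of SiO2

def dynamic_power(c_load, v_dd, f_clock, activity=0.1):
    """Switching power per gate, P ~ activity * C * V^2 * f."""
    return activity * c_load * v_dd**2 * f_clock

def oxide_thickness(v_dd, q_per_area, k_rel=K_SIO2):
    """Dielectric thickness t = k * eps0 * V / (Q/A) needed to gate a
    fixed charge per unit area at supply voltage v_dd."""
    return k_rel * EPS0 * v_dd / q_per_area

Q_PER_AREA = 0.01  # assumed gated charge density, C/m^2 (~1 uC/cm^2)

for v in (5.0, 3.3, 1.8, 1.0):
    p = dynamic_power(c_load=1e-15, v_dd=v, f_clock=1e9)
    t = oxide_thickness(v, Q_PER_AREA)
    print(f"Vdd = {v:3.1f} V -> ~{p * 1e6:5.2f} uW per gate, "
          f"SiO2 thickness ~ {t * 1e9:5.2f} nm")
```

With these assumed numbers, going from 5 V to 1 V cuts the per-gate switching power by a factor of 25 while requiring the dielectric to get about five times thinner, which is the squeeze that eventually forced the move to high-\(k\) materials.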

Way back in 2006, I went to a conference and Eli Yablonovitch talked at me over dinner about how we needed to be thinking about operating at far lower voltages.  Basically, his argument was that if we are using voltages far greater than the thermal voltage noise in our wires and devices, we are wasting energy.  With conventional transistors, though, we're kind of stuck because of issues like subthreshold swing: below threshold the channel current is thermally activated, so at room temperature it changes by at most a factor of ten for every \(\sim 60\) mV \((= (k_{\mathrm{B}}T/e)\ln 10)\) of gate voltage, and that puts a floor on how far the operating voltage can usefully be pushed down.  
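For scale, here is a quick numerical sketch of that argument, using only physical constants and room temperature:

```python
import math

# The thermal-voltage argument, in numbers.  Only physical constants
# and room temperature go in; nothing is device-specific.
K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C
T = 300.0               # room temperature, K

v_thermal = K_B * T / Q_E                   # kT/e, the thermal voltage
ss_ideal = v_thermal * math.log(10) * 1e3   # ideal subthreshold swing, mV/decade

print(f"kT/e at 300 K:            ~{v_thermal * 1e3:.0f} mV")
print(f"Ideal subthreshold swing: ~{ss_ideal:.0f} mV per decade of current")
print(f"A 1 V supply is ~{1.0 / v_thermal:.0f}x the thermal voltage")
```

That factor of roughly forty between a 1 V supply and \(k_{\mathrm{B}}T/e\) is the headroom Yablonovitch was pointing at.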

So what are the options?  There are many ideas out there. 
  • Change materials.  There are materials that have metal-insulator transitions, for example, such that it might be possible to trigger dramatic changes in conduction (for switching purposes) with small stimuli, evading the device physics responsible for the subthreshold swing limit.  
  • Change architectures.  Having memory and logic physically separated isn't the only way to do digital computing.  The idea of "logic-in-memory" computing goes back to before I was born.  
  • Radically change architectures.  As I've written before, there is great interest in neuromorphic computing, trying to make devices with connectivity and function designed to mimic the way neurons work in biological brains.  This would likely mean analog rather than digital logic and memory, complex history-dependent responses, and trying to get vastly improved connectivity.  As was published last week in Science, 1 cubic millimeter of brain tissue contains 57,000 cells and 150,000,000 synapses.  Trying to duplicate that level of 3D integration at scale is going to be very hard.  The approach of just making something that starts with crazy but uncontrolled connectivity and training it somehow (e.g., this idea from 2002) may reappear.
  • Update: A user on twitter pointed out that the time may finally be right for superconducting electronics.  Here is a recent article in IEEE Spectrum about this, and here is a YouTube video that gives a pretty good intro.  The technology of interest is "rapid single-flux quantum" (RSFQ) logic, where information is stored in circulating current loops in devices based on Josephson junctions.  The compelling aspects include intrinsically ultralow power dissipation b/c of superconductivity, and intrinsically fast timescales (clock speeds of hundreds of GHz) set by the frequency scales associated with the Josephson effect (see the quick estimate below).  I'm a bit skeptical, because these ideas have been around for 30+ years and the integration challenges are still significant, but maybe now the economic motivation is finally sufficient.
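For a sense of why clock speeds of hundreds of GHz get quoted for Josephson-junction logic, here is a quick numerical estimate from the AC Josephson relation (the junction voltages are just illustrative values):

```python
# Quick estimate of the frequency scale behind RSFQ's speed claims,
# using the AC Josephson relation f = 2eV/h.  The junction voltages
# below are just illustrative values.
Q_E = 1.602176634e-19   # elementary charge, C
H = 6.62607015e-34      # Planck constant, J*s

for v_mV in (0.1, 0.5, 1.0):
    f = 2 * Q_E * (v_mV * 1e-3) / H
    print(f"V = {v_mV:3.1f} mV -> f = 2eV/h ~ {f / 1e9:5.0f} GHz")
```

Even a fraction of a millivolt across a junction corresponds to tens to hundreds of GHz, which is where the speed claims come from.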
A huge driving constraint on everything is economics.  We are not going to decide that computing is so important that we will sacrifice refrigeration, for example; basic societal needs will limit what fraction of total generating capacity we devote to computing, and that includes concerns about the climate impact of power generation.  Likewise, switching materials or architectures is going to be very expensive, at least initially, and is unlikely to be quick.  It will be interesting to see where we are in another decade.... 

10 comments:

Anonymous said...

The beauty of computation is that you can do it at locations with cheap and abundant power. E.g. solar or hydro. And then ship the data to where it's needed.
We already have the network for that...

So it doesn't need to be here.

Pizza Perusing Physicist said...

This post makes me realize that I don’t think you’ve ever written a post about the thermodynamics of information processing and computation (Szilard engine, the Landauer limit, Maxwell’s demon, etc…). A topic to consider for a future post? I think it’s more relevant today than ever, given these energy consumption issues in the age of big data.

Anonymous said...

Do you think high Tc superconductors will get mature enough to lend themselves to being used as interconnects on chips?  I’d imagine that an 80-100 K Tc is good enough to allow for the use of liquid nitrogen, which is feasible for use in data centers.

Peter Armitage said...

It seems that such power needs are going to be the great technical challenge of our time. I was listening to a podcast some months ago with someone like Sam Altman and he was forecasting the exponential growth that they were going to have with OpenAI and what he expected was going to happen. The interviewer did the simple extrapolation of the power needed and it was something like in 5 years AI would require x0% of our power capacity. The interviewer was like "... uhh... what's that gonna mean?". And Altman basically refused to engage. He just moved on to the next topic... Is it because he is really ignoring it? Or is it because they know that electricity is going to be so expensive that normal people won't be able to AC their houses? Or is there a secret plan?

Irrespective of however you feel about AI, all the forecasted big changes are basically impossible without a revolution in the power needs of computing.

Moreover, there seems no current honest way out of this except burning more fossil fuels, which is probably untenable at the scales needed. That Georgia utility's planning seemed totally unserious. "1) Buy more power from Florida. 2) Get more batteries." :)

Douglas Natelson said...

Anon@7:14, true, power is portable, but we do lose something like 20% of the generated power in transmission because of resistive losses in transmission lines. It makes a ton of sense to have power sources near the data centers if possible.

PPP, good idea, though I will need to refresh myself on the topic. I'd read a bit about this years ago, but beyond very basic concepts (irreversible computation costs kT ln 2 per bit), I didn't retain as much as I'd've liked.

Anon@12:30, I was about to update this post to reflect a point sent to me on twitter re superconducting electronics. Unfortunately, I am not optimistic at all about integration of high-Tc interconnects with Si in the near term. The materials are very challenging in terms of chemical compatibility and processing stability. Reliably making high-Tc wires a few hundred nm wide by lithography while preserving their good properties is hard even when starting from beautiful epitaxial films. Heterogeneous integration with Si will be difficult.

Peter, I agree. This is what I meant about economics. There is just no way that people will decide to sacrifice, e.g., refrigeration or lighting so that OpenAI can run GPT 7 at scale. I'm not convinced that fossil fuels are an answer either - even building that much generating capacity via gas-fired power plants would be hard to achieve. It seems more likely to me that we will come up with new architectures or the pace of computational power consumption will end up leveling off b/c of hard economic choices about power. (Now, if you and I could just get that whole Iron Man arc reactor technology working, we could have a different conversation....)

Anonymous said...

My point was that power transmission losses are too large, and that moving the data instead is cheaper. Compute where power is available and move the data to where it's needed.

Anonymous said...

Prof. Natelson, are you aware of any high Tc superconducting films that can work when amorphous and deposited by techniques like ALD and CVD?

Steve said...

Hi Doug,

These are all good topics for academic research, but please be aware that none of the options you propose will likely make it to high-volume chip production by 2040. The closest one in your list is the "architecture change", but "logic in memory" is still too disruptive given the way our industry works.

There is lower-hanging fruit out there, and the industry still has a roadmap to a "100x" power reduction, with only about 2x of that coming from the process; the rest comes from improvements in packaging and from more evolutionary architectural changes (bringing memory closer to logic, optical interconnects, etc.). Algorithmic efficiency will play a huge role as well, given the rapid progress in AI.

Please see the bar charts shown by Lisa Su at the ISSCC conference last year:

https://www.hpcwire.com/2023/02/21/a-zettascale-computer-today-would-need-21-nuclear-power-plants/

Douglas Natelson said...

Steve, thanks for that information. Thanks to you I realized that I left out a critical sentence from my post. I had wanted to emphasize that no major paradigm-shifting fix is going to be quick. Building huge power capacity is obviously not fast. People also need to understand why we are not in the post-Si world: it will take decades to prove out large-scale heterogeneous integration of new materials with silicon. In optoelectronics there has been real progress, and phase-change materials show up in memory schemes, but not at all in logic as far as I can tell. Thanks for the link!

Steve said...

Hi Doug,

I'm glad I could contribute to the thread. Given the time it will take to move on from our Si/CMOS paradigm, we will hit the predicted energy "crisis" well before that transition happens. Even if Lisa Su and her peers are right and we improve power efficiency by a factor of 100, society's constant hunger for compute means that we will hit the energy ceiling anyway, unless, of course, we are limited by fab capacity. Fascinating times...