By Ed Sperling
So far most of the energy savings in SoCs have been achieved using two main approaches—turning off most of the chip most of the time, and changing the materials used to insulate against current leakage.
Over the next few years, changes to designs will be more radical, will encompass more pieces of a bigger system, and will be orders of magnitude more effective. From a market standpoint, there is little choice. Computing increasingly is going mobile, and time between charges is a competitive edge. The caveat is that increased battery life has to come with a commensurate increase in functionality. Everything that could be done with a plug now will have to be done without one.
That means rethinking everything from the hardware design to the usage model to the software that runs on those platforms. And it means getting chips out the door at least as quickly, if not more quickly. Here are five trends and approaches that collectively, and sometimes individually, will have a big impact on energy efficiency, power consumption and leakage:
1. Rethinking the basics. Some of the biggest advances in efficiency will come from optimizing existing technology. There is more to turn off, more pieces to improve, and there are more ways of doing it better.
Consider something as basic as the clock. For nearly five decades the big focus has been maximizing frequency, even running multiple clocks concurrently to make that happen. But keeping those clocks always on and always running at the same frequency means they use far more energy than necessary.
“Design has always centered around the clock being the heartbeat of the system,” said Chi-Ping Su, senior vice president of R&D for Cadence’s Silicon Realization Group. “So people always assume the clock will be on. What we have found, working with ARM and the processor type of design, is that the clock consumes an extremely large percentage of the power. Timing and frequency are based on the clock. So you build a tree to be the ideal clock and you do everything based on that. When we started looking at it, we started asking why clocks need to be balanced at all.”
So how much energy can be saved? Su contends the amount is up to 30% of clock-tree power and up to 50% of dynamic power for the entire system.
He’s not alone in touting these kinds of numbers. Most SoC tools developers believe that dealing with energy/power/leakage at or before RTL can mean significant savings for the overall design.
“All the low-hanging fruit is still available to chip designers,” said Vic Kulkarni, senior vice president and general manager at Apache Design. “We find that even advanced designers are more concerned with meeting functionality and identifying power bugs. What they forget is the relationship between data, clock, reset and enable—the four signals in an SoC.”
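The savings Su and Kulkarni describe follow from the standard CMOS switching-power relation, P = α·C·V²·f, where α is how often a node actually toggles. A minimal Python sketch, with all capacitance and activity numbers invented purely for illustration, shows why gating the clock (lowering its activity factor) has such an outsized effect on total dynamic power:

```python
# Back-of-envelope sketch of dynamic power and clock gating.
# All numbers below are illustrative assumptions, not measurements.

def dynamic_power(alpha, capacitance_f, voltage_v, freq_hz):
    """Classic CMOS switching power: P = alpha * C * V^2 * f."""
    return alpha * capacitance_f * voltage_v ** 2 * freq_hz

V, F = 1.0, 1e9            # assume a 1 V supply and a 1 GHz clock
C_TOTAL = 1e-9             # assume 1 nF of total switched capacitance
clock_c = 0.4 * C_TOTAL    # assume the clock tree is 40% of that
logic_c = 0.6 * C_TOTAL    # the rest is logic, which toggles rarely

# Ungated: the clock toggles every cycle (alpha = 1.0).
baseline = dynamic_power(1.0, clock_c, V, F) + dynamic_power(0.1, logic_c, V, F)

# Gated: the clock is enabled only half the time (alpha = 0.5).
gated = dynamic_power(0.5, clock_c, V, F) + dynamic_power(0.1, logic_c, V, F)

saving = 1 - gated / baseline
print(f"baseline {baseline:.2f} W, gated {gated:.2f} W, saving {saving:.1%}")
```

The point of the sketch is only the shape of the math: because the clock tree toggles every cycle while most logic does not, even partial gating of the clock removes a large slice of total dynamic power.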
2. Reducing distance and resistance. Over the next two years the SoC industry will begin a radical shift that will continue for years to come. Rather than scaling transistors along a single plane, as Moore’s Law has always been charted, they will be stacked in three dimensions.
Driven partly by re-use, partly by time-to-market pressures and partly by physical limitations, 2.5D and 3D stacking will have an enormous effect on energy consumption and power. By stacking memory and other components on top of logic, the distance a signal must travel can be shortened significantly, along with the energy necessary to drive that signal.
“Moore’s Law is not a law,” said Wally Rhines, chairman and CEO of Mentor Graphics. “But the easiest way to reduce the cost of a transistor for the last 40 years has been shrinking feature sizes and growing wafer sizes. We are coming into an era where it will be more cost effective to stack die than to shrink feature sizes. We will hit it with memory before logic, but as with all new technologies we will adopt it before it is cost effective because of unique capabilities.”
Whether it’s done with an interposer, package-on-package, or flip-chip bumped die, Rhines said there is a 70% decrease in power dissipation if the memory can be put on top of a processor.
And that’s just for starters. By adding more processors that are sized for a particular function and tying that to just the right amount of memory, rather than a whole memory chip or block, far less power is needed. Companies such as Tensilica and ARM have been making this case for some time. With stacked die, their arguments are likely to receive far more attention.
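The distance argument can be made concrete with a toy calculation: the energy to drive a signal scales with wire capacitance, which grows with length. The per-millimeter capacitance and the wire lengths below are assumptions, not figures from the article, and real savings also depend on drivers, termination and I/O circuitry:

```python
# Illustrative sketch: energy per bit scales with interconnect length.
# The constants here are assumed values for the sake of the example.

C_PER_MM = 0.2e-12   # assume ~0.2 pF of capacitance per mm of wire
V = 1.0              # assume a 1 V signal swing

def bit_energy(length_mm, v=V):
    """Energy (J) to charge the wire once: E = C * V^2."""
    return C_PER_MM * length_mm * v ** 2

off_chip = bit_energy(20.0)  # long trace to a separately packaged DRAM
stacked  = bit_energy(0.5)   # short vertical path through stacked die

print(f"off-chip: {off_chip:.2e} J/bit, stacked: {stacked:.2e} J/bit")
print(f"reduction: {1 - stacked / off_chip:.0%}")
```

The exact percentage depends heavily on the package and interface chosen, but the direction is the same one Rhines describes: shorter paths mean less capacitance to charge, and therefore less energy per bit moved.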
3. New materials and structures. Calling a material “new” is something of a misnomer in SoC design. Most of the techniques we consider revolutionary have been around for decades, but they haven’t been developed to the point where they are cost effective from both a yield and a materials standpoint.
Through-silicon vias (TSVs), for example, have been talked about since the late 1950s, and the interposers in 2.5D packages are simply a collection of TSVs on a single die. But there are still issues to be worked out. Shang-Yi Chiang, senior vice president of R&D at TSMC, said questions remain about how to integrate a substrate with an interposer, and how to debug the result at different phases of development so it can be tested.
“There are a lot of parasitics to deal with in 2.5D,” Chiang said. “And with 3D we need time to make sure we can calibrate it.”
The other kind of 3D—structures such as FinFETs, tunnel FETs and nanowires—has been on the drawing board since the 1990s. All of these structures can lower leakage by wrapping the gate around the channel and controlling it from multiple sides. FinFETs are planned in volume for 14nm by both GlobalFoundries and TSMC, while Intel may begin using them as early as 22nm.
These structures hold the promise of radically reducing both leakage and dynamic power across all modes of operation—at least initially.
“The problem is these are a one-off thing,” said Mike Muller, chief technology officer at ARM. “FinFETs do reduce leakage, but once you’ve done that you’ve still got three impossible things to do before breakfast. Those kinds of steps are part of the solution.”
Muller said combining those with stacking techniques will go even further. “It opens the door to completely different die-to-die memory interfaces which allow you to build more efficient systems than when you go off the chip, down the serial interface to a separately packaged die. It changes the memory bandwidth, and this is just a computer at the end of the day so memory is one of the fundamentals for performance. Stacking allows you to change that.”
4. Lowering the voltage. One of the benefits of 3D structures such as FinFETs and stacked die is that they make it easier to lower the voltage in certain parts of the chip. The reason is that the minimum voltage needed to maintain functionality may be higher for DRAM than for logic or I/O. By separating those functions into different die, issues such as state retention and leakage can be confined and dealt with independently—the so-called divide-and-conquer approach.
So how low can the voltage go? Several years ago, researchers at IBM said the minimum voltage for an SoC would be at least 0.7 volts. It now appears it can be as low as 0.1 or 0.2 volts, and research is under way to lower it even further.
“You can get down to 0.3 or 0.2 volts without any problems,” Qi Wang, technical marketing group director at Cadence, said during a recent roundtable. “If you keep the aspect ratio of the depth and the height of a FinFET then you can guarantee the performance, but you do have other physical effects. Nothing is free. But the voltage can go much lower than what the textbooks say.”
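The payoff from lower voltage follows from the same P = α·C·V²·f relation: dynamic power scales with the square of the supply voltage. A short sketch, illustrative only and ignoring the frequency loss that usually accompanies a lower supply, shows why the numbers Wang cites matter so much:

```python
# Sketch: dynamic power relative to a nominal supply voltage.
# Assumes the same capacitance, activity and frequency at each point,
# which real silicon does not deliver; this shows only the V^2 term.

def relative_dynamic_power(v, v_nominal=1.0):
    """P/P_nominal when only the supply voltage changes: (V/Vnom)^2."""
    return (v / v_nominal) ** 2

for v in (0.7, 0.5, 0.3, 0.2):
    print(f"{v:.1f} V -> {relative_dynamic_power(v):.0%} of nominal power")
```

Even before any other optimization, running at 0.3 volts instead of a nominal 1 volt would leave less than a tenth of the dynamic power, which is why so much research effort is aimed at near-threshold operation.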
5. Fixing software. Software is the last piece of the puzzle to fix, and it’s been one of the hardest for a number of reasons.
First of all, software takes longer to create and perfect than hardware. This is evident in all the bug fixes and updates. All three of the top EDA players are involved in this effort. Synopsys is working on software prototyping to allow software to be written even before the hardware is ready. Mentor has been involved in simplifying the creation of RTOSes and embedded software. And Cadence has shifted its design approach so that software and hardware can be developed far more concurrently.
But getting software out on time is only a first step. The next step is to make software function more efficiently, an approach that dates back to the RISC vs. CISC wars of the 1990s. Reduced instruction set computing was more efficient than complex instruction set computing, which boosted performance. Taking that approach one step further can also reduce the amount of energy consumed by a particular task, and help manage the overall power in a system much more efficiently.
Work on symmetric multiprocessing continues, as well. How far that will go is anyone’s guess, but we now seem to be facing a limit on the number of cores most applications can effectively use. Talk about unlimited numbers of cores has given way to limited numbers of cores and unlimited numbers of processors spread throughout a system—most of which are off most of the time.
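The ceiling on useful cores per application is usually explained with Amdahl’s law: if a fraction s of a workload is inherently serial, no number of cores can speed it up beyond 1/s. A small sketch, with an assumed 5% serial fraction chosen purely for illustration:

```python
# Amdahl's law sketch: why per-application core counts hit a ceiling.
# The 5% serial fraction is an assumed value, not a measured one.

def amdahl_speedup(n_cores, serial_fraction):
    """Speedup on n cores when serial_fraction of the work can't parallelize."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)

s = 0.05  # assume 5% of the workload is inherently serial
for n in (2, 4, 16, 64, 1024):
    print(f"{n:>4} cores -> {amdahl_speedup(n, s):.1f}x speedup")
```

With that assumption the speedup saturates near 20x no matter how many cores are added, which is consistent with the shift the article describes: many specialized processors scattered through a system, most of them powered off, rather than ever-larger symmetric core counts.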
Taken together, all five of these trends will have a huge effect on efficiency, power and leakage. And now that battery life is a competitive issue, energy efficiency is likely to be promoted by vendors as a value add instead of being treated as an unnecessary engineering cost—or worse, a nuisance.