Posts Tagged ‘FinFET’

The Power Game

Thursday, April 11th, 2013

By Ann Steffora Mutschler

Semiconductor engineering teams always have focused on stepping up performance in new designs, but in the mobile, GPU and tablet markets they’re finding that maintaining the balance between higher performance and the same or lower power is increasingly onerous. The reason: Extreme gaming applications can create scenario files that cause dynamic power consumption to spike out of bounds.

“Most people today talk about performance per watt,” said William Ruby, senior director of RTL power product engineering at Apache Design. “This is really the metric. Keeping performance the same is probably not an option because even mobile applications, if you look at today’s phones, they are quad-core going to 8-core and even to 64-core, so performance is not going to be the same. Power is not going to be the same either. Power most likely will go up while performance/watt better be rising overall.”

So how do engineering teams deal with that?

“Technically, there is really only one answer to this, which is architecture,” Ruby said. “You’ve got to start thinking about what functions are enabled and when—basically doing the things that are absolutely necessary to do. In some cases that means making performance tradeoffs.”

There appears to be a growing consensus for that approach. “Everybody is trying to reduce the power—the software engineer, the hardware engineer—but the power reduction can be done at a much higher level,” said Solaiman Rahim, senior technology director for power products at Atrenta.

To help engineering teams make decisions about how to reduce power at the architectural level, RPC (remote procedure call) designs can be utilized, Rahim said. RPC is a protocol that helps to execute software across different machines in a network. “By looking at the activity data for the chip and doing some power exploration of the chip, we could identify the reason for the peak power in the design. The RPC usually has a lot of processing elements so there is a lot of processing going on, which usually causes a lot of power consumption. There are a few signals in the design that control those processing elements, so by providing this information by using our exploration capability and analyzing the activity data, we could provide this information to the architect and the RTL designer. Then they can use this information to do some hardware changes, such as gating the processing elements when they are not used, and also provide this information for the software engineer.”

Through this data, the software engineer knows exactly what in the design causes those peaks of power, and by using a cache mechanism in the software, the signals could be exercised less, resulting in power reduction, Rahim said.
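
The exploration step Rahim describes can be pictured with a minimal sketch: take per-cycle activity data, locate the cycle with peak power, and rank the blocks responsible so the architect knows which processing elements are candidates for gating. The block names, toggle counts and energy-per-toggle value below are hypothetical and are not taken from any Atrenta tool or real design.

```python
# Hypothetical per-cycle toggle counts for three processing elements (PEs).
# The names and numbers are illustrative only, not data from a real design.
activity = {
    "pe0": [2, 3, 40, 42, 3, 2],
    "pe1": [1, 1, 35, 38, 1, 1],
    "pe2": [5, 4, 5, 4, 5, 4],
}
ENERGY_PER_TOGGLE = 0.5  # assumed, arbitrary energy units per toggle


def power_trace(activity, energy_per_toggle):
    """Sum toggle-based power over all blocks for every cycle."""
    cycles = len(next(iter(activity.values())))
    return [
        energy_per_toggle * sum(trace[c] for trace in activity.values())
        for c in range(cycles)
    ]


def peak_contributors(activity, cycle):
    """Rank blocks by their share of the activity in the given cycle."""
    total = sum(trace[cycle] for trace in activity.values())
    shares = {name: trace[cycle] / total for name, trace in activity.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


trace = power_trace(activity, ENERGY_PER_TOGGLE)
peak_cycle = max(range(len(trace)), key=trace.__getitem__)
ranking = peak_contributors(activity, peak_cycle)

print("peak cycle:", peak_cycle, "power:", trace[peak_cycle])
for name, share in ranking:
    print(f"{name}: {share:.0%} of peak activity")
```

In this toy data set two of the three elements dominate the peak, which is the kind of result that would point a designer toward gating them when idle.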

However, this methodology is not widespread today because to do power exploration, simulation vectors are needed. “When you are at the architectural level or software level you might not get to the stage where you have those vectors,” he noted. “But also it depends on the quality of those vectors and having the power exploration tools that analyze the RTL and provide some guidance in the architecture—something that is not widespread.”

Intense challenges in extreme gaming GPUs

In the extreme gaming space, Vic Kulkarni, general manager and senior vice president of the RTL Business Unit at Apache Design, observed that Japanese and U.S. semiconductor companies working on extreme gaming have found over the last few months that elements such as shader cores are now a very important part of a GPU.

“How do you manage the shadows and also sunlight, smoke and fire?” asked Kulkarni. “Those are the most difficult rendering in GPUs. In extreme gaming, a lot of things are changing real time and the user is plugging away, moving all kinds of joystick controls. How do you move the shadows and the sunlight and then have long shadows, short shadows, almost in real time? Tremendous GPU processing is happening during the extreme gaming when scenes and frames are changing so rapidly. What this means is the FSDB size and the scenario files are getting huge, and that’s where the technical challenge comes in. For that split microsecond or even nanosecond of simulation time, FSDBs have to be processed, and without jitter. Smooth rendering is the technical challenge the GPU companies are facing.”

One way to handle this would be the critical signal selection of large FSDBs followed by power profiling.

“Not only is it high-performance processing, but it definitely sucks the power,” said Pete Hardee, low-power design solution marketing director at Cadence. “Obviously the gaming experience that gamers want now is every bit as good from the graphical rendering point of view on a tablet as it was just a very short time ago on a high-end, gaming-enabled desktop PC. What’s special about the way GPUs work in gaming applications, and in fact in any HD video, is that the power consumption is dependent on the number of frames being processed as well as the content of those frames. The rendering and shadows and smoke and fire—all of those things that bring the realism to the picture—is pretty extreme because it’s so fast moving and there’s stuff happening all over the screen.”

All of this adds up to each frame being extremely busy in terms of what changed from the last frame, which is why the GPUs are structured to have multiple rendering engines. Those engines switch in and out, depending on the processing need, while the overall frame rate remains constant. But the amount of processing going on changes depending on those content issues, Hardee explained. As such, it’s extremely difficult to estimate power with any degree of accuracy.

The only way to adequately represent the design in a scenario is through emulation, which is why GPU companies such as Nvidia have set up massive in-house emulation labs.

“You need accurate characterization, but you also need to process a lot of realistic activity, such as actual pictures and actual video frames, and emulation is the only way to do that,” Hardee noted.

Software progress

Software has come a long way in making all of this work efficiently, too.

“Everything is always getting more complex, and so we always adapt to change. As people want more and more functionality that just means the designs get bigger and we know that the tools do more, run faster and have better capacity than they did even 10 years ago,” pointed out Mary Ann White, director of Galaxy Implementation Platform Marketing at Synopsys.

The chart below shows people have to adapt to save power.

This chart illustrates design size and performance targets based on data from mobile applications respondents to Synopsys’ Global User Survey. The data shows that while designs are getting bigger and faster, the power budgets remain essentially the same. The low-power techniques are used for those nodes because they help compensate for leakage at the smaller nodes. (Source: Synopsys)

“From 90 nm on it’s super leaky so they have to adapt to using new low-power techniques that they didn’t before,” said White. “There’s no more ‘one button to push’ that says can you give me the best performance and best power. It doesn’t work that way anymore. Multi-corner multi-mode (MCMM) might not have been something that was commonly in use 5 to 10 years ago, but now because everything has different modes—you’ve got DVFS where you are dynamically changing voltage and frequency to be able to save on power—and so you want performance when you can have it. The only way to get to that is really to use MCMM. Now we are having people use hundreds and hundreds of scenarios with MCMM, so again the tools had to adapt to being able to do this. When MCMM first came around it was definitely a serial process. Now there’s no way.”
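
To see why DVFS saves power, a rough sketch using the standard first-order switching-power approximation (activity factor times capacitance times voltage squared times frequency) may help. That formula is textbook background rather than something stated in the article, and the operating points, activity factor and capacitance below are assumed values chosen only for illustration.

```python
def dynamic_power(alpha, c_eff, vdd, freq):
    """First-order switching power in watts: alpha * C * V^2 * f."""
    return alpha * c_eff * vdd ** 2 * freq


# Hypothetical DVFS operating points: (supply voltage in V, clock in Hz).
operating_points = {
    "turbo": (1.0, 2.0e9),
    "nominal": (0.9, 1.5e9),
    "low": (0.7, 0.8e9),
}
ALPHA = 0.2      # assumed activity factor
C_EFF = 1.0e-9   # assumed effective switched capacitance in farads

powers = {
    name: dynamic_power(ALPHA, C_EFF, vdd, freq)
    for name, (vdd, freq) in operating_points.items()
}

for name, watts in powers.items():
    print(f"{name}: {watts:.4f} W")
```

With these assumed numbers the clock drops by 2.5x between the fastest and slowest points while the estimated switching power drops by roughly 5x, because the voltage term is squared. Each such operating point is also one more mode the implementation tools must cover.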

Further, Sudhakar Jilla, director of marketing for place and route products at Mentor Graphics, explained that with variations in the process and variations in the design modes, each of these combinations is called a mode-corner scenario.

“Typically what happens is that for each of these scenarios you have the different design metrics sensitive to different things. For example, on Scenario One you could have ‘setup;’ on Scenario Two you could have ‘hold;’ and on Scenario Three you could have ‘leakage,’ and so on. Most of the time these are conflicting, so you optimize for one corner and the other corner will break. There are different ways to solve this problem,” Jilla said.

EDA tools look at all these mode-corner scenarios and across all design metrics at the same time. Typically, the traditional tools solve one, then try to fix another one. That’s normally what breaks the flow and what leads to iterations where you get a power number but the performance is off, or you get the performance but the power number is off, Jilla noted.
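
The difference between optimizing one scenario at a time and considering all of them together can be sketched in a few lines. The modes, corners, candidate changes and margin numbers below are hypothetical; this is a toy illustration of the idea, not how any particular place-and-route tool works.

```python
from itertools import product

modes = ["functional", "sleep"]   # hypothetical design modes
corners = ["slow", "fast"]        # hypothetical process corners
scenarios = list(product(modes, corners))  # every mode-corner combination

# Hypothetical worst margin per scenario (same order as `scenarios`)
# after applying each candidate change. Negative means a violation.
margins = {
    "upsize_cells": [0.30, -0.10, 0.20, -0.05],
    "add_buffers": [0.05, 0.04, 0.06, 0.02],
    "swap_to_hvt": [-0.20, 0.10, -0.15, 0.12],
}


def best_for_one(margins, index):
    """Serial style: pick the change that looks best in a single scenario."""
    return max(margins, key=lambda name: margins[name][index])


def best_across_all(margins):
    """Concurrent style: pick the change with the best worst-case margin."""
    return max(margins, key=lambda name: min(margins[name]))


print("scenarios:", scenarios)
print("best for scenario 0 only:", best_for_one(margins, 0))
print("best across all scenarios:", best_across_all(margins))
```

In this made-up data the change that looks best in the first scenario breaks two others, while the concurrent choice is less impressive in any single scenario but violates none of them.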

Looking ahead

Going forward, to continue to manage the performance/watt, the entire ecosystem has to become involved so that everybody can understand what’s going on, Synopsys’ White asserted.

“As processes change there are so many different things to consider. At first there were only three Vts, and then there became five Vts at 28nm. With finFETs it’s looking like it probably will go back down to three again, but the channel length variance options will probably go up. So it’s all these different things that you’re looking at,” she said. “Even if you have a plug in the wall, the green initiatives are forcing people to have to watch for power. I was not very happy one day when my DirecTV unit decided to turn itself off after an hour to save power. There are different things driving different factors. If you think about all the clouds, Amazon probably has a billion machines sitting somewhere. Its cloud may be virtual to users, but there are actual machines sitting somewhere and all of them have to be reliably on.”

The big concern in data centers is having enough cooling power to chill the servers, and having servers that are efficient enough that they require less cooling.

“The actual power usage is different, but saving power isn’t because there are so many green initiatives now that whether you’re playing graphics on your Playstation 3 at home or on a mobile device, they’re probably thinking about how to make sure that it’s as effective with both,” she noted. “The bottom line is that there’s definitely power savings that can be done on all fronts. The techniques used might be different but the fact that they have to do it means that power is everything.”

Hot Stuff

Thursday, March 14th, 2013

By Ann Steffora Mutschler

When it comes to thermal modeling, which has been practiced for many years, the challenges are daunting. The good news is that new approaches are emerging as the challenges increase with smaller process nodes and greater design complexity.

Viewed from a number of viewpoints—transistor, chip, package, board and system—thermal models traditionally have been created from more of a system-level perspective to look at airflow through the chassis of a computer, for instance, or airflow and cooling of a rackmounted blade server. The whole goal was to keep the junction temperature at a certain level so the chips didn’t get overheated to the point of failure.

From that system-level perspective, there are a series of single-value metrics, which were really the industry’s first stab at trying to characterize packages for thermal simulation. “Those are numbers like theta-JA, theta-JMA, theta-JC, and theta-JB,” said Byron Blackmore, product manager at Mentor Graphics. “These single-value metrics are useful in terms of providing a means to compare package A to package B, but in general they are not usable metrics from a thermal simulation perspective. They are extremely dependent on the environment in which the measurement was taken.”

One step up from the single-value metric is the two-resistor compact model, which is meant to reduce the complexity. The two-resistor model takes two of those single-value metrics, theta-JC and theta-JB, and combines them into a network of thermal resistances where the thermal engineer can specify what the power will be at the junction node and let the resistances to the case and to the board be included in the thermal analysis. That’s a big improvement over using one value, but it still has some shortcomings.

If two thermal resistances are being used, it constrains the heat from the junction to move in one of two directions. In a real 3D package, the heat certainly can move laterally within the package and ultimately into the environment, so it misses some of the physics. Still, a well-formed two-resistor compact model will be able to predict the junction temperature to within 20% of what you would get for a detailed representation for that package, said Blackmore. This approach was formalized into JEDEC standard JESD15-3.
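
As a rough illustration of the two-resistor network, the junction temperature follows from a single heat balance at the junction node: the dissipated power must equal the heat leaving through theta-JC plus the heat leaving through theta-JB. The sketch below assumes the case and board nodes sit at fixed, known temperatures, which is a simplification (in a real thermal simulation they connect to the rest of the system model), and all numeric values are made up.

```python
def junction_temperature(power_w, theta_jc, theta_jb, t_case_c, t_board_c):
    """Solve the junction node of a two-resistor compact model.

    Heat balance: P = (Tj - Tcase) / theta_JC + (Tj - Tboard) / theta_JB
    Resistances are in degrees C per watt, temperatures in degrees C.
    """
    g_jc = 1.0 / theta_jc  # thermal conductance, junction to case
    g_jb = 1.0 / theta_jb  # thermal conductance, junction to board
    return (power_w + g_jc * t_case_c + g_jb * t_board_c) / (g_jc + g_jb)


# Hypothetical package and boundary conditions.
tj = junction_temperature(
    power_w=2.0, theta_jc=5.0, theta_jb=20.0, t_case_c=60.0, t_board_c=50.0
)
heat_to_case = (tj - 60.0) / 5.0
heat_to_board = (tj - 50.0) / 20.0

print(f"junction temperature: {tj:.1f} C")
print(f"heat to case: {heat_to_case:.2f} W, heat to board: {heat_to_board:.2f} W")
```

The two heat flows add up to the dissipated power, and there are only two of them, which is the limitation noted above: the model has no path for heat spreading laterally inside the package.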

As with many types of simulation, giving the tool too much data to crunch prolongs or, in some cases, prevents the simulation tool from completing its task. Determining how many objects to give the tool is key.

“It really comes down to the experience and judgment of the engineer who is running the simulation. There’s a very strong correlation between how many objects and how many grid cells you have in a simulation and how long you need to wait before you can inspect the results. In general, the fewer objects you have, the faster the simulation turnaround time will be. At the conceptual design stage where you need to evaluate tens or maybe hundreds of different design alternatives, the turnaround time is paramount. It needs to be fast to make a decision on each one of these potential designs. As it evolves, you can afford to spend more time on your simulation because you have fewer of them to run,” he said.

To deal with this, Mentor developed means over the years to appropriately simplify various pieces of geometry, such as enclosures, venting and PCBs, all common objects, and established accessible, working ways to model these within its simulation tools—even to the point of automating that for the user.

Zeroing in
At the transistor level, compact thermal models have been around for some time, just as at the system level. “They were not used for MOSFETs,” said Hany Elhak, senior product marketing manager at Synopsys. “They were used for some applications where thermal effects are important such as high power transistors and also in SOI (silicon on insulator). Because of the insulator in this process, thermal conductivity is bad so the transistor cannot really get rid of the heat, and that had to be taken into account. The standard compact transistor models like BSIM4, which is used to model MOSFETs, did not include thermal effects. Today of course it’s a different story because we have 16nm coming with the three-dimensional transistors.”

He pointed out that the finFET structure doesn’t allow for the heat to dissipate easily, so starting from 16nm thermal effects are becoming very important. A new compact model has been created for finFETs called BSIM-CMG (common multi-gate), which takes into account thermal effects and self-heating, which is how the power of the transistor can affect its temperature.

New headaches
However, the problems are increasing. “What used to be a problem for the processor alone is now becoming a problem for mobile devices,” explained Aveek Sarkar, vice president of product engineering and support at Apache Design. “The problem is that mobile devices are becoming our computers. They are running at 3GHz and they have multiple cores in them. There are quad-core chips for smart phones or tablets. There are higher performance devices that are running at a higher speed and they have multiple cores. On top of that they have graphics, which is typically more compute-intensive. Then these are getting fabricated in 28nm and 20nm, which are much more thermally sensitive process technology nodes. Temperature affects the resistance of the wire and, more importantly, it ends up affecting the electromigration of the wire, as well.”

The challenge is how to manage some of these different effects.

“Once we start to look at these effects, then the thermal modeling becomes a little different. Do we look at the system level or do we look at the chip level? When we talk about the system level then obviously we talk about some of these models that let you take certain compact models of the system and plug those into the next higher level. But these don’t help you comprehend some of the challenges that you have with the IC, the thermal analysis or the impact on the IC, because they are focusing on the chassis or the rackmounted server and they are not really focusing on the chip itself,” he said.

There are relatively few players in this niche market who serve a very small, highly expert set of people. Those customers use expert-based tools but want the ability to co-simulate with functional and power verification and power modeling. The model size and complexity prohibit them from doing this, or the license cost for a seat is too high, observed Gene Matter, senior applications manager at Docea Power.

“For most design houses, they can’t afford the seat cost,” Matter said. “Not only are the seats expensive but the token consumption and token lockout of using that tool is really prohibitive. Compact thermal models for us basically maintain the fidelity and accuracy of the design, but produce the ability to solve the thermal interaction or thermal behavior as a function of power in a shorter duration of time. The compact thermal model is also a model that can be derived very quickly from an existing source. It can import data from existing, higher-fidelity, more complex models, and it has an export function so it can be exported to other functional and power simulators such that that interaction, that feedback can occur,” he added.

How the market develops will be interesting to watch as 3D-IC by itself has caused lots of interest in thermal, noted Mentor Graphics’ Blackmore. “While simulation tools have the capability to do thermal analysis, where I see the largest barrier at this point is simply how to build these models in the simulation tools…because an explicit representation is just not going to be practical. There is a lot of interest in the industry now about how we can effectively reduce the amount of geometry that we include in the analysis, and there’s a lot of interest about how to link to EDA to automate and facilitate that model development process.”

He said it is unlikely that a standard will evolve to handle thermal modeling in a compact way at this level of analysis.

Along those lines, Apache’s Sarkar noted that system-level thermal analysis tools consider the whole chip to be the same temperature, which is where he believes evolution will happen as engineering teams start looking at things in more detail.

“As the industry moves more towards 20nm and the process drives more thermal sensitivity in designs, as the chip sizes—even for mobile processors or mobile tablets—become bigger and bigger, the temperature gradient from one part of the chip to another part of the chip will be more important to model. These are some of the things that will drive standards,” he concluded.

Good Times For Analog Designers

Thursday, February 14th, 2013

By Ann Steffora Mutschler

For a number of technological reasons, analog/mixed-signal design and low-power design are converging, and with that comes both challenges and opportunities.

As far as challenges go, process variations at 14nm, 20nm and even 28nm have increased significantly to include DFM impacts such as layout-delay effects. On the digital side, those process changes affect performance, and on the analog side it means a change of behavior, noted Qi Wang, technical marketing group director at Cadence Design Systems.

To account for this, designers are using more and more digital circuits or logical controls to compensate for the process variations, a technique referred to as digital assisted analog. The idea is very simple, he said: digital control logic is used to compensate for the process variations to make the analog circuit more stable and more scalable.
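
One common form of digital assistance is a calibration loop in which control logic searches for a trim code that pulls an analog quantity back to its target despite process variation. The sketch below uses a successive-approximation search and assumes the analog output rises monotonically with the trim code. The block model, offsets, step size and target are all hypothetical and do not describe any specific circuit.

```python
def calibrate(measure, target, bits=6):
    """Successive-approximation search for a trim code.

    Returns the largest code whose measured output does not exceed the
    target, assuming `measure(code)` increases monotonically with code.
    """
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)      # tentatively set this bit
        if measure(trial) <= target:   # keep it only if we stay at or below target
            code = trial
    return code


def make_block(process_offset):
    """Hypothetical analog block: output shifts with process, 0.5 units per trim step."""
    return lambda code: 90.0 + process_offset + 0.5 * code


TARGET = 100.0
slow_part = make_block(-5.0)   # assumed slow-corner sample
fast_part = make_block(+3.0)   # assumed fast-corner sample

slow_code = calibrate(slow_part, TARGET)
fast_code = calibrate(fast_part, TARGET)

print("slow part: code", slow_code, "output", slow_part(slow_code))
print("fast part: code", fast_code, "output", fast_part(fast_code))
```

Two parts that start 8 units apart end up at the same output with different trim codes, which is the sense in which the digital logic makes the analog circuit more stable across process variation.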

In addition, design teams are being challenged with issues pertaining to the area of analog circuitry as a percentage of the available die space. “We have a customer that is moving from 40nm to 28nm and then to 20, and they saw that the percentage of their analog component or mixed signal component as a percentage of the silicon real estate increased dramatically because it doesn’t shrink at the same rate as digital circuitry. Moving more and more of the analog functionality into digital, like peripheral functionality (the core cannot change; the sensing, the RF—there’s no way) and, for the rest, a lot of the A-to-D and D-to-A logic, actually also reduces power. It reduces power and reduces the cost of the area of the analog/mixed-signal component. In return, you reduce the cost of the overall IC production,” Wang explained.

All of these issues are forcing analog designers to know more about digital and do mixed-signal design, he said, pointing to companies such as Dialog, ADI and Maxim that have made adjustments in their business strategy in recognition of this.

Because of these challenges, over time analog may be a smaller percentage of the overall design in terms of functionality, but it doesn’t mean its value will fall.

“If you just look at real estate wise on the silicon, definitely the trend is that analog will be less and less as a percentage,” he said. “However in terms of both technology and business level, analog will become a key differentiator for products because analog is magic. If you think about ‘the Internet of things’ it is about sensors, it’s about microcontrollers, it’s about RF components. Of these three key building blocks for the Internet of things, two of them are analog. You cannot replace the sensor with digital because that’s the physical world and the physical world is not digital. RF is communications and you could argue that it could be digital but it is still analog in custom designs—you cannot use standard-cell libraries to build it.”

Navraj Nandra, senior director of marketing for DesignWare Analog and MSIP at Synopsys, agreed: “The manufacturing requirements on these smaller technology nodes means that the design rules become quite restrictive and that means that you end up having to make layouts that are maybe the same size as the previous node or in some cases bigger—so that’s a purely engineering point of view. When you go to a customer and discuss this it’s not acceptable because they’ve heard all these presentations at conferences that Moore’s Law is still alive and well, that we can support Moore’s Law down to 7 nm.”

One of the main reasons for moving SoCs to the next process node is to cram more functionality onto a single chip.

“If you are basically saying, ‘If you want to have USB or some kind of analog/mixed-signal functionality it may not meet the area target,’ that becomes unacceptable to the SoC developer,” Nandra said. “Understanding that, we try to find a way to get the area down and so far we’ve been successful but it’s not a simple activity like it is in the digital world where you basically call up a set of libraries and pick your EDA tool and it will build you a circuit that is smaller. You have to go back and redo the architectures.”

Nandra noted that digital techniques are becoming more common in smaller technology nodes for analog blocks. “We’re seeing the amount of digital increasing in analog circuits so the design goal for analog designers, especially if they are trying to meet all the SoC area requirements in the next technology node, is to see how much of the problem they can solve in the digital world.”

There are three benefits to this. First, if it’s in the digital world it will scale. Second, digital circuits aren’t as impacted by process, voltage and temperature changes. Third, digital circuits aren’t impacted by noise as much as analog circuits. So if that problem can be solved in the digital world it can eliminate a lot of other problems, as well.

“Just looking at new architectural techniques, I think 14nm is going to be interesting because the transistor characteristics are different,” Nandra said. “There’s going to be quite a bit of invention in the next few years where analog designers are going to look at the properties of the finFET device because they are different than the planar device and say, ‘Oh, this is interesting. Maybe I can solve this particular design problem by using this finFET transistor in this particular configuration.’”

Analog scaling using digital techniques (Source: Synopsys)

Simulation for all
Smaller process nodes bring changes on many levels. Historically, most designs migrated to lower process nodes because of cost and performance, but recently it has become clear that the move to lower process nodes is due to power, cost and performance concerns.

“Power has been one of the largest driving factors for people to adopt 20nm. To look at today’s 20nm it is offering more than 25% reduction in terms of power from the previous node—that’s one of the biggest reasons why we are seeing customers going to the 20nm node,” said Arvind Shanmugavel, director of applications engineering at Apache Design.
“That being said, we also see higher levels of integration. If you look at today’s processors they are not standalone digital devices. They have multiple shared analog components on the same piece of silicon. If you look at an application processor for example, it’s got a GPS module, it’s got a radio, it’s got high-speed I/O, data converters—everything built into the same piece of silicon shared along with your digital microprocessor. We are seeing a different set of challenges both in terms of power management, in terms of reliability of these ICs, in terms of how to simulate them with the proper context of digital plus analog plus the entire system operating in unison.”

Specifically with regard to analog, some designers have encountered challenges, especially at 20nm. “Our customers are coming to us and saying that they are not able to have much leverage in terms of designing devices that have good length and width control—they can only design devices that have a multiple of a certain width and multiple of a certain length. That’s essentially because of the 20nm requirements,” Shanmugavel said.

Then on the reliability side, for power management ICs in the 20nm node the electromigration rules are 30% more stringent than at the 28nm node. “Designing a processor from the 28nm node going to 20nm was more than a speed bump; you are basically looking at a 30% reduction in the electromigration margins so that is one big aspect that our designers are facing—and it’s a big challenge,” he continued. That doesn’t even include other aspects such as ESD and EMI.

The answer, Shanmugavel believes, is a simulation-driven approach to design rather than the prevailing correct-by-construction approach. “Historically, we have seen most of the design tools do a correct-by-construction approach, but with the margins in electromigration and ESD, or even margins such as EMI, reducing so dramatically, the designers of these ICs have to simulate these ICs with the proper conditions and then do their design. They cannot assume that they have enough margin and sign off on a particular criterion.”

Even with the challenges, the future looks bright. “It’s a great time to be an analog designer,” said Synopsys’ Nandra. “You’ve got the possibility now to really invent some new circuits because analog design, by its nature, involves very close interaction with the transistor properties as opposed to digital design. The whole point of digital design is to be far away from the transistor property. But with analog design you’re looking at the thousands of different states one transistor can actually be in. There are going to be some very interesting new circuits coming up and it’s going to be a challenge to find lots of creative analog designers because for many years the whole industry has pushed engineering towards big digital design.”

Experts At The Table: Obstacles In Low-Power Design

Thursday, December 6th, 2012

By Ed Sperling

Low-Power/High-Performance Engineering sat down to discuss low-power design with Leah Clark, associate technical director at Broadcom; Richard Trihy, director of design enablement at GlobalFoundries; Venki Venkatesh, engineering director at Atrenta; and Qi Wang, technical marketing group director at Cadence. What follows are excerpts of that conversation.

LPHP: If you are going to 2.5D and 3D, what’s the real benefit?
Clark: One of our time-to-market struggles is getting a chip in the hands of our customers so they can play with it and tell us what features they want and what features they don’t want. So we spin it and give them the final product. Unless we get something in their hands we don’t get that kind of feedback. We would be able to do a first tapeout that would work. We wouldn’t have to fight the new technology pain. It would save us a whole mask set—in theory.
Venkatesh: I agree.

LPHP: We need software to run on all of this stuff. How much of a problem is that, because that has power implications, as well?
Venkatesh: The place to solve major power problems is the hardware-software stage. We don’t know what the power and performance will be. If you go down to the architecture you can do the hardware-software co-design. There is so much power you can save by doing that. All the leading mobile companies are saving a lot of power at the software side.
Wang: Many people don’t have the resources to put into high-level design. But if you just look at the software itself you can save a lot of power. One example is that in both iOS and Android there are power-management APIs. But the silicon designer, as well as the software designer, may not be aware of those APIs, so they can’t take advantage of them. But how do you catch the problem? You have to run power simulation for the software. There’s a modeling issue. It also depends on what platform you use to run it. Simulation is too slow. Virtual prototyping or emulation are now becoming popular.
Clark: But for switching activity you still have to run that on your model because otherwise it won’t be accurate.
Wang: But traditionally people ran functional vectors to get at switching activity. That doesn’t have any way into the application you’re profiling.
Clark: You don’t know what the software will look like until you give the hardware to your customers because they’re going to think of stuff you never thought of. That’s an iterative process. We could optimize for that.
Venkatesh: You can get power intent for each lower-level instruction, but when you add a couple of instructions together they have different power. So you have to have a real model on an instruction-by-instruction, mode-by-mode level to the point where someone writes a piece of application software you can predict how much power it will consume. That’s where a lot of work needs to be done.
Clark: You can predict for operation, but how do you know how many instructions they’re going to use and in what combination? You could optimize for instructions 1 through 10, and it turns out people are using instructions 15 through 20.
Venkatesh: You have a power number for your instructions. Then you write algorithms for different combinations. Early in the game you can make those kinds of decisions.
Wang: The low-hanging fruit is having compiler technologies that are aware of those power management APIs at the operating system level. That can be done as a first step without changing your architecture or modifying the operating system. Just being aware of the power when you do the programming.
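
The instruction-level model Venkatesh describes, a power number per instruction plus an adjustment when instructions are combined, can be sketched roughly as follows. The instruction names, energy values and pair overheads are hypothetical placeholders rather than measurements from any processor.

```python
# Hypothetical base energy per instruction, in nanojoules.
BASE_ENERGY = {"load": 3.0, "add": 1.0, "mul": 2.5, "store": 3.2}

# Hypothetical extra energy when two instructions execute back to back,
# reflecting that combinations can cost more than the sum of their parts.
PAIR_OVERHEAD = {("load", "mul"): 0.4, ("mul", "store"): 0.3}


def estimate_energy(trace):
    """Estimate energy for an instruction trace: base costs plus pair overheads."""
    total = sum(BASE_ENERGY[op] for op in trace)
    total += sum(PAIR_OVERHEAD.get(pair, 0.0) for pair in zip(trace, trace[1:]))
    return total


trace_a = ["load", "mul", "add", "store"]
trace_b = ["load", "mul", "store"]

print(f"trace_a: {estimate_energy(trace_a):.1f} nJ")
print(f"trace_b: {estimate_energy(trace_b):.1f} nJ")
```

Clark’s objection still applies to a model like this: the estimate is only as good as the instruction traces fed into it, and those depend on software that customers have not written yet.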

LPHP: Don’t the teams have to be restructured? You can’t just have a hardware team handing off the design to the software team.
Venkatesh: Yes, it needs to be co-designed.

LPHP: Are you seeing realignment in the supply chain because of the things that have to be done for power?
Trihy: It goes even beyond just the software. A lot of things we’re talking about here are the problems faced by early adopters. They’re designing as the process is coming up. If it’s more established, then a lot of these things are already fleshed out and there is more collateral and more experience. What we see is customers who are co-designing their chips while we’re designing the process. That, in itself, adds to the sheer complexity of this. Everything is moving so fast and moving concurrently.
Clark: And some things take longer than others. When we get our standard cell libraries delivered we have many different flavors that have to be characterized, because we build our own libraries with the technology models. The process takes three times as long as it used to. We’re already synthesizing and designing before the library is complete.
Wang: The same thing happens at the system level. You have hardware formed and the operating system. That’s why FPGA-based prototyping is growing so well. People want to get ahead on their system before silicon comes out.
Clark: There was a push for that in the early 1990s.
Wang: It’s not new, but it’s definitely booming now. There is a very significant increase in the last two years. One reason is time to market. You need to validate your platform before it becomes silicon. Another trend is that with the SoC-level simulation there is a lot you need to do. If you get a chip and you want to run logic simulation, you will never finish.

LPHP: In the past, process technology has always bailed out designers. Is the bag of tricks as deep as it used to be?
Trihy: The finFETs are a major change. You can get significantly less power or much higher performance. You have lower leakage than planar devices, and that’s going to be a game changer at the 14nm node. Those technologies are evolving. The foundry continues to look at tweaking as much as we can, but we face new challenges. At 20nm we have double patterning. A lot of our energy goes into working with EDA vendors to make sure their flows can account for new effects we see at 20nm and below. We’re trying to pre-solve issues.

LPHP: As we move into stacked die, we start getting lots of models—power, software, TLM. Are these models synchronized?
Wang: You need to cross your fingers. Things are getting very complicated. From a design and EDA vendor side, these models eventually will consolidate and a methodology will be created. Then there will be tools to facilitate that.
Clark: Every time there’s a new cost function like power there’s churn. There are a couple different models, and the different vendors have their own. It takes a while for the industry to converge on one. Liberty is sort of an industry standard.
Venkatesh: There is a lot of work to be done in modeling. You need good models at each of the design stages. Good is defined by how compact and how accurate they are. The other piece is whether they’re in sync with other models like the timing model. Along with modeling go the estimation tools. How accurate are they in predicting power? And if you’re predicting power, how accurate is that versus silicon?

LPHP: And that information has to go in two directions, right? It has to be measured in the models and at RTL and they have to be synchronized repeatedly.
Venkatesh: These models have to be synchronized with each other and down the design chain. There is a lot of room for progress here.
Clark: We’re struggling now with the UPF-CPF issue. UPF 2.0 is more compatible with CPF, but a lot of our tools don’t support UPF 2.0 yet. We’re at 1.0 or 1.1. We’ve been working on our own source code format that we translate as a common source into UPF and CPF so we have some measure of confidence the UPF and CPF match each other. We want to use one for implementation and one for verification.
Wang: Even within the same format you have pre-DC (Design Compiler) UPF and post-DC UPF. This model validation is a problem. We’re looking at tool enhancements to check the consistency of these models.
Clark: The initial UPF is from a high-level functional standpoint. But then you add test. How do you add test to UPF with the right intent?
Wang: A lot of times people talk about CPF’s format being different. Yes, the format is different, but a lot of times so is the methodology. With DFT, UPF and CPF both have a problem. You should be able to automatically abstract out DFT logic and compare the power intent.
Clark: And you don’t want the DFT to be on when you’re in functional mode. How do you verify that? By having our own internal format our RTL and system guys don’t have to understand UPF and CPF. If we can put it in a specification format so that we can extract the right information out of it, then we can get the system architects more involved in the power architecture details.
Venkatesh: A big piece in lowering power is power intent verification. You can have a lot of great techniques, but unless you can verify that—pre-synthesis UPF, post-synthesis UPF—you will have silicon failure.
Clark: And being able to review it is critical. If a system architect writes a spec and I translate it into UPF and CPF, they can’t read my UPF file.
Trihy: When you get down to the circuit level and you want to run a power analysis tool, are you even able to go back and validate it?
Venkatesh: There are two aspects. One is how much power it is consuming. The second is the power intent. If you have a block, how many domains does it have and is there isolation logic?
Clark: When block A is on, what is block B doing? Is it on or off?
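
One of the power intent checks described here (which domains exist, and whether isolation logic is present when one block is off while another is on) can be sketched in a format-neutral way. The code below does not parse UPF or CPF. The `Crossing` type, the state table and the function name are all hypothetical, and the rule it applies is the common one that a signal driven from a domain that can be off while its receiver is on needs isolation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crossing:
    """A signal path from a driving power domain to a receiving one."""
    source: str
    sink: str
    isolated: bool


def missing_isolation(power_states, crossings):
    """Return the crossings that need isolation but do not have it.

    power_states: list of dicts mapping domain name -> True (on) / False (off).
    A crossing needs isolation if, in any legal state, its source domain is
    off while its sink domain is on.
    """
    problems = []
    for crossing in crossings:
        needs_iso = any(
            not state[crossing.source] and state[crossing.sink]
            for state in power_states
        )
        if needs_iso and not crossing.isolated:
            problems.append(crossing)
    return problems


# Hypothetical two-block design: A is always on, B can be shut off.
STATES = [
    {"A": True, "B": True},   # both blocks running
    {"A": True, "B": False},  # B powered down while A stays on
]
CROSSINGS = [
    Crossing("B", "A", isolated=False),  # B drives A with no isolation cell
    Crossing("A", "B", isolated=False),  # A is never off while B is on
]

print(missing_isolation(STATES, CROSSINGS))
```

Only the B-to-A crossing is flagged, because B can be off while A is still listening to it.
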
Trihy: At some point you want to predict the power. But can you actually measure it today?
Venkatesh: That’s different from the power intent. At the gate level, you can have 10% accuracy. At the RTL, you can have 20%.
Wang: It all depends on the patterns. That’s why you run emulation.
Clark: Yes, for statistical patterns.
Wang: Accuracy depends on your abstraction level. But in addition to the power intent model, the power consumption model is still a problem. Typically you look at what’s happened in the past. If you look at power consumption of IP at the bit level, you have to model it at some higher level and take into account the bus activity. That is completely new. There is a lot of research work going on there. At the cell level, we have done a pretty good job. Liberty is pretty comprehensive and accurate.
Trihy: I don’t agree. In practice Liberty is not. For timing, we have so much higher expectations that with static timing analysis we will match the frequency we get in silicon. For power, we don’t.
Wang: If you have 10 gates and you miss one gate, you’re done. But if you have 100 million gates on your chip and you miss one gate, who cares? If you look at average or dynamic power, accuracy is very crude.
Clark: But power usually isn’t a showstopper. If you use a little too much power, the battery life might not be as good as you hoped.
Venkatesh: Or you drop an application.
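
Wang's point about missing one gate can be illustrated with the standard switching-power estimate, in which each net contributes 0.5 x C x Vdd^2 x toggle rate. The sketch below is an assumption-laden toy: every net has the same hypothetical capacitance and toggle rate, and leakage and short-circuit power are ignored. It shows only how the error from one missed net shrinks as the design grows.

```python
def dynamic_power_w(nets, vdd):
    """Average switching power in watts.

    nets: iterable of (capacitance_farads, toggles_per_second) pairs.
    Each net contributes 0.5 * C * Vdd^2 * toggle_rate.
    """
    return sum(0.5 * cap * vdd ** 2 * rate for cap, rate in nets)


def relative_error_if_one_net_missed(n_nets, cap_f=1e-15, rate_hz=1e8, vdd=0.9):
    """Fraction of total power lost when one of n identical nets is left out.

    The capacitance, toggle rate and supply voltage defaults are hypothetical.
    """
    nets = [(cap_f, rate_hz)] * n_nets
    full = dynamic_power_w(nets, vdd)
    partial = dynamic_power_w(nets[:-1], vdd)
    return (full - partial) / full


# About 10% error for a 10-net design, about one part in a million
# for a million-net design.
print(relative_error_if_one_net_missed(10))
print(relative_error_if_one_net_missed(1_000_000))
```

The same arithmetic does not apply to timing, where a single missed path can break the chip, which is the contrast Trihy and Wang are drawing.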

Chip Architect Challenges

Thursday, December 6th, 2012

By Ann Steffora Mutschler
Product lifecycles can be shorter than the design cycle and even the process development cycle, particularly in the consumer handheld device market. It’s up to the chip architect to decide how the functions should be implemented.

The good news is there are a number of options available, ranging from mapping the design to 2.5D technology, moving to finFET transistors, to making sure the design is prototyped early enough to meet the market window.

“You can’t even design an SoC before the kids want a new gadget,” observed Subi Kengeri, vice president of advanced technology architecture in the office of the CTO at GlobalFoundries. “If you probe into this a little bit as to why the product lifecycle is shorter than the design cycle, it’s because the technology has gotten so complex. To design in that complex technology, number one, the EDA is becoming more and more complex so the runtimes of all the EDA tools is not easy. Number two, every time you have more transistors available to you in a new technology and you want to integrate them to get the best out of that technology, you are creating new functions and features so the SoCs are becoming complex. Number three, you need more designers to go and design more complexity—more manpower and brainpower. It’s not just bodies; it’s not just like any design engineer can design the advanced SoCs anymore because they are so complex you need skilled talent and that’s scarce.”

To make matters worse, with advanced technologies there are so many challenges that progress is slowing. And with investments ranging from $50 million to $80 million—and sometimes much higher—ROI becomes a major concern. “You can’t simply put in $50 million to $80 million, forget about it and move on to the next node without having recouped or leveraged that investment,” he said.

The product development cycle and the product longevity always have been an interesting mix. How they go together depends on the market segment and frequently the geography, as well.

“For the consumer electronics market, where one is hyper-obsessed with the SoC in cell phones, the longevity of the product is determined by the length of the contract, which are typically two years from the user’s perspective,” said Gene Matter, senior applications manager at Docea Power. “You upgrade your phone every two years. From the OEM developer’s perspective, the challenge is to release two to four huge model offerings at least two to three times a year. Then for the poor guy who is designing the base SoC in the platforms of the systems, typically you start with one major derivative and then minor derivatives from that.”

As such, the power architect must have a good mastery of the overall cadence and sequence of major product introductions, especially in the consumer electronics market. Key to this process are the availability of new IP and a new architecture.

“For the really, really good architects, you have the five-year roadmap,” Matter said. “You’ve already sequenced this out and you shouldn’t have a lot of surprises. If you do, it means you’re not doing your job well. If you don’t understand the sequence of your products, you don’t understand the dependencies of the operating system, the process technology…then you should find something else to do.”

Defining the future

If the power architect’s job is to take the technology that is available and to define the future, then they have to look well beyond the product into portfolio management. That job is similar to the platform architect, or in years past, the chief architect whose job was akin to a master builder.

“In the old days, the master builders of churches or master builders of the pyramids or master builders of anything had a long view/portfolio view of what they were doing,” Matter said. “The power architect has to have this master builder approach. By that I mean you’ve got to look at the framework and structure of a power management architecture that can transcend just a single product implementation. You’ve got to think product family, product portfolio, technology portfolio. And it means the power architect has to really up-level his thinking to frameworks that are sensible and easily extensible, and which deliver the most value for the broadest set of products and can be optimized for the products.”

Optimization itself is a major challenge to be reckoned with. The optimization point for power 10 years ago was battery life. The optimization point for power now, particularly in the mobile market, is time between recharges and idle time. Especially in regard to idle power reduction because of increases in performance, Matter explained, “we can get a task done really, really quickly. Most of the time, it’s sitting around waiting for something to go do but the user doesn’t like to wait for a response, and they get really torqued off when their battery runs down–particularly when it’s doing absolutely nothing. So now the optimization point is about ‘instant on,’ always accessible, always available, always connected.”

With server and embedded applications, different power optimization points occur that are just as complicated, including managing power budget and power delivery, thermal virtualization, managing cooling capacity and cooling budgets. As a power architect, the big focus needs to be future power challenges and what framework should be applied. What should be extended and what should be updated to reflect the new optimization points?

Device choices reflect market pressure

With product lifecycles and design lifecycles very close to being aligned, design teams can’t afford to take 18 months to crank out an ASIC because if you do, you’re dead, suggested Kirk Saban, a senior product line manager at Xilinx. “Like Samsung says on their commercial, ‘The next big thing is already here.’”

He believes this speaks to exactly why companies are investing in hardware infrastructure for FPGA-based ASIC prototyping and all of the EDA tools to go around it. They need to innovate or they will not be able to keep up with the market.

This pressure to keep up is also why Xilinx continues to generate new design wins and new revenue in what traditionally would have been an ASIC or ASSP play, whether it be in wired communications or wireless—one of the company’s core business segments, Saban said.

“All of those comms customers make these kinds of decisions, and in that case they are using the FPGA for a different purpose. They are not using it for ASIC prototyping. They’re actually using it in a system where they would have in the past maybe considered building an ASIC for it,” he added.

Conclusion

The challenges to power optimization extend beyond just the device itself to the framework of how power is viewed with a long-range perspective. This is the job of the SoC master builder who not only must juggle a number of choices and pick the best path to implementation for a device, but also have the ability to plan for the future derivatives. Understanding the implementation options and manufacturing options will only get more complicated as designs dive below 20nm.

How Long Will 28nm Last?

Thursday, December 6th, 2012

By Ann Steffora Mutschler
As soon as a next generation semiconductor manufacturing process node is out, bets are taken on just how long the current advanced process node will last. The 28/20nm transition is no exception.

There is certainly a benefit to moving from 40nm to 28nm. The availability of high-k/metal gate technology offers quite a few advantages in terms of power reduction and performance improvements, particularly in the datapath and other high-speed areas.

“The high-k/metal gate process has offered design teams the ability to get reasonable (2 to 2.5GHz) speed in the datapath with good power numbers,” said Navraj Nandra, senior director of marketing for DesignWare Analog and MSIP at Synopsys. “That’s been a good addition to the process. Another factor that’s going on now is that to get low leakage power and a lower cost, TSMC and other foundries have introduced the silicon oxynitride (SiON)/poly process on 28nm and that is positioned as a cheaper way of getting into 28nm. It’s addressing what you could call the low end market where people are building chipsets for tablets and smartphones to offer it to a market better positioned for the price point.”

Not only is the 28nm node going strong, so is 20nm, according to Pete Hardee, low-power design solution marketing director at Cadence. In a survey the company took during its Low-Power Technology Summit, held last month in the San Francisco Bay Area, attendees were asked which process node they were currently working on.

According to Cadence, the results were as follows:

40% — designing for 32/28nm
21% — designing for 22/20nm
16% — designing for 45/40nm
12% — designing for 65nm
11% — other

He noted that data is likely skewed for the geographic region, and is not a global number, but it shows that the 32/28nm node is definitely an extremely strong node, and that it is “relatively advanced at this life stage.” These results would seem obvious if most of these designers were in the mobile space, but interestingly, 49% of survey respondents said they were in a non-mobile market, Hardee noted.

More results from Cadence’s Low-Power Technology Summit can be found here.

Power techniques at 20nm
In terms of the power optimization and power management techniques used frequently at the 28nm node, Sudhakar Jilla, director of marketing for place and route products at Mentor Graphics, does not believe the transition from 28 to 20nm will take away any of those techniques used today. In fact, there might be a richer Vt library set from which to optimize the leakage, power optimization, etc.

But he pointed out that moving from bulk CMOS to fully depleted SOI provides many of the same benefits—lower voltage, lower power and equal or better performance—as moving from 28nm to 20nm bulk CMOS. “So there are different tradeoffs, and of course when it comes to power, different application segments have different power needs. But, in general, if you are going to a lower technology you can go to lower Vdd and some of these other technologies give you better power/timing tradeoffs and power/performance tradeoffs.”

Andy Inness, place and route product specialist at Mentor, noted that the power techniques at 28 and 20nm ought to be fairly similar. “You may have more options and voltages because you can go a little bit lower at 20nm. For the most part, though, it seems to be picking the right combination of node and FD-SOI and finFET, cost versus need. FD-SOI seems to have a wider range. Because you can go lower voltage at FD-SOI, it probably has a wider range of voltages to work with. So if your sleep mode is on, a wireless device can go to a lower voltage on FD-SOI than non-FD-SOI while still leaving your functional mode at the same voltage.”

That wider range gives performance versus shut down mode benefits. “You can get high performance but take advantage of the extra lower voltage for sleep mode to save the power in that because the lower voltages can give you more leakage and some other things. So being able to tolerate that lower voltage can be helpful,” he added.

The last hurrah
As design teams look to move to 20 nm, they are well aware it will be the last hurrah for planar CMOS, Hardee noted. Intel already is using Tri-Gate, its variant of finFET, at 22nm. TSMC will begin implementing finFETs at 16nm, while GlobalFoundries is adding them at 14nm.

“If you look at the timescales of the planning of those nodes, there may only be about a year between 20nm being up to speed and the 16/14nm node right on its tail. This is a little quicker than the typical two years between nodes that we’ve previously seen. For that reason there may well be various considerations of whether people want to move to 20 or not.”

In addition, there’s uncertainty currently as to whether 20nm will give a performance advantage along with the promised power advantages. Leakage and variability issues loom large at 20nm and not all processes at this node are going to change the pitch as far as metallization is concerned. This means that there’s not necessarily going to be the same device shrinkage moving between nodes as there has been previously, he observed. “Do I actually get smaller chips when I move to this new node? That’s a little unclear at the moment until all that’s been fixed.”

When it comes down to it, there’s been a significant investment in terms of tools, IP and infrastructure by a lot of companies on 28nm, and they will certainly want to see a ROI on those investments. That will affect how long they stay at that node, and where they go next.

Mix-And-Match Power Options

Thursday, December 6th, 2012

By Ann Steffora Mutschler
Choices abound today when it comes to considering a node shrink. Fully depleted silicon on insulator (FD-SOI) and finFET technologies along with other advanced transistor options are being evaluated, both together and independently of the other. It is possible to implement finFET on bulk 28nm CMOS or finFET on an FD-SOI process, for example. It is also possible to implement 20nm planar CMOS with or without finFET and with or without FD-SOI.

“From a high-level perspective, the FD-SOI has to do with how the substrate and the wells are all manufactured, but that’s below where the finFET is on the gate drain side and they are independent of each other. One looks at how the gates are manufactured, and the other is what sits on top,” explained Andy Inness, place and route product specialist at Mentor Graphics. “The FD-SOI is proving to be helping even more than finFET, and especially since they can go together, we are getting to these small gate finFETs, and the isolation from the FD-SOI is helping bring down variability and also improving performance because it gives more predictability to how the wells are going to operate. It’s more isolated and therefore more predictable and less transient current, less leakage current—you get some better performance and more predictability even if it costs you a little bit in manufacturing.”

From the foundry perspective, Subi Kengeri, vice president of advanced technology architecture in the office of the CTO at GlobalFoundries, believes the power benefits of the combination of finFET and FD-SOI are impressive. “For the first time in the history of technology scaling we are actually able to reduce the operating voltage to a comfortable level. We were simply not able to reduce the voltage because of the technology issues. SRAMs would have failed previously. But now, because of the fully depleted device and the finFETs for your low voltage scaling, for example, we dropped the operating voltage 100mV from 20nm planar to finFET. That 100mV makes a big difference to our customers and applications because it has a V squared effect in terms of reducing power. You drop the operating voltage by 100mV—let’s say from 0.9 to 0.8—that’s about a 20% power reduction right there.”
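
The "V squared effect" in that quote can be checked with a few lines of arithmetic. The sketch below assumes that only the supply voltage changes, with switched capacitance, activity and clock frequency held constant, so dynamic power scales with the square of the voltage. The function name is illustrative only.

```python
def dynamic_power_ratio(v_old, v_new):
    """New-to-old dynamic power ratio when only Vdd changes (P proportional to V^2)."""
    return (v_new / v_old) ** 2


# Kengeri's example: dropping the supply from 0.9V to 0.8V.
reduction = 1.0 - dynamic_power_ratio(0.9, 0.8)
print(f"{reduction:.1%}")  # roughly 21%, in line with the "about 20%" quoted
```

A 100mV drop is worth less at higher starting voltages under the same assumption: going from 1.2V to 1.1V gives a reduction of about 16%.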

Further, the finFET transistor allows the device to operate at a wide range because it is a fully depleted device and has a lower Vt compared to planar CMOS, as well as smaller random dopant fluctuations that allow for tighter variability control, he said.

With these choices, how do design teams decide what to adopt and when?

“We’ve seen a couple of customers where FD-SOI seems to be the one that they’re gravitating towards first because, from an EDA perspective, from a place and route perspective, it’s almost transparent,” said Inness. “You can almost just map the GDS and just plug in different library cells that are the same footprint, and same everything, but still gain the performance and/or the power benefits from it. You can go from 28 bulk to 28 FD-SOI on the same design and it’s nearly just a direct map. There is minor cleanup, but from a big picture there’s very little difference.”

What this will require from the tools is the understanding that certain cells cannot be placed directly next to certain other cells, depending on the internal cell design, Inness said. “All of the implementation tools need to recognize this, this pairing of cells and how they can be placed against each other. And whether it is double patterning or finFET— both of these are just rules that have to do with how close certain cells can be to what other specific types of cells. Beyond that there isn’t a lot from the implementation side that we really need to comprehend on that transistor side. There’s double patterning, but that’s more on the routing and not specific to finFET and FD-SOI.”

For designers, most of this will be handled by the tools, he said. “From a user perspective there are probably not a lot of updates, but there is a tremendous amount of R&D on how to handle these complex interrelationships. But that’s all under-the-hood stuff that the EDA companies need to worry about. There’s still a fair amount of technology needed to make it transparent to the user.”

However, to say that finFET won’t impact verification really depends how closely you are looking, said Cary Chin, director of marketing for low-power solutions at Synopsys. “Clearly, finFETs are a fundamental change in the way that we’re building transistors. We’re moving from relatively planar technologies to 3D technologies. At a very low level, when you’re looking at tools for building transistor models or for verifying technologies and things like that, there are very significant changes. For the last year or so we’ve been hearing announcements from all of the major foundries in the world and everyone’s moving to this technology. That’s because it has huge promise, especially in the area of low power. That was one of the driving forces for moving to finFETs.”

One of the big challenges is to make new technology as consistent with existing design flows as possible.

“We would like not to have to change the implementation flows too much in order to be able to scale things directly and move designs quickly,” Chin said. “At the low level things are changing a lot so all of the tools having to do with transistor level modeling, library characterization, RC extraction—all of those things that have more to do with the physical side. Even the mask generation and optimization tools on the physical side are changing a lot. If we really wanted to optimize our existing design flows for this technology we could probably milk more out of it, but that would be at the expense of some of the implementation and verification flows. Right now that’s something that we want to do. It makes more sense to build as much of a layer as we can over the technology so we continue to cruise.”

If the industry is successful at building this layer on top of the technology, then the SoC or chip designer will be relatively isolated. “But what’s going to happen is the designer will be relatively insulated from the fundamental transistor technology, but all of the design parameters that go along with building more and more effective low-power kinds of designs are going to become more important,” Chin said. He noted that over the next few years more design time will be spent looking at low-power designs versus high-performance tradeoffs. This comes right back around to finFETs, which promise better performance at the same power or lower power; performance can be traded for power.

Essentially, he believes we’re in pretty good shape for finFETs. “There’s a lot of work that’s already been done making the higher-level design – the traditional chip design process – relatively transparent, which is great. Because we’re going to have these technologies available to us in the next five years, it’s going to enable us to do a high-level rework of the tools and continue to push on the idea of integrated power optimization into the flow.”

Getting Ready For 15nm

Thursday, October 7th, 2010

By David Lammers
The trends towards vertical transistors, non-silicon channel materials, and resistive RAMs promise to hold center stage at the 2010 IEEE International Electron Devices Meeting (IEDM), set to begin Dec. 6 in San Francisco, Calif. (www.ieee-iedm.org)

Taiwan Semiconductor Manufacturing Co. (TSMC, Hsinchu, Taiwan) will present a 22/20nm technology platform based on a FinFET architecture. The TSMC paper describes a full CMOS technology, complete with silicon germanium stressors, high-k/metal gate, and dual-epitaxy technology. TSMC said it demonstrated a 0.1 µm² SRAM cell, which operated at a 0.45V operating voltage (Vmin) with a 90 mV noise margin.

While TSMC is expected to shift from today’s planar transistors to the vertical FinFET devices at the 14nm generation in the 2015 time frame, the IEDM 22/20nm paper demonstrates that the world’s leading foundry has the FinFET manufacturing challenges well in hand. TSMC used 193nm immersion lithography to achieve NMOS and PMOS drive currents of 1200/1100 µA/µm respectively, at off-currents of 100 nA/µm.

Fig. 1: TSMC will unveil a complete FinFET-based 22/20nm CMOS logic technology at IEDM 2010. Electron microscope images show a cross-section of the vertical fin’s sidewall.

While creating 20nm gate-length vertical transistors is “demanding,” due to parasitic capacitances and other challenges, an abstract of the TSMC paper said the FinFET architecture allows continued scaling with good electrostatic control of the channel. To accomplish its scaling goals, TSMC turned a series of process technology knobs, including embedded SiGe to strain the PMOS channel, stress memorization techniques in the NMOS devices, an optimized contact edge stop layer (CESL), dual work functions, and both epitaxial silicon and boron-doped e-SiGe in the source and drain regions. Compared with planar transistors, the TSMC paper will describe much (100x) improved leakage from the source and drain regions, critical for low-power mobile systems.

Intel and IQE Inc. researchers will describe their latest advances with a FinFET architecture based on an InGaAs quantum well technology. At the 2009 IEDM, Intel described a surface-channel InGaAs FinFET. The quantum well InGaAs FinFET features fins, which are 35nm-wide and smaller, 5nm gate-to-drain and gate-to-source separations, and a high-k gate dielectric.

Intel and its research partner have been developing quantum-well compound devices as successors to silicon CMOS. The paper to be presented at the 2010 IEDM takes the InGaAs technology from a planar to a FinFET architecture, which delivers much-improved control of the channel compared with the planar devices described at the previous meetings. Also, the paper describes a high-k dielectric with a Tox of 20.5 Angstroms and good interface properties.

An InGaAs MOSFET will be presented by a team led by the University of Tokyo. The device features a 3.5nm channel, the smallest such device to be described thus far. The dual-gate device was created on a silicon substrate using wafer bonding.

Memories taking resistive turn
On the memory front, researchers from Intel and Micron Technology have developed a 25nm multi-level cell (MLC) NAND memory technology, with a cell size of 0.0028 µm² – the smallest transistor now in production. An air gap was introduced between word lines to control the word line-to-word line capacitance and cell-to-cell interference.

The MLC device uses only 30 to 40 electrons per level, which requires advancements in the insulating tunnel oxide and the inter-poly dielectric in order to confine the charges. The cell has an asymmetric design, with a word line half pitch of 24.5nm and a 28.5nm half pitch in the bit line direction, allowing for insertion of the control gate between the floating gates. The technology is used for 64-Gbit NAND memories.

The authors will describe how the Intel-Micron team dealt with dopant fluctuations, structural bending, and other challenges presented at such small dimensions.

Fig. 2: Researchers from Intel and Micron Technology will describe the 25nm 64Gbit multi-level cell (MLC) NAND technology. The image shows the select gate and contacts in the bit line direction.

Resistive RAMs (RRAMs), which use a voltage to alter the resistive state of metal-based compounds, have emerged as a path to higher-density non-volatile memories once NAND flash scaling reaches its limit. A functional transition-metal-oxide resistive memory (TMO-RRAM) developed at the National Nano Device Laboratories in Taiwan has a record 9nm half-pitch, with a programming current of less than 1 µA, which compares with about 20 mA for phase-change memories. The researchers controlled the device’s resistivity by changing the chemical composition of the tungsten-oxide layer. They postulate that the memory’s change in resistance is due to the controlled movement of oxygen ions, with a monotonically varying ratio of oxygen and tungsten atoms.

The Taiwan laboratory’s research team includes Chenming Hu, a professor at the University of California, Berkeley. In an abstract of the paper, the researchers said the “unexpectedly low” 1 µA current required to set and reset the RRAM cell makes it a promising candidate for low-power non-volatile memories.

The reported progress with exploratory RRAMs comes amid concerns about the power consumption of phase-change RAMs (PC-RAMs), which use heat to change the resistive state of a chalcogenide material. At IEDM, a team from the IBM/Macronix PCRAM Joint Project will describe a previously unknown failure mechanism for phase-change memories, apparently related to electromigration stemming from the polarity of the operating current.

At the high current densities required to change the state of the chalcogenide material, the researchers found that hole-induced electromigration occurs when current polarity is reversed. The paper claims that the phenomenon causes voids at the interface between the phase-change material and the bottom electrodes, limiting their cycling endurance by four orders of magnitude. The team also will discuss countermeasures to deal with the effect.

IBM researchers also will describe their latest-generation SOI-based embedded DRAM (eDRAM), enhanced with a high-k/metal gate technology. Big Blue claims eDRAM delivers several advantages over SRAM for large on-chip caches, including higher density, better soft error rates, and lower power consumption. The performance rivals SRAM speeds, with the SOI eDRAM delivering a sub-1.5ns latency and 2ns cycle time.

The 32nm eDRAM uses a deep trench capacitor with 25 percent higher capacitance and much less resistance than conventional memories based on SiON/poly gate stacks. IBM said it uses a high-k/metal gate technology to reduce leakage and control the threshold voltage of 40 mV. IBM created a 32 Mbit array from cells measuring 0.39 µm². The eDRAM is 3-4x smaller than a comparable SRAM, enabling a much higher-density on-chip cache, the abstract of the paper said.