
IEEE TRANSACTIONS ON ELECTRON DEVICES, VOL. 54, NO. 4, APRIL 2007

Power Optimization for SRAM and Its Scaling


Eiji Morifuji, Member, IEEE, Dinesh Patil, Student Member, IEEE,
Mark Horowitz, Fellow, IEEE, and Yoshio Nishi, Fellow, IEEE

Abstract: With technology scaling, there is a strong demand for smaller cell
size, higher speed, and lower power in SRAMs. In addition, there are severe
constraints for reliable read-and-write operations in the presence of
increasing random variations that significantly degrade the noise margin. To
understand these tradeoffs clearly and find a power-delay optimal solution for
scaled SRAM, sequential quadratic programming is applied to the optimization
of 6-T SRAM for the first time. We use analytical device models for transistor
currents and formulate all the cell-operation requirements as constraints in
an optimization problem. Our results suggest that, for optimal SRAM cell
design, neither the supply voltage (Vdd) nor the gate length (Lg) scales, due
to the need for an adequate noise margin amid leakage and threshold
variability and the relatively low dynamic activity of SRAM. This is true even
with technology scaling. The cell area continues to scale despite the
nonscaling gate length (Lg), with only a 7% area overhead at the 22-nm
technology node as compared to simple scaling, at which point a 3-D structure
is needed to continue the area-scaling trend. We also find that the
suppression of gate leakage helps to reduce the power in ultralow-power SRAM,
where subthreshold leakage is minimized at the cost of an increase in cell
area.

Index Terms: CMOS memory integrated circuits, logic devices, power
consumption, SRAM chips.

I. INTRODUCTION

SRAM is a critical component in almost all digital systems, from
high-performance processors to mobile-phone chips. In these applications,
density, power, and performance are all critical parameters. Historically,
power for digital logic, which is dominated by dynamic power, has been reduced
by lowering the supply voltage (Vdd) [1]. The supply voltage for the
digital-logic portion has reached around 1 V [2]–[4]. Vdd scaling reduces
dynamic power, but because the switching activity per cell is much lower,
leakage power dominates in SRAM. In addition, there are severe constraints on
cell noise margins for reliable read-and-write operation. Also, as device size
is scaled, random process variations significantly degrade the noise margin.
Considering all these effects, the bit yield for SRAM is strongly influenced
by Vdd, threshold voltage (Vth), and transistor-sizing ratios [5]. Therefore,
it is complicated to determine the optimal cell design for SRAM. To understand

Manuscript received April 4, 2006; revised August 9, 2006. This work was
supported in part by the MARCO Focus Center for Circuit and System Solutions
(C2S2, www.c2s2.org). The review of this paper was arranged by Editor
T. Stotnicki.
E. Morifuji was with the Center for Integrated Systems, Department of
Electrical Engineering, Stanford University, Stanford, CA 94305 USA. He is now
with the System LSI Division, Semiconductor Company, Toshiba Corporation,
Yokohama 235-8522, Japan (e-mail: eiji.morifuji@toshiba.co.jp).
D. Patil, M. Horowitz, and Y. Nishi are with the Center for Integrated
Systems, Department of Electrical Engineering, Stanford University, Stanford,
CA 94305 USA.
Digital Object Identifier 10.1109/TED.2007.891869

these tradeoffs better and to find an optimal solution for scaled SRAM, we
have established a new optimization procedure based on sequential quadratic
programming (SQP) for 6-T SRAM. We perform optimization in terms of power,
cell area, and speed. First, we represent the current-voltage (I-V)
characteristics of scaled transistors with a number of analytical models and
formulate all the metrics, such as power, noise margin, cell area, and read
speed, as constraints in the optimization. There are also other constraints to
ensure proper read-and-write stability of the SRAM. The problem can be set up
with any one of the metrics as the objective and the others as constraints.
The optimizer returns the optimal device lengths and widths and the supply and
threshold voltages, which are the design variables. By changing one of the
constraining metrics, we obtain optimal tradeoff curves between various
metrics. In addition, we obtain the trend of these curves as the design rules
are scaled with advancing technology. The results indicate that, for these
models, the optimal approach for obtaining lower power at a constant read
speed is to effectively stop scaling Vdd and gate length and, in fact, to
increase Vdd slightly. The benefit of a 3-D device structure for maintaining
the area scaling and of a high-k material for reducing gate leakage is also
confirmed by changing the device parameters in the optimizer. We propose two
sets of device scaling for SRAM in this paper: one for ultralow-power SRAM and
the other for high-density SRAM.
In the rest of this paper, we describe the modeling of the SRAM and the
optimization methodology, which includes the formulation of various metrics as
constraints, followed by the results and analysis.
II. MODELING OF THE SRAM AND OPTIMIZATION METHODOLOGY
We use the SQP-based optimizer in MATLAB to tune our design variables to
achieve minimum power for the system for various values of cell area. In order
to do this for various 6-T SRAMs in a unified manner, we include all the
constraints that ensure circuit operation and correct device-scaling behavior.
These constraints contain key variables such as Vdd, Lg, oxide thickness
(Tox), Vth, and gate widths (Wg) for a 6-T SRAM. The device models cover the
short-channel effect, gate leakage, junction leakage, and mismatch. We develop
analytical models describing key metrics such as cell area, power, noise
margin, and read speed. We base our device models on the 65-nm CMOS technology
published in [6]–[9]. To check the validity of the model and the method, the
power obtained by this methodology is calibrated against the SRAM power per
bit provided in [9], where SRAM power is shown for different cell areas, Vth,
Tox, and Vdd in the 65-nm technology case.

0018-9383/$25.00 © 2007 IEEE


The following sections describe the various constraints of our optimization.
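The structure of the problem (one metric as the objective, the others as constraints, and a sweep of one constraint to trace a tradeoff curve) can be illustrated with a deliberately small stand-in. The sketch below is not the SQP optimizer used in this paper: it swaps SQP for an exhaustive grid search over two variables (gate width and threshold voltage), and every model and constant in it (`toy_power`, `toy_read_current`, `I_READ_MIN`, the leakage prefactors) is a hypothetical placeholder chosen only to show how the pieces fit together.

```python
import math

# Hypothetical constants, for illustration only (not taken from the paper).
VDD = 1.2            # supply voltage [V]
M_VT = 1.3 * 0.0259  # body factor times thermal voltage [V]
I_READ_MIN = 100.0   # required read current [arbitrary units]

def toy_power(w, vth):
    """Toy standby power: a subthreshold term plus a width-proportional term."""
    subthreshold = 1e3 * w * math.exp(-vth / M_VT)
    gate_and_junction = 0.01 * w
    return VDD * (subthreshold + gate_and_junction)

def toy_read_current(w, vth):
    """Toy drive current: proportional to width and gate overdrive."""
    return 100.0 * w * (VDD - vth)

def min_power_for_area(w_max, w_grid, vth_grid):
    """Minimize power subject to the read-current and width (area) constraints."""
    best = None
    for w in w_grid:
        if w > w_max:                      # area constraint (width as area proxy)
            continue
        for vth in vth_grid:
            if toy_read_current(w, vth) < I_READ_MIN:   # read-speed constraint
                continue
            p = toy_power(w, vth)
            if best is None or p < best[0]:
                best = (p, w, vth)
    return best                            # (power, width, vth) or None

W_GRID = [0.5 + 0.05 * i for i in range(51)]       # 0.5 ... 3.0 (arbitrary units)
VTH_GRID = [0.20 + 0.005 * i for i in range(101)]  # 0.20 ... 0.70 V

# Sweeping the constraining metric traces out a power-versus-area curve.
tradeoff = [(w_max, min_power_for_area(w_max, W_GRID, VTH_GRID))
            for w_max in (1.1, 1.2, 1.3, 1.5, 2.0, 3.0)]
```

In this toy model, relaxing the width limit lets the search raise Vth while still meeting the read-current requirement, so the minimum power falls as the allowed area grows and then flattens once the width-proportional term takes over. A gradient-based SQP solver would replace the two nested loops when the number of design variables makes a grid impractical.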
A. Device Models and Constraints
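We use the following I-V equations, which include the velocity-saturation
effect and the body factor, for the linear current (Idlin) and saturation
current (Idsat) of bulk CMOS transistors [10]:

$$
I_{dlin} = \frac{\mu_e C_{ox} \dfrac{W_g}{L_g} \left[ (V_g - V_{th}) V_{ds} - \dfrac{m}{2} V_{ds}^2 \right]}{1 + \dfrac{\mu_e V_{ds}}{v_{sat} L_g}} \tag{1}
$$

$$
I_{dsat} = C_{ox} W_g v_{sat} (V_g - V_{th}) \, \frac{\sqrt{1 + 2 \mu_e (V_g - V_{th}) / (m v_{sat} L_g)} - 1}{\sqrt{1 + 2 \mu_e (V_g - V_{th}) / (m v_{sat} L_g)} + 1} \tag{2}
$$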

where μe is the effective mobility, m is the body factor, Cox is the gate
capacitance, and vsat is the saturation velocity. We estimate the effective
mobility by using the universal MOSFET inversion-layer carrier-mobility
models, expressed as a function of Vgs, Vth, and Tox, found in [11]. This
model is calibrated with actual MOSFET characteristics [8], [9]. The minimum
achievable Lg and Tox for the 65-nm CMOS-technology node are assumed to be 37
and 1.2 nm, respectively [1]. In order to ensure the reliability of the MOSFET
during the optimization, maximum allowed vertical and lateral electric fields
are specified. These are defined as a function of Vdd and are extracted by
plotting the published technology parameters for the 45-, 65-, 90-, and
130-nm nodes [8], [12]–[14]. Constraints for the short-channel effect are also
specified using a drain-induced barrier lowering (DIBL) of 80 mV/V and an
S-factor of 100 mV/dec. These parameters can be estimated by evaluating the
normalized channel length (λ) [15], which is a function of Lg, Tox, and
substrate concentration (Nsub). The constraint assumed here corresponds to
Lg/λ > 1.5 for controlling short-channel effects. Gate leakage for nitrided
SiO2 is estimated by the empirical relationship

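$$
I_g = A_o \left( \frac{V_{ox}}{T_{ox}} \right)^2 e^{-\alpha T_{ox} / V_{ox}} \tag{3}
$$

where Ao and α are extracted by fitting the actual published data [8], [9].
The junction leakage is also modeled by the empirical relationship

$$
I_j \propto N_{sub} \left( e^{qV/kT} - 1 \right). \tag{4}
$$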
The junction leakage is mostly dominated by the peripheral component,
especially in SRAM, because advanced technology suppresses the short-channel
effect by increasing the channel concentration only at the surface of the Si,
which is shallower than the depth of the drain junction. In this paper, the
junction leakage is proportional to gate width, because we take only the
peripheral junction leakage into account. This model for the junction leakage
is calibrated with data found in [6]. Band-to-band tunneling (BTBT) current is
not taken into account in this paper, for simplicity, because reverse p-n
junction leakage is still the major contributor, especially in
high-temperature operation. If the BTBT current becomes more critical in the
future, it can easily be taken into account. In addition, variation of the
device parameters can significantly influence the circuit behavior in SRAM. We
account for mismatch among the transistors in a cell or in a memory macro by
considering a normally distributed Vth with standard deviation σ(Vth) given by
$$
\sigma(V_{th}) = 3.19 \times 10^{-8} \, \frac{T_{ox} N_{sub}^{0.4}}{\sqrt{L_g W_g}} \tag{5}
$$

as derived in [16].
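As a quick illustration of how (5) behaves, it can be evaluated directly. This is a minimal sketch rather than code from this work: the function name `sigma_vth` is ours, the doping is assumed to be in cm^-3 with the result in volts, and the three lengths only need to share a unit because they enter through the ratio Tox/sqrt(Lg·Wg). The doping and width values are hypothetical.

```python
import math

def sigma_vth(tox, nsub_cm3, lg, wg):
    """Standard deviation of Vth [V] following the form of (5).

    tox, lg, wg: lengths in one common unit (only tox / sqrt(lg * wg) matters).
    nsub_cm3: substrate doping, assumed here to be given in cm^-3.
    """
    return 3.19e-8 * tox * nsub_cm3 ** 0.4 / math.sqrt(lg * wg)

# Tox and Lg are the 65-nm minimum values quoted in the text; the doping
# (3e18 cm^-3) and the width (100 nm) are hypothetical illustration values.
s_min = sigma_vth(tox=1.2, nsub_cm3=3e18, lg=37.0, wg=100.0)    # lengths in nm
s_long = sigma_vth(tox=1.2, nsub_cm3=3e18, lg=74.0, wg=100.0)   # doubled Lg
```

Doubling the gate length reduces the mismatch by a factor of sqrt(2), which is the lever the optimizer can pull when the noise margin is limited by mismatch.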
The off current of a single MOSFET (Ioff), as a function of Vth, Wg, and Lg,
is expressed as

$$
I_{off} = \mu_e C_{ox} \frac{W_g}{L_g} (m - 1) \left( \frac{kT}{q} \right)^2 e^{-q V_{th} / m k T}. \tag{6}
$$
The total Ioff is a sum over all the SRAM cells. With a Gaussian Vth, Ioff has
a lognormal distribution. Assuming weak correlation between neighboring cells,
we can estimate the average leakage by summing all the lognormal distributions
and taking the average. The Vth mismatch obtained from (5) is used in order to
correctly represent the off current in the chip. The probability-density
function of Ioff is given as


$$
f(I_{off}) = \frac{1}{I_{off} \, \sigma \sqrt{2\pi}} \, 10^{-\frac{(\log(I_{off}) - \mu)^2}{2 \sigma^2}}
$$

where μ and σ are the mean and standard deviation, respectively, of
log(Ioff), which is proportional to the random variable Vth. The average off
current E(Ioff) is then given by

$$
E(I_{off}) = 10^{(\mu + \sigma^2 / 2)} \tag{7}
$$

which is used as our leakage estimate.
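The effect of averaging over a lognormal distribution can be checked numerically. The sketch below is an illustration under stated assumptions, not the implementation used in this work: it uses the standard natural-log form of the lognormal mean, E[exp(X)] = exp(μ + s²/2) for Gaussian X, rather than the base-10 notation of (7), and the body factor, thermal voltage, nominal current, and mismatch values are hypothetical. The function names are ours.

```python
import math
import random

def mean_off_current(i_nominal, sigma_vth, m=1.3, v_thermal=0.0259):
    """Average of I = i_nominal * exp(-dVth / (m * kT/q)) for Gaussian dVth.

    i_nominal is the off current at the nominal Vth and dVth ~ N(0, sigma_vth^2).
    For X ~ N(mu, s^2), the lognormal mean is E[exp(X)] = exp(mu + s^2 / 2).
    """
    s = sigma_vth / (m * v_thermal)        # standard deviation of ln(I)
    return i_nominal * math.exp(0.5 * s * s)

def monte_carlo_mean(i_nominal, sigma_vth, n, m=1.3, v_thermal=0.0259, seed=0):
    """Brute-force check of the closed form using sampled Vth offsets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        dvth = rng.gauss(0.0, sigma_vth)
        total += i_nominal * math.exp(-dvth / (m * v_thermal))
    return total / n

# Hypothetical values: 1 nA nominal off current and 20 mV of Vth mismatch.
analytic = mean_off_current(1e-9, 0.020)
sampled = monte_carlo_mean(1e-9, 0.020, n=100_000)
```

Because the exponential is convex, the average leakage is always larger than the leakage of the nominal device, and the gap grows quickly with the Vth spread. This is why the mismatch from (5) has to enter the leakage estimate.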


B. Model and Constraints Related to SRAM Layout
Fig. 1 shows the 6-T SRAM cell configuration and its layout, which is the most
commonly used symmetric layout. The area of the cell is calculated as the
product of its vertical and lateral dimensions, which are broken down into
components of the key design rules defined in Fig. 1 and the device geometry
(Lg and Wg) for the driver, transfer, and load transistors. The design rules
assumed in this paper are those for the 65-nm technology node, where the half
pitch of the critical layer is 90 nm. All widths are constrained to be larger
than the contact-hole size.
C. Model and Constraints Related to SRAM Operation
The cell-power dissipation is expressed as the sum of the averaged leakage
(computed statistically for all transistors) and the dynamic power. For the
activity factor, we consider a 64-kb memory with a random 64-b word accessed
every cycle. Standby power is the sum of the subthreshold, gate, and junction
leakages. In the standby mode, Ioff at Vds = Vdd, Vgs = 0 flows in the load
and driver transistors, while the off current flowing in the transfer
transistors is different. We use SPICE to obtain the condition for the
worst-case bit-line leakage through the

Fig. 1. Configuration of the 6-T SRAM and the SRAM layout studied in this
paper. The cell area can be expressed as a function of Lg and Wg for each
transistor and the design rules.

Fig. 3. Sensitivity of the variation of SNM to mismatch, obtained by Monte
Carlo simulation in SPICE.

Fig. 2. Setup for investigating the worst-case transfer leakage in SRAM. The
right-hand side shows the leakage and bit-line voltage as a function of the
number of bits in the "1" state on a single bit line. Leakage depends on the
ratio between the number of nodes in the "1" and "0" states.

transfer devices. The SPICE model used in this paper is based on the
characteristics of the 65-nm technology node [8]. Fig. 2 shows the simulation
setup and results. We assume 512 cells connected to a bit line, with M of the
cells storing a one and (512 − M) cells storing a zero. The total leakage
observed through the transfer devices and the bit-line voltage are plotted as
a function of M; the point of maximum leakage is chosen. The bit-line voltage
in the maximum-leakage condition is 30 mV. Therefore, the transfer leakage in
the worst case is concluded to be the off current at linear operation, which
is estimated by replacing Vth by Vth + DIBL in (6), because Vth + DIBL
represents the threshold voltage at linear operation.
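The substitution described above can be sketched in code. This is an illustration only, not the authors' implementation: the functions follow the form of (6) in SI units, the names `off_current` and `transfer_leakage_worst_case` are ours, and the device parameters are hypothetical. In particular, taking the Vth shift as the DIBL coefficient (80 mV/V) multiplied by Vdd is our assumption about how the term "DIBL" is converted into volts.

```python
import math

def off_current(mu_e, cox, wg, lg, vth, m=1.3, v_thermal=0.0259):
    """Off current of a single MOSFET following the form of (6), in SI units."""
    prefactor = mu_e * cox * (wg / lg) * (m - 1.0) * v_thermal ** 2
    return prefactor * math.exp(-vth / (m * v_thermal))

def transfer_leakage_worst_case(mu_e, cox, wg, lg, vth, dibl_shift, **kw):
    """Worst-case transfer leakage: (6) evaluated with Vth replaced by Vth + DIBL."""
    return off_current(mu_e, cox, wg, lg, vth + dibl_shift, **kw)

# Hypothetical device: mobility 0.03 m^2/Vs, Cox 0.0288 F/m^2, W = 100 nm,
# L = 37 nm, Vth = 0.4 V (defined at Vds = Vdd), Vdd = 1.2 V.
VDD = 1.2
DIBL_COEFF = 0.080                      # 80 mV/V, from the text
shift = DIBL_COEFF * VDD                # assumed conversion of DIBL into volts

i_sat_bias = off_current(0.03, 0.0288, 100e-9, 37e-9, 0.4)
i_lin_bias = transfer_leakage_worst_case(0.03, 0.0288, 100e-9, 37e-9, 0.4, shift)
ratio = i_lin_bias / i_sat_bias         # equals exp(-shift / (m * kT/q))
```

With these placeholder numbers the transfer-device leakage at a low drain bias is more than an order of magnitude below the leakage at Vds = Vdd, which shows how sensitive the bit-line leakage estimate is to the bias condition chosen.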
The formulation of static noise margin (SNM) has been shown by many groups
[17]–[19]. We estimate the SNM by calculating the node voltage at the read
cycle (VOL) and the turn-on voltage at which the state transition begins (VOH)
[18]. SNM is equivalent to the voltage difference between VOL and VOH. In
order to estimate the impact of mismatch on SNM, Monte Carlo simulation is
performed in SPICE. Here, mismatch is introduced from (5). Fig. 3 shows the
sensitivity of the variation of SNM (y-axis) to mismatch (x-axis). Mismatch of
the driver transistors affects the SNM variation in the 6-T SRAM the most. The
relationship between the SNM variation and transistor mismatch is taken into
account by defining the worst noise margin as the lower 6σ value of SNM in the
Gaussian distribution.
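The lower 6σ value is simply the sample mean minus six sample standard deviations. A minimal sketch follows, assuming the SNM samples are available as a plain list; the function name `worst_case_snm` is ours and the five sample values are made up for illustration, whereas in practice they would come from Monte Carlo SPICE runs.

```python
import statistics

def worst_case_snm(snm_samples, n_sigma=6.0):
    """Lower n-sigma SNM, assuming the samples are Gaussian distributed."""
    mean = statistics.mean(snm_samples)
    sigma = statistics.stdev(snm_samples)   # sample standard deviation
    return mean - n_sigma * sigma

# Made-up SNM values in volts, for illustration only.
samples = [0.20, 0.22, 0.18, 0.21, 0.19]
snm_worst = worst_case_snm(samples)
```

A real run would use far more than five samples; the Gaussian assumption is what allows the 6σ tail to be extrapolated from a moderate number of simulations.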
Cell current at the read cycle is estimated by calculating the linear drain
current of the driver transistor with Vgs = Vdd and Vds = VOL. Four
constraints related to the SRAM operation are used in the optimization.
1) Bit leakage: In order to guarantee read operation, we ensure that the
total bit-line leakage, in the case when 512 b are connected together to
a single bit line, is less than the cell current by at least two orders
of magnitude.
2) Noise margin: We constrain the worst SNM to be larger than 10% of the
supply voltage, which is the standard tolerance required for the
fluctuation of the supply-voltage line.
3) Read speed: Read speed is ensured by constraining the cell current
normalized to the bit-line capacitance (since the capacitance will change
with the cell area) to be greater than the one corresponding to a 300-MHz
operation. We also include the effect of changing bit capacitance on the
dynamic power dissipation.
4) Write ability: We make sure that the optimizer uses the correct device
strengths for write ability by ensuring that the saturation current of
the transfer transistor at the slow corner is larger than that of the
load transistor at the fast corner.
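The four constraints above, together with the current models (1) and (2), can be written down compactly. The following is a sketch under stated assumptions rather than the formulation used in the optimizer: the function names, the SI-unit convention, the default body factor and saturation velocity, the corner Vth offsets, the bit-line capacitance, and the threshold `min_current_per_cap` (standing in for the 300-MHz requirement, whose numerical value is not given here) are all hypothetical.

```python
import math

def idlin(mu_e, cox, wg, lg, vg, vth, vds, m=1.3, vsat=1e5):
    """Linear-region drain current following the form of (1), in SI units."""
    num = mu_e * cox * (wg / lg) * ((vg - vth) * vds - 0.5 * m * vds ** 2)
    return num / (1.0 + mu_e * vds / (vsat * lg))

def idsat(mu_e, cox, wg, lg, vg, vth, m=1.3, vsat=1e5):
    """Saturation drain current following the form of (2), in SI units."""
    root = math.sqrt(1.0 + 2.0 * mu_e * (vg - vth) / (m * vsat * lg))
    return cox * wg * vsat * (vg - vth) * (root - 1.0) / (root + 1.0)

def check_constraints(i_cell, i_bitline_leak, snm_worst, vdd, c_bitline,
                      min_current_per_cap, idsat_transfer_slow, idsat_load_fast):
    """Evaluate the four operating constraints; True means satisfied."""
    return {
        "bit_leakage": i_bitline_leak <= i_cell / 100.0,   # two orders of magnitude
        "noise_margin": snm_worst > 0.10 * vdd,            # worst SNM > 10% of Vdd
        "read_speed": i_cell / c_bitline >= min_current_per_cap,
        "write_ability": idsat_transfer_slow > idsat_load_fast,
    }

# Hypothetical 65-nm-like numbers, for illustration only.
VDD, LG, COX = 1.2, 37e-9, 0.0288
i_cell = idlin(0.03, COX, 150e-9, LG, vg=VDD, vth=0.40, vds=0.10)   # Vds = VOL
i_transfer_slow = idsat(0.03, COX, 100e-9, LG, vg=VDD, vth=0.45)    # slow corner
i_load_fast = idsat(0.01, COX, 90e-9, LG, vg=VDD, vth=0.35)         # fast corner

result = check_constraints(i_cell, i_bitline_leak=5e-8, snm_worst=0.13, vdd=VDD,
                           c_bitline=50e-15, min_current_per_cap=1e9,
                           idsat_transfer_slow=i_transfer_slow,
                           idsat_load_fast=i_load_fast)
```

Returning one Boolean per constraint makes it easy to see which requirement limits a given design point. A gradient-based optimizer would instead receive each constraint as a signed margin (for example, `i_cell / 100 - i_bitline_leak`) so that it can tell how far a point is from the boundary.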


Fig. 4. Optimized power for SRAM as a function of the cell area in 65-nm
technology node.

Fig. 6. Standby power is broken down into each component. As cell area is
increased, the gate and junction leakages become dominant.

Fig. 7. Optimized power for SRAM as a function of cell area and read speed
in the 65-nm technology node.

Fig. 5. Optimized variables to achieve minimum power plotted against the
cell area. (a) Gate width. (b) Gate length. (c) Supply voltage. (d) Threshold
voltage.

D. Assumptions on Scaling
For scaling, we assume a 70% shrink of all design rules for each technology
generation. The optimizer is set to keep a constant read speed of 300 MHz with
scaling. Only the power consumed in a bit cell during the read-and-write cycle
is considered. We assume a constant wiring capacitance per unit length for a
minimum-width metal line as we scale technology.
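The arithmetic behind the 70% shrink assumption is worth spelling out, since it defines the ideal reference for area scaling. The short sketch below (the function name `ideal_scaling` is ours) tabulates the linear and area factors per generation starting from the 65-nm node.

```python
SHRINK = 0.70   # linear shrink of all design rules per generation (from the text)

def ideal_scaling(start_node_nm, generations):
    """Return (node, linear factor, area factor) per generation for an ideal shrink."""
    rows = []
    for n in range(generations + 1):
        linear = SHRINK ** n
        rows.append((start_node_nm * linear, linear, linear ** 2))
    return rows

table = ideal_scaling(65.0, 3)   # 65 -> about 45 -> about 32 -> about 22 nm
```

Each generation multiplies linear dimensions by 0.70 and area by 0.49, so three generations from 65 nm give roughly a 22-nm node with about 12% of the original area under ideal scaling.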
III. RESULTS AND DISCUSSIONS
Fig. 4 shows the optimized power plotted against the bit-cell area in the
65-nm technology node. There is a clear tradeoff between the cell size and
power. Fig. 5 shows the optimum values for Wg, Lg, Vdd, and Vth as a function
of cell area. The optimum Vdd is 1.1–1.3 V for all cell areas. In most memory
macros, except for very small memories like register files and local cache
subbanks, standby power dominates the total power due to the much lower
switching activity. The optimizer increases the gate width of the driver and
the threshold voltage of all the transistors as the cell size is increased.
Thus, appropriate cell read current is ensured by increasing the width of the
driver transistor as Vth is increased to suppress subthreshold leakage. Lg is
also increased to suppress the mismatch factors as cell area is increased. The
upper limits of Vth for the driver and transfer transistors are determined by
the cell-current criterion. The upper limit of Vth for the load transistor is
constrained by the SNM, which degrades with the increase of Vth because of
mismatch. The lower limit of Lg for all transistors is constrained by the SNM,
which is mainly limited by mismatch. A minimum width is selected for the load
transistor because of the write-ability constraint. The width of the transfer
transistor is set by the cell-current and SNM constraints. Fig. 6 summarizes
the breakdown of the optimized standby power as a function of cell area. In
smaller cells, off leakage dominates because Vth must be reduced to meet the
cell-current constraint with limited gate width. As the cell size and gate
width are increased, Vth can be increased, so gate leakage and junction
leakage dominate. This corresponds to the actual power obtained in the 65-nm
technology case provided by [9]. Fig. 7 shows the optimized power

Fig. 8. Optimized variables for minimizing standby power for SRAM from the
65- to 22-nm technology nodes. (a) Supply voltage. (b) Gate length. (c) Gate
width.

Fig. 10. Breakthrough items for SRAM scaling and estimated benefit in power
in the 22-nm technology node.

Fig. 9. Scaling trend of the cell area while keeping the same power per bit.

versus bit-cell area and read speed in the 65-nm technology node. There are
clear tradeoffs among cell size, speed, and power. As read speed is increased,
subthreshold leakage becomes dominant in the total power because Vth must be
lowered to increase the cell current. In the case of slower read operation,
the power is almost constant because Vth is high enough that the junction
leakage and the gate leakage are dominant. Fig. 8 shows the optimum values of
the variables that minimize the power from the 65- to the 22-nm technology
node. We can see clearly that scaling of Vdd and Lg is not beneficial in the
bulk-CMOS-technology case, and only Wg should be scaled if the power consumed
per bit is kept constant. Fig. 9 gives the optimized power plotted versus cell
area for a number of different technology nodes. The fixed Lg causes the cell
to scale slightly slower than 2× each generation, but the overheads are
initially small (2% and then 4%). The increase of cell size in the power
optimization is mainly caused by the Wg increase, which means that, by the
22-nm generation, further scaling will be limited by device widths. At this
point, some kind of 3-D channel structure will probably be needed. A 3-D
channel achieved by selective epitaxial growth, as shown in Fig. 10, is
studied as a promising candidate. We take into account the mobility change in
different orientations [20], [21]. Suppression of the gate leakage by applying
an ideal high-k material is also investigated. Fig. 10 indicates the impact of
the 3-D channel and of a gate dielectric without gate leakage on power and
cell area in the 22-nm technology node. It should be noted that the epichannel
decreases the cell size without a power penalty. The gate dielectric without
gate leakage greatly helps cells optimized for power (instead of area) but
will have only a small effect on those

small high-density memories where the subthreshold leakage dominates, as these
memories have their Vdd and Vth scaled with technology. This can also be
understood by checking the breakdown of total power, as shown in Fig. 6.
Fig. 11 gives the breakdown of the bit-cell area obtained in this optimization
as technology advances. If the chip size is ideally shrunk just by applying
the same shrink rate in the lateral and vertical directions, a 70% target is
needed to achieve an area shrink rate of 50% per generation. In the SRAM case,
however, this simple shrinking rule is not applicable because of the many
constraints and the power minimization, as shown. In the vertical direction,
the scaling ratio becomes worse than 70% as technology advances. This is due
to the nonscaling of Lg. The shrink rate in the lateral direction, in
contrast, is 72% from 65 to 22 nm, which is close to the ideal target (70%),
because Wg is scaled. By applying a 3-D structure, the shrink rate in the
lateral direction becomes lower than 70%. This helps to lighten the penalty in
the vertical direction caused by the nonscaling of Lg. Fig. 12 summarizes the
scaling trend for high-density and ultralow-power memories. Ultralow-power
memory, where the subthreshold leakage is minimized by sacrificing about 10%
area relative to the high-density version, offers a 60%–70% power reduction as
technology advances. In both cases, scaling of the supply voltage is almost
stopped. It should be noted that memories having a very small size, such as
high-speed cache banks, register files, or multiport SRAM, operate at high
switching activity. Hence, their scaling would follow the scaling trends of
the logic transistors.
IV. CONCLUSION
From our optimization methodology and models for 6-T
SRAM, we observe a clear tradeoff between the cell area


Fig. 11. Breakdown of the bit cell in the lateral and vertical directions.

Fig. 12. Obtained scaling roadmap for SRAM. An epitaxial channel forming a
3-D channel structure is beneficial for area scaling and is a promising
candidate. High-k will greatly help cells that have been optimized for power
but will have only a small effect on high-density memories where subthreshold
leakage dominates.

and power. Very low switching activity, device variability, and the
exponential dependence of leakage on Vth dictate that scaling of Lg and Vdd
should be paused, especially in large memory macros. Initially, this
discontinued channel-length scaling will have only a small effect on the
scaling of the memory cell size, as most of the size scaling will come from
the scaling of design rules until around 22 nm. Eventually, to allow the cells
to continue to scale with the transistor widths not scaling, some 3-D
technology will be needed. While the best option is hard to determine at this
time, an epichannel looks like a promising approach. The suppression of the
gate leakage helps to reduce the power in ultralow-power SRAM, where the
subthreshold leakage is adequately minimized with an appropriate increase of
cell area. In the SRAM region, high drive current in a small area and low
mismatch are important. This requirement is the same as the logic-transistor
requirement, so the changes in device architecture that maximize logic
performance are also advantageous for SRAM. In such a case, careful scaling of
Lg/Wg, Vdd, and Vth with consideration of power and other constraints remains
necessary, as shown in the bulk-technology case presented in this paper.
ACKNOWLEDGMENT
The authors would like to thank Prof. P. Wong at the Center
for Integrated Systems, Department of Electrical Engineering,
Stanford University, for his support in the brush-up of this
paper.


REFERENCES
[1] The International Technology Roadmap for Semiconductors, 2004.
[2] P. Bai, C. Auth, S. Balakrishnan, M. Bost, R. Brain, V. Chikarmane, R. Heussner, M. Hussein, J. Hwang, D. Ingerly, R. James, J. Jeong, C. Kenyon, E. Lee, S.-H. Lee, N. Lindert, M. Liu, Z. Ma, T. Marieb, A. Murthy, R. Nagisetty, S. Natarajan, J. Neirynck, A. Ott, C. Parker, J. Sebastian, R. Shaheed, S. Sivakumar, J. Steigerwald, S. Tyagi, C. Weber, B. Woolery, A. Yeoh, K. Zhang, and M. Bohr, "A 65 nm logic technology featuring 35 nm gate lengths, enhanced channel strain, 8 Cu interconnect layers, low-k ILD and 0.57 μm² SRAM cell," in IEDM Tech. Dig., 2004, pp. 657–660.
[3] Z. Luo, A. Steegen, M. Eller, M. Mann, C. Baiocco, P. Nguyen, L. Kim, M. Hoinkis, V. Ku, V. Klee, F. Jamin, P. Wrschka, P. Shafer, W. Lin, S. Fang, A. Ajmera, W. Tan, D. Park, R. Mo, J. Lian, D. Vietzke, C. Coppock, A. Vayshenker, T. Hook, V. Chan, K. Kim, A. Cowley, S. Kim, E. Kaltalioglu, B. Zhang, S. Marokkey, Y. Lin, K. Lee, H. Zhu, M. Weybright, R. Rengarajan, J. Ku, T. Schiml, J. Sudijono, I. Yang, and C. Wann, "High performance and low power transistors integrated in 65 nm bulk CMOS technology," in IEDM Tech. Dig., 2004, pp. 661–664.
[4] A. Chatterjee, J. Yoon, S. Zhao, S. Tang, K. Sadra, S. Crank, H. Mogul, R. Aggarwal, B. Chatterjee, S. Lytle, C. T. Lin, K. D. Lee, J. Kim, Q. Z. Hong, T. Kim, L. Olsen, M. Quevedo-Lopez, K. Kirmse, G. Zhang, C. Meek, D. Aldrich, H. Mair, M. Mehrotra, L. Adam, D. Mosher, J. Y. Yang, D. Crenshaw, B. Williams, J. Jacobs, M. Jain, J. Rosal, T. Houston, J. Wu, N. S. Nagaraj, D. Scott, S. Ashburn, and A. Tsao, "A 65 nm CMOS technology for mobile and digital signal processing applications," in IEDM Tech. Dig., 2004, pp. 665–668.
[5] E. Morifuji, T. Yoshida, H. Tsuno, Y. Kikuchi, S. Matsuda, S. Yamada, T. Noguchi, and M. Kakumu, "New guideline of Vdd and Vth scaling for 65 nm technology and beyond," in Proc. Symp. VLSI Dig. Tech. Papers, 2004, pp. 164–165.
[6] N. Yanagiya, S. Matsuda, S. Inaba, M. Takayanagi, I. Mizushima, K. Ohuchi, K. Okano, K. Takahasi, E. Morifuji, M. Kanda, Y. Matsubara, M. Habu, M. Nishigoori, K. Honda, H. Tsuno, K. Yasumoto, T. Yamamoto, K. Hiyama, K. Kokubun, T. Suzuki, J. Yoshikawa, T. Sakurai, T. Ishizuka, Y. Shoda, M. Moriuchi, M. Kishida, H. Matsumori, H. Harakawa, H. Oyamatsu, N. Nagashima, S. Yamada, T. Noguchi, H. Okamoto, and M. Kakumu, "65 nm CMOS technology (CMOS5) with high density embedded memories for broadband microprocessor applications," in IEDM Tech. Dig., 2002, pp. 57–60.
[7] M. Kanda, E. Morifuji, M. Nishigoori, Y. Fujimoto, M. Uematsu, K. Takahashi, H. Tsuno, K. Okano, S. Matsuda, H. Oyamatsu, H. Takahashi, N. Nagashima, S. Yamada, T. Noguchi, Y. Okamoto, and M. Kakumu, "Highly stable 65 nm node (CMOS5) 0.56 μm² SRAM cell design for very low operation voltage," in Proc. Symp. VLSI Dig. Tech. Papers, 2003, pp. 13–14.
[8] E. Morifuji, M. Kanda, N. Yanagiya, S. Matsuda, S. Inaba, K. Okano, K. Takahashi, M. Nishigori, H. Tsuno, T. Yamamoto, K. Hiyama, M. Takayanagi, H. Oyamatsu, S. Yamada, T. Noguchi, and M. Kakumu, "High performance 30 nm bulk CMOS for 65 nm technology node (CMOS5)," in IEDM Tech. Dig., 2002, pp. 655–658.
[9] K. Utsumi, E. Morifuji, M. Kanda, S. Aota, T. Yoshida, K. Honda, Y. Matsubara, S. Yamada, and F. Matsuoka, "A 65 nm low power CMOS platform with 0.495 μm² SRAM for digital processing and mobile applications," in Proc. Symp. VLSI Dig. Tech. Papers, 2005, pp. 216–217.
[10] Y. Taur and T. H. Ning, Fundamentals of Modern VLSI Devices. Cambridge, U.K.: Cambridge Univ. Press, 1998.
[11] K. Chen, C. Hu, P. Fang, M. R. Lin, and D. L. Wollesen, "Predicting CMOS speed with gate oxide and voltage scaling and interconnect loading effects," IEEE Trans. Electron Devices, vol. 44, no. 11, pp. 1951–1957, Nov. 1997.
[12] H. Yoshimura, T. Nakayama, M. Nishigohri, M. Inohara, K. Miyashita, E. Morifuji, A. Oishi, H. Kawashima, M. Habu, H. Koike, H. Takato, Y. Toyoshima, and H. Ishiuchi, "A CMOS technology platform for 0.13 μm generation SOC (system on a chip)," in Proc. Symp. VLSI Dig. Tech. Papers, 2000, pp. 144–145.
[13] A. Oishi, R. Hasumi, Y. Okayama, K. Miyashita, M. Oowada, S. Aota, T. Nakayama, M. Matsumoto, N. Inada, T. Hiraoka, H. Yoshimura, Y. Asahi, Y. Takegawa, T. Yoshida, K. Sunouchi, A. Yasumoto, Y. Tateshita, M. Ueshima, T. Morikawa, T. Umebayashi, T. Gocho, F. Matsuoka, T. Noguchi, and M. Kakumu, "MOSFET design of 100 nm node low standby power CMOS technology compatible with embedded trench DRAM and analog devices," in IEDM Tech. Dig., 2001, pp. 507–510.
[14] M. Iwai, A. Oishi, T. Sanuki, Y. Takegawa, T. Komoda, Y. Morimasa, K. Ishimaru, M. Takayanagi, K. Eguchi, D. Matsushita, K. Muraoka, K. Sunouchi, and T. Noguch, "45 nm CMOS platform technology (CMOS6) with high density embedded memories," in Proc. Symp. VLSI Dig. Tech. Papers, 2004, pp. 12–13.
[15] D. J. Frank, R. H. Dennard, E. Nowak, P. M. Solomon, Y. Taur, and H. S. P. Wong, "Device scaling limits of Si MOSFETs and their application dependencies," Proc. IEEE, vol. 89, no. 3, pp. 259–288, Mar. 2001.
[16] A. Asenov, A. R. Brown, J. H. Davies, S. Kaya, and G. Slavcheva, "Simulations of intrinsic parameter fluctuations in decananometer and nanometer-scale MOSFETs," IEEE Trans. Electron Devices, vol. 50, no. 9, pp. 1837–1852, Sep. 2003.
[17] E. Seevinck, F. J. List, and J. Lohstroh, "Static-noise margin analysis of MOS SRAM cells," IEEE J. Solid-State Circuits, vol. 22, no. 5, pp. 748–754, Oct. 1987.
[18] T. Sakurai, "High-speed circuit design with scaled-down MOSFETs and low supply voltage," in Proc. IEEE Int. Symp. Circuits Syst., May 1993, pp. 1487–1490.
[19] A. J. Bhavnagarrwala, X. Tang, and J. D. Meindl, "The impact of intrinsic device fluctuations on CMOS SRAM cell stability," IEEE J. Solid-State Circuits, vol. 36, no. 4, pp. 658–665, Apr. 2001.
[20] L. Chang, Y.-K. Choi, D. Ha, P. Ranade, S. Xiong, J. Bokor, C. Hu, and T.-J. King, "Extremely scaled silicon nano-CMOS devices," Proc. IEEE, vol. 91, no. 11, pp. 1860–1873, Nov. 2003.
[21] M. Yang, E. P. Gusev, M. Ieong, O. Gluschenkov, D. C. Boyd, K. K. Chan, P. M. Kozlowski, C. P. D'Emic, R. M. Sicina, P. C. Jamison, and A. I. Chou, "Performance dependence of CMOS on silicon substrate orientation for ultrathin oxynitride and HfO2 gate dielectrics," IEEE Electron Devices Lett., vol. 24, no. 5, pp. 339–341, May 2003.

Eiji Morifuji (M'98–A'01–M'03) received the B.S. and M.S. degrees in
electrical engineering from the University of Tokyo, Tokyo, Japan.
In 1995, he joined the Research and Development Center, Toshiba Company,
Kanagawa, Japan, where he was engaged in the research of advanced CMOS and RF
devices. He moved to Semiconductor Company, Toshiba Company, and has been
working on the research and development of CMOS system large-scale integration
(LSI). From 2005 to 2006, he spent one year at Stanford University, where he
worked on the scaling and low-power optimization of CMOS. His current research
includes CMOS system LSI, yield management, and analog CMOS.

Dinesh Patil (S'01) received the B.Tech. degree in electrical engineering
from the Indian Institute of Technology, Mumbai, India, in 2001 and the M.S.
degree in electrical engineering from Stanford University, Stanford, CA, in
2004. He is currently working toward the Ph.D. degree in electrical
engineering at Stanford University.
He worked at the IBM T.J. Watson Research Center, Yorktown Heights, NY, in the
summer of 2003, on the effects of variations on digital circuit performance
and at Barcelona Design, in 2004, on designing a computer-aided design tool
for the design of robust circuits. He is currently with the Center for
Integrated Systems, Department of Electrical Engineering, Stanford University.
His research interests include energy-efficient digital circuit design,
circuit-optimization techniques, and device optimization.
Mr. Patil is the recipient of the Best Paper Award at ISQED 2005.


Mark Horowitz (S'77–M'78–SM'95–F'00) received the B.S. and M.S. degrees in
electrical engineering from Massachusetts Institute of Technology, Cambridge,
MA, in 1978 and the Ph.D. degree from Stanford University, Stanford, CA, in
1984.
In 1990, he took leave from Stanford to help start Rambus Inc., a company
designing high-bandwidth memory interface technology. He is currently the
Yahoo Founders Professor of the School of Engineering, Stanford University.
His research area is digital-system design, and he has led a number of
processor designs including Microprocessor without Interlocked Pipeline Stages
(MIPS-X), one of the first processors to include an on-chip instruction cache;
TORCH, a statically scheduled superscalar processor that supported speculative
execution; and FLASH, a flexible Distributed Shared Memory (DSM) machine. He
has also worked in a number of other chip-design areas including high-speed
and low-power memory design, high-bandwidth interfaces, and fast floating
point. His current research includes multiprocessor design, low-power
circuits, high-speed links, and new graphical interfaces.
Dr. Horowitz is the recipient of the 1985 Presidential Young Investigator
Award, the 1993 ISSCC Best Paper Award, and the ISCA 2004 Most Influential
Paper of 1989, and the 2006 winner of the IEEE Donald Pederson Award in Solid
State Circuits. He is a Fellow of the Association for Computing Machinery.


Yoshio Nishi (SM'82–F'88) received the B.S. degree in material science from
Waseda University, Tokyo, Japan, and the Ph.D. degree in electronics
engineering from the University of Tokyo, Tokyo, Japan.
He joined Toshiba R&D in the areas of research for semiconductor-device
physics and interfaces, mostly in silicon, resulting in the discovery of the
ESR PB center at the SiO2–Si interface, the first 256-b MNOS nonvolatile RAM,
an SOS 16-b microprocessor, and the world's first 1-Mb CMOS DRAM. In 1986, he
moved to Hewlett-Packard as the Director of the Silicon Process Laboratory,
followed by establishing the ULSI Research Laboratory as the Founding
Director. In 1995, he joined Texas Instruments Incorporated as the Senior Vice
President and Director of research and development for a semiconductor group,
and implemented a new R&D model for silicon-technology development, followed
by establishing the Kilby Center. In May 2002, he became a faculty member of
Stanford University, Stanford, CA, where his research interests cover
nanoelectronic devices and materials including metal gate/high-k MOS, device
layer transfer for 3-D integration, nanowire devices, and resistance-change
nonvolatile memory materials and devices. Since May 2002, he has been a
Professor with the Department of Electrical Engineering (research) and also
with the Department of Material Science and Engineering, Stanford University.
He also serves as Director with the Stanford Nanofabrication Facility of the
National Nanotechnology Infrastructure Network, and Director of Research with
the Center for Integrated Systems. He has published more than 200 papers,
including conference proceedings, and coauthored/edited nine books. He is the
holder of more than 70 patents in the U.S. and Japan.
Dr. Nishi served the Semiconductor Research Corporation and International
Sematech as a board member, the NNI Panel, the MARCO Governing Council, etc.,
in the period of 1995–2002. Currently, he serves the Science Council of Japan
as an affiliated member. He is a member of the Japan Society of Applied
Physics and the Electrochemical Society. He is the recipient of recent awards,
which include the 1995 IEEE Jack Morton Award and the 2002 IEEE Robert Noyce
Medal.
