4, APRIL 2007
I. INTRODUCTION
(2)
(5)
as derived in [16].
The off current in a single MOSFET (Io), as a function of Vth, Wg, and Lg, is expressed as
$$ I_o = \mu_e C_{ox} \frac{W_g}{L_g} (m - 1) \left( \frac{kT}{q} \right)^2 e^{-qV_{th}/mkT}. \tag{6} $$
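As a minimal sketch of how (6) can be evaluated numerically, the function below computes the off current of one transistor. The function name, the SI units, and the default values m = 1.3 and T = 300 K are assumptions made for illustration; they are not taken from the text.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def off_current(mu_e, c_ox, w_g, l_g, v_th, m=1.3, temp_k=300.0):
    """Subthreshold off current following the form of Eq. (6), in amperes.

    mu_e: mobility in m^2/(V*s); c_ox: oxide capacitance in F/m^2;
    w_g, l_g: gate width and length in m; v_th: threshold voltage in V.
    m (body-effect coefficient) and temp_k are assumed defaults.
    """
    v_t = K_B * temp_k / Q_E  # thermal voltage kT/q, in volts
    prefactor = mu_e * c_ox * (w_g / l_g) * (m - 1.0) * v_t ** 2
    return prefactor * math.exp(-v_th / (m * v_t))


# Illustrative device values only (hypothetical, not from the paper).
i_low_vth = off_current(0.03, 0.017, 100e-9, 50e-9, v_th=0.30)
i_high_vth = off_current(0.03, 0.017, 100e-9, 50e-9, v_th=0.40)
print(f"leakage ratio for +100 mV Vth: {i_high_vth / i_low_vth:.3f}")
```

The exponential term is what makes leakage so sensitive to Vth: with the assumed m = 1.3 at 300 K, the current changes by one decade for roughly every 77 mV of threshold shift, while Wg and Lg only enter linearly.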
The total Io is a sum over all the SRAM cells. With
Gaussian Vth , Io has a lognormal distribution. Assuming
weak correlation between neighboring cells, we can estimate
the average leakage by summing all the lognormal distributions and taking the average. The Vth mismatch obtained from
(5) is used in order to correctly represent the off current in the
chip. The probability-density function of Io is given as
$$ f(I_o) = \frac{1}{x \sigma \sqrt{2\pi}} \, 10^{-(\log(I_{off}) - \mu)^2 / 2\sigma^2} $$
where μ and σ are the mean and standard deviation, respectively, of log(Io), which is proportional to the random variable Vth. The average off current E(Io) is then given by
$$ E(I_o) = 10^{(\mu + \sigma^2/2)}. \tag{7} $$
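The effect of averaging a lognormal leakage distribution can be illustrated with a short Monte Carlo sketch. It works in natural logarithms, for which the mean of a lognormal variable is exp(μ + σ²/2); this convention, the nominal current, the Vth spread, and the values of m and kT/q are all assumptions chosen for illustration rather than values from the text.

```python
import math
import random


def mean_leakage_closed_form(mu_ln, sigma_ln):
    """Mean of a lognormal variable whose natural log is N(mu_ln, sigma_ln^2)."""
    return math.exp(mu_ln + 0.5 * sigma_ln ** 2)


def mean_leakage_monte_carlo(i_nominal, sigma_vth, m=1.3, v_t=0.02585,
                             n_cells=100_000, seed=0):
    """Average off current over cells with independent Gaussian Vth shifts."""
    rng = random.Random(seed)
    slope = 1.0 / (m * v_t)  # d(ln Io)/d(Vth) magnitude, per Eq. (6)
    total = 0.0
    for _ in range(n_cells):
        dvth = rng.gauss(0.0, sigma_vth)          # Vth mismatch of one cell
        total += i_nominal * math.exp(-slope * dvth)
    return total / n_cells


i_nom = 1e-10        # hypothetical nominal off current, A
sigma_vth = 0.03     # hypothetical Vth standard deviation, V
sigma_ln = sigma_vth / (1.3 * 0.02585)

closed = mean_leakage_closed_form(math.log(i_nom), sigma_ln)
sampled = mean_leakage_monte_carlo(i_nom, sigma_vth)
print(f"nominal {i_nom:.3e} A, closed form {closed:.3e} A, sampled {sampled:.3e} A")
```

Both estimates come out above the nominal current, which is the reason the mismatch-induced spread in Vth has to be included when estimating chip-level leakage: low-Vth cells contribute disproportionately to the sum.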
Fig. 1. Configuration of 6-Tr SRAM and SRAM layout studied in this paper. Cell area for SRAM can be expressed as a function of Lg and Wg for each Tr and
design rules.
Fig. 2. Setup for investigating the worst transfer leakage in SRAM. The
right-hand side shows the leakage and bit-line voltage as a function of the
number of bits in the "1" state on a single bit line. Leakage depends on the
ratio between the number of nodes in the "1" and "0" states.
most. The relationship between the SNM variation and transistor mismatch is taken into account by defining the worst
noise margin as the lower 6σ value of SNM in the Gaussian
distribution.
Cell current in the read cycle is estimated by calculating the
linear drain current of the driver transistor with Vgs = Vdd and
Vds = VOL. Four constraints related to the SRAM operation
are used in the optimization.
1) Bit leakage: To guarantee read operation, we require that
the total bit-line leakage, when 512 b are connected to a
single bit line, be less than the cell current by at least
two orders of magnitude.
2) Noise margin: We constrain the worst SNM to be larger
than 10% of the supply voltage, which is the standard tolerance required for the fluctuation of the supply-voltage
line.
3) Read speed: Read speed is ensured by constraining the
cell current normalized to the bit-line capacitance (since
the capacitance will change with the cell area) to be
greater than the value corresponding to 300-MHz operation. We also include the effect of changing bit-line capacitance on the dynamic power dissipation.
4) Write ability: We make sure that the optimizer uses the
correct device strengths for write ability by ensuring that
the saturation current of the transfer transistor at the slow
corner is larger than that of the load transistor at the fast corner.
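The four constraints above can be sketched as a simple feasibility check on one candidate design point. The class, field, and function names are hypothetical, the numeric inputs are illustrative only, and `min_slew` stands in for the current-per-capacitance value corresponding to 300-MHz operation, which the text does not state.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CellOperatingPoint:
    """Hypothetical container for one candidate design (names are illustrative)."""
    v_dd: float             # supply voltage, V
    i_cell: float           # read cell current, A
    i_leak_per_cell: float  # bit-line leakage of one unselected cell, A
    snm_mean: float         # mean static noise margin, V
    snm_sigma: float        # SNM standard deviation due to mismatch, V
    c_bitline: float        # bit-line capacitance, F
    i_transfer_slow: float  # transfer saturation current, slow corner, A
    i_load_fast: float      # load saturation current, fast corner, A


def check_constraints(p, min_slew, cells_per_bitline=512):
    """Return a pass/fail flag for each of the four constraints.

    min_slew (V/s) is an assumed threshold for i_cell / c_bitline.
    """
    worst_snm = p.snm_mean - 6.0 * p.snm_sigma  # lower 6-sigma SNM
    return {
        # Total bit-line leakage at least two orders below the cell current.
        "bit_leakage": cells_per_bitline * p.i_leak_per_cell * 100.0 <= p.i_cell,
        # Worst-case SNM larger than 10% of the supply voltage.
        "noise_margin": worst_snm > 0.10 * p.v_dd,
        # Cell current normalized to bit-line capacitance.
        "read_speed": p.i_cell / p.c_bitline >= min_slew,
        # Slow-corner transfer device stronger than fast-corner load device.
        "write_ability": p.i_transfer_slow > p.i_load_fast,
    }


# Illustrative numbers only, not taken from the paper.
candidate = CellOperatingPoint(
    v_dd=1.2, i_cell=30e-6, i_leak_per_cell=10e-12,
    snm_mean=0.25, snm_sigma=0.02, c_bitline=100e-15,
    i_transfer_slow=40e-6, i_load_fast=20e-6,
)
result = check_constraints(candidate, min_slew=1e8)
noisy = check_constraints(replace(candidate, snm_sigma=0.03), min_slew=1e8)
print(result)
print(noisy)
```

In an optimizer these flags would typically become inequality constraints rather than booleans; the second call shows how a larger mismatch spread alone can push a design out of the feasible region through the noise-margin check.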
Fig. 4. Optimized power for SRAM as a function of the cell area in the 65-nm
technology node.
Fig. 6. Standby power is broken down into each component. As cell area is
increased, the gate and junction leakages become dominant.
Fig. 7. Optimized power for SRAM as a function of cell area and read speed
in the 65-nm technology node.
Fig. 5. Optimized variables to achieve minimum power plotted against the
cell area. (a) Gate width. (b) Gate length. (c) Supply voltage. (d) Threshold
voltage.
D. Assumptions on Scaling
For scaling, we assume a 70% shrink (a 0.7× linear scaling factor) of all design rules
for each technology generation. The optimizer is set to keep a
constant read speed of 300 MHz with scaling. Only the power consumed
within a bit cell during the read and write cycles is considered.
We assume a constant wiring capacitance per unit length for a minimum-width metal line as the technology is scaled.
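The scaling assumption can be made concrete with a short calculation. The sketch below applies a 0.7 linear factor per generation starting from 65 nm; the function name is hypothetical, and treating the bit-line wire capacitance per cell as proportional to the linear factor is an inference from the constant capacitance-per-length assumption, not a statement from the text.

```python
def scale_design_rule(feature_nm, generations, shrink=0.7):
    """Return (scaled feature size in nm, relative cell-area factor).

    Each generation multiplies every linear dimension by `shrink`,
    so area scales with the square of the linear factor.
    """
    linear = shrink ** generations
    return feature_nm * linear, linear ** 2


for gen in range(4):
    length_nm, area_factor = scale_design_rule(65.0, gen)
    # With constant capacitance per unit length, wire capacitance per cell
    # is assumed to follow the linear factor (an inference, see lead-in).
    wire_cap_factor = 0.7 ** gen
    print(f"generation {gen}: {length_nm:5.1f} nm, "
          f"area x{area_factor:.3f}, wire cap x{wire_cap_factor:.3f}")
```

Three generations of 0.7 scaling take 65 nm to roughly 22 nm, matching the node range discussed in the results, while the cell area drops to about 12% of its starting value if all design rules shrink together.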
III. RESULTS AND DISCUSSIONS
Fig. 4 shows the optimized power plotted against the bit-cell area in the 65-nm technology node. There is a clear tradeoff
between the cell size and power. Fig. 5 shows the optimum
values for Wg, Lg, Vdd, and Vth as a function of cell area.
The optimum Vdd is 1.1–1.3 V for all cell areas. In most memory
macros, except for very small memories like register files and
local cache subbanks, standby power dominates the total power
due to much lower switching activity. The optimizer increases the gate
width of the driver and the threshold voltage of all the transistors
as the cell size is increased. Thus, appropriate cell read current
is ensured by increasing the width of the driver transistor as
Vth is increased to suppress subthreshold leakage. Lg is also
increased to suppress the mismatch factors as cell area is
increased. The upper limits of Vth for the driver and transfer
transistors are determined by the cell-current criterion. The
upper limit of Vth for the load transistor is constrained by the SNM,
which degrades with the increase of Vth because of mismatch.
The lower limit of Lg for all transistors is constrained by the
SNM, which is mainly constrained by mismatch. A minimum
width is selected for the load transistor because of the write-ability constraint. The width of the transfer transistor is set by
the cell-current and SNM constraints. Fig. 6 summarizes the
breakdown of the optimized standby power as a function of cell
area. In smaller cells, off leakage dominates because Vth must
be reduced to meet the cell-current constraint with limited gate
width. As the cell size and gate width are increased, Vth can be
increased, so gate leakage and junction leakage dominate. This
corresponds to the actual power obtained in the 65-nm technology case provided by [9]. Fig. 7 shows the optimized power
Fig. 8. Optimized variables for minimizing standby power for SRAM from 65- to 22-nm technology nodes. (a) Supply voltage. (b) Gate length. (c) Gate width.
Fig. 10. Breakthrough items for SRAM scaling and estimated benefit in power
in 22-nm technology node.
Fig. 9. Scaling trend of the cell area while keeping the same power per bit.
Fig. 11. Breakdown of the bit cell in the lateral and vertical directions.
Fig. 12. Obtained scaling roadmap for SRAM. An epitaxial channel forming a 3-D channel structure is beneficial for area scaling and is a promising candidate.
High-k will greatly help cells that have been optimized for power but will have a small effect on high-density memories where subthreshold leakage
dominates.
REFERENCES
[1] The International Technology Roadmap for Semiconductors, 2004.
[2] P. Bai, C. Auth, S. Balakrishnan, M. Bost, R. Brain, V. Chikarmane,
R. Heussner, M. Hussein, J. Hwang, D. Ingerly, R. James, J. Jeong,
C. Kenyon, E. Lee, S.-H. Lee, N. Lindert, M. Liu, Z. Ma, T. Marieb,
A. Murthy, R. Nagisetty, S. Natarajan, J. Neirynck, A. Ott, C. Parker,
J. Sebastian, R. Shaheed, S. Sivakumar, J. Steigerwald, S. Tyagi,
C. Weber, B. Woolery, A. Yeoh, K. Zhang, and M. Bohr, "A 65 nm logic
technology featuring 35 nm gate lengths, enhanced channel strain, 8 Cu
interconnect layers, low-k ILD and 0.57 μm² SRAM cell," in IEDM Tech.
Dig., 2004, pp. 657–660.
[3] Z. Luo, A. Steegen, M. Eller, M. Mann, C. Baiocco, P. Nguyen, L. Kim,
M. Hoinkis, V. Ku, V. Klee, F. Jamin, P. Wrschka, P. Shafer, W. Lin,
S. Fang, A. Ajmera, W. Tan, D. Park, R. Mo, J. Lian, D. Vietzke,
C. Coppock, A. Vayshenker, T. Hook, V. Chan, K. Kim, A. Cowley,
S. Kim, E. Kaltalioglu, B. Zhang, S. Marokkey, Y. Lin, K. Lee, H. Zhu,
M. Weybright, R. Rengarajan, J. Ku, T. Schiml, J. Sudijono, I. Yang,
and C. Wann, "High performance and low power transistors integrated in
65 nm bulk CMOS technology," in IEDM Tech. Dig., 2004, pp. 661–664.
[4] A. Chatterjee, J. Yoon, S. Zhao, S. Tang, K. Sadra, S. Crank, H. Mogul,
R. Aggarwal, B. Chatterjee, S. Lytle, C. T. Lin, K. D. Lee, J. Kim,
Q. Z. Hong, T. Kim, L. Olsen, M. Quevedo-Lopez, K. Kirmse, G. Zhang,
C. Meek, D. Aldrich, H. Mair, M. Mehrotra, L. Adam, D. Mosher,
J. Y. Yang, D. Crenshaw, B. Williams, J. Jacobs, M. Jain, J. Rosal,
T. Houston, J. Wu, N. S. Nagaraj, D. Scott, S. Ashburn, and A. Tsao,
"A 65 nm CMOS technology for mobile and digital signal processing
applications," in IEDM Tech. Dig., 2004, pp. 665–668.
[5] E. Morifuji, T. Yoshida, H. Tsuno, Y. Kikuchi, S. Matsuda, S. Yamada,
T. Noguchi, and M. Kakumu, "New guideline of Vdd and Vth scaling for
65 nm technology and beyond," in Proc. Symp. VLSI Dig. Tech. Papers,
2004, pp. 164–165.
[6] N. Yanagiya, S. Matsuda, S. Inaba, M. Takayanagi, I. Mizushima,
K. Ohuchi, K. Okano, K. Takahasi, E. Morifuji, M. Kanda,
Y. Matsubara, M. Habu, M. Nishigoori, K. Honda, H. Tsuno,
K. Yasumoto, T. Yamamoto, K. Hiyama, K. Kokubun, T. Suzuki,
J. Yoshikawa, T. Sakurai, T. Ishizuka, Y. Shoda, M. Moriuchi, M. Kishida,
H. Matsumori, H. Harakawa, H. Oyamatsu, N. Nagashima, S. Yamada,
T. Noguchi, H. Okamoto, and M. Kakumu, "65 nm CMOS technology
(CMOS5) with high density embedded memories for broadband microprocessor applications," in IEDM Tech. Dig., 2002, pp. 57–60.
[7] M. Kanda, E. Morifuji, M. Nishigoori, Y. Fujimoto, M. Uematsu,
K. Takahashi, H. Tsuno, K. Okano, S. Matsuda, H. Oyamatsu,
H. Takahashi, N. Nagashima, S. Yamada, T. Noguchi, Y. Okamoto, and
M. Kakumu, "Highly stable 65 nm node (CMOS5) 0.56 μm² SRAM cell
design for very low operation voltage," in Proc. Symp. VLSI Dig. Tech.
Papers, 2003, pp. 13–14.
[8] E. Morifuji, M. Kanda, N. Yanagiya, S. Matsuda, S. Inaba, K. Okano,
K. Takahashi, M. Nishigori, H. Tsuno, T. Yamamoto, K. Hiyama,
M. Takayanagi, H. Oyamatsu, S. Yamada, T. Noguchi, and M. Kakumu,
"High performance 30 nm bulk CMOS for 65 nm technology node
(CMOS5)," in IEDM Tech. Dig., 2002, pp. 655–658.
[9] K. Utsumi, E. Morifuji, M. Kanda, S. Aota, T. Yoshida, K. Honda,
Y. Matsubara, S. Yamada, and F. Matsuoka, "A 65 nm low power CMOS
platform with 0.495 μm² SRAM for digital processing and mobile applications," in Proc. Symp. VLSI Dig. Tech. Papers, 2005, pp. 216–217.
[10] Y. Taur and T. H. Ning, Fundamentals of Modern VLSI Devices.
Cambridge, U.K.: Cambridge Univ. Press, 1998.
[11] K. Chen, C. Hu, P. Fang, M. R. Lin, and D. L. Wollesen, "Predicting
CMOS speed with gate oxide and voltage scaling and interconnect loading
effects," IEEE Trans. Electron Devices, vol. 44, no. 11, pp. 1951–1957,
Nov. 1997.
[12] H. Yoshimura, T. Nakayama, M. Nishigohri, M. Inohara, K. Miyashita,
E. Morifuji, A. Oishi, H. Kawashima, M. Habu, H. Koike, H. Takato,
Y. Toyoshima, and H. Ishiuchi, "A CMOS technology platform for
0.13 μm generation SOC (system on a chip)," in Proc. Symp. VLSI Dig.
Tech. Papers, 2000, pp. 144–145.
[13] A. Oishi, R. Hasumi, Y. Okayama, K. Miyashita, M. Oowada, S. Aota,
T. Nakayama, M. Matsumoto, N. Inada, T. Hiraoka, H. Yoshimura,
Y. Asahi, Y. Takegawa, T. Yoshida, K. Sunouchi, A. Yasumoto,
Y. Tateshita, M. Ueshima, T. Morikawa, T. Umebayashi, T. Gocho,
F. Matsuoka, T. Noguchi, and M. Kakumu, "MOSFET design of
100 nm node low standby power CMOS technology compatible with
embedded trench DRAM and analog devices," in IEDM Tech. Dig., 2001,
pp. 507–510.
[14] M. Iwai, A. Oishi, T. Sanuki, Y. Takegawa, T. Komoda, Y. Morimasa,
K. Ishimaru, M. Takayanagi, K. Eguchi, D. Matsushita, K. Muraoka,
[15]
[16]
[17]
[18]
[19]
[20]
[21]
Mark Horowitz (S'77–M'78–SM'95–F'00) received the B.S. and M.S. degrees in electrical engineering from the Massachusetts Institute of Technology,
Cambridge, MA, in 1978 and the Ph.D. degree from
Stanford University, Stanford, CA, in 1984.
In 1990, he took leave from Stanford to help start
Rambus Inc., a company designing high-bandwidth memory interface technology. He is currently the
Yahoo Founders Professor of the School of Engineering, Stanford University. His research area is
in digital-system design, and he has led a number
of processor designs including Microprocessor without Interlocked Pipeline
Stages (MIPS-X), one of the first processors to include an on-chip instruction
cache, TORCH, a statically scheduled superscalar processor that supported
speculative execution, and FLASH, a flexible Distributed Shared Memory
(DSM) machine. He has also worked in a number of other chip-design areas including high-speed and low-power memory design, high-bandwidth interfaces,
and fast floating point. His current research includes multiprocessor design,
low-power circuits, high-speed links, and new graphical interfaces.
Dr. Horowitz is the recipient of the 1985 Presidential Young Investigator
Award, the 1993 ISSCC Best Paper Award, the ISCA 2004 Most Influential
Paper of 1989, and the 2006 IEEE Donald Pederson Award
in Solid State Circuits. He is a Fellow of the Association for Computing
Machinery.