Correspondence
I. BACKGROUND
[Fig. 1 stage labels: #0, #1, #2, ..., #n-1]
The need for buffers at chip-crossing boundaries of MOS IC's
has been highlighted by Weste and Eshraghian [1] as well as
Mead and Conway [2]. The wherewithal for the design of such
buffers has been scrutinized by Lin and Linholm [3], Jaeger [4],
Veendrick [5], Hedenstierna and Jeppson [6], Nemes [7], and
Kanuma [8]. Several topics, which bear upon approximations
employed in buffer design, have been discussed by Greenbaum
[9], as well as Arnout and De Man [10]. Improvements attainable
by recourse to BiCMOS have been examined by Rosseel
and Dutton [11], as well as De Los Santos and Hoefflinger [12].

Fig. 2. Model implied in Jaeger's paper.
A severe mismatch between off-chip loads and on-chip logic
devices prevails in high-density CMOS circuits. In the interest of
speed and power considerations, MOS transistors are laid out to
minimal geometries and W/L ratios close to 1. With gate
oxides of about 250 Å, the on-chip capacitance of logic devices
amounts to several tens of femtofarads against an off-chip load
capacitance of 50 pF or more. Thus, a speed degradation factor
of three orders of magnitude would result if the loads were
connected directly to logic-level transistors. Naturally then,
guided by past practice, one inserts a tapered buffer between
the logic devices and the load.

II. DESIGN OF THE TAPERED BUFFER

We begin with the Jaeger version of the Lin-Linholm approach,
and then proceed to the split-capacitor modification
developed by us. In Jaeger's model, each stage of the buffer is
represented by one conductor and one capacitor. We use one
conductor but two capacitors. The thrust of our discussion is
directed at the optimization of the dynamic response of the
buffer.

Jaeger's buffer and its model are shown in Figs. 1 and 2,
respectively. There are n stages, numbered 0 to n-1. The
logic-level capacitance is $C_g$, the logic-level conductance is $g$,
and the logic-level time constant $\tau_g = C_g/g$. The taper is $\beta$, i.e.,
the W/L ratio of stage #(k+1) is $\beta$ times larger than that of
stage #k:

$$(W/L)_{k+1} = \beta\,(W/L)_k. \tag{1}$$

The conductance, capacitance, and time constant of stage #k
are

$$g_k = \beta^k g, \qquad C_k = \beta^{k+1} C_g, \qquad \tau_k = C_k/g_k = \beta\,\tau_g. \tag{2}$$

The overall time constant of the buffer ($\tau_B$) is assumed to be
equal to the sum of the time constants of the individual stages:

$$\tau_B = \sum_{k=0}^{n-1} \tau_k = n\beta\tau_g. \tag{3}$$

The load capacitance at the output stage ($C_L$) is

$$C_L = \beta^n C_g. \tag{4}$$

The number of stages of the buffer can, therefore, be written as

$$n = \frac{\ln(C_L/C_g)}{\ln\beta}. \tag{5}$$
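As a rough numerical illustration of the single-capacitor model, the sketch below evaluates the stage count implied by a load capacitance of β^n times the logic-level capacitance, together with the summed delay of n stages at β logic-level time constants each. The function names and the sample load ratio of 1000 are illustrative assumptions, not values taken from the paper, and the stage count is treated as a continuous quantity.

```python
import math


def stage_count(load_ratio, beta):
    """Continuous stage count n solving load_ratio = beta**n.

    load_ratio is the load capacitance divided by the logic-level
    capacitance (an assumed input, not a value from the paper).
    """
    return math.log(load_ratio) / math.log(beta)


def normalized_delay(load_ratio, beta):
    """Buffer delay n * beta, in units of the logic-level time constant."""
    return stage_count(load_ratio, beta) * beta


if __name__ == "__main__":
    ratio = 1000.0  # assumed load ratio, chosen only for illustration
    for beta in (2.0, math.e, 4.0, 8.0):
        n = stage_count(ratio, beta)
        delay = normalized_delay(ratio, beta)
        print(f"beta={beta:5.2f}  n={n:5.2f}  delay={delay:6.2f}")
```

Under these assumptions the delay factor β/ln β is smallest at β = e and grows only slowly for moderately larger tapers.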
Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY HAMIRPUR. Downloaded on January 19, 2010 at 09:13 from IEEE Xplore. Restrictions apply.
1006 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 25, NO. 4, AUGUST 1990
where $I_p$ is the peak short-circuit current of the inverter, while
$\tau_r$ and $\tau_f$ stand for rise and fall times, respectively. See [5] for
background to (10).

The new definitions read as follows. The logic-level time
constant is

$$\tau_g = (C_x + C_y)/g$$

and the time constant of stage #k is

$$\tau_k = \frac{\beta^k C_x + \beta^{k+1} C_y}{\beta^k g} \tag{11}$$

$$= \frac{C_x + C_y + (\beta - 1)C_y}{g} \tag{12}$$

$$= [1 + (\beta - 1)\alpha]\,\tau_g \tag{13}$$

where

$$\alpha = \frac{C_y}{C_x + C_y}. \tag{14}$$

The total delay through the buffer is

$$\tau_B = n\tau_k \tag{15}$$

where

$$n = \frac{\ln(C_L/C_y)}{\ln\beta}. \tag{16}$$

Substituting now (13) and (16) into (15), we get

$$\tau_B = \tau_g\,\frac{1 + (\beta - 1)\alpha}{\ln\beta}\,\ln(C_L/C_y). \tag{17}$$

IV. THE FAN-OUT DECISION

In general layout work, of special interest are nodes with low
to moderate fan-out. Faced with a fan-out of k, do we or don't
we use a buffer? Obviously enough, where this question arises,
reference is made to a single-stage buffer, scaled as shown in
Fig. 5(b). That buffer is to be compared with the straight
inverter in Fig. 5(a).

Retaining the technique of linear addition of time constants,
we write the overall delay in Fig. 5(b) as

$$\tau_B = \frac{C_x + \beta C_y}{g} + \frac{\beta C_x + kC_y}{\beta g}. \tag{19}$$

Scrutinizing (19) for best $\beta$, one arrives at

$$\beta = \sqrt{k} \tag{20}$$

in confirmation of the uniform taper approach. The total delay
is

$$\tau_B(\min) = 2(C_x + \sqrt{k}\,C_y)/g \tag{21}$$

which is to be compared with the "no buffer" delay

$$\tau_B' = (C_x + kC_y)/g.$$

The former is smaller than the latter when

$$2(C_x + \sqrt{k}\,C_y) < C_x + kC_y$$

that is, when

$$k - 2\sqrt{k} > C_x/C_y.$$
Fig. 4. Taper versus $C_x/C_y$: (a) $\beta$ = 2.72, (b) $\beta$ = 3.10, (c) $\beta$ = 3.59, and (d) $\beta$ = 4.32.
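The taper values in the Fig. 4 caption can be cross-checked numerically. Setting the derivative of the split-capacitor delay factor [1 + (β − 1)α]/ln β to zero, with α = C_y/(C_x + C_y), gives the condition β(ln β − 1) = C_x/C_y. That condition is derived here from the delay expression and is not quoted from the paper; the bisection solver, its search bracket, and the function names are illustrative assumptions.

```python
import math


def delay_factor(beta, cx_over_cy):
    """Delay factor [1 + (beta - 1) * alpha] / ln(beta), alpha = Cy / (Cx + Cy)."""
    alpha = 1.0 / (1.0 + cx_over_cy)
    return (1.0 + (beta - 1.0) * alpha) / math.log(beta)


def optimum_taper(cx_over_cy, lo=math.e, hi=50.0, tol=1e-10):
    """Solve beta * (ln(beta) - 1) = Cx/Cy by bisection.

    The left-hand side increases monotonically for beta > 1, and the
    assumed bracket [e, 50] covers ratios from 0 up to roughly 145.
    """
    def residual(beta):
        return beta * (math.log(beta) - 1.0) - cx_over_cy

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for ratio in (0.0, 1.0, 2.0):
        print(f"Cx/Cy={ratio:3.1f}  optimum taper={optimum_taper(ratio):.2f}")
```

Under this condition, ratios of 0, 1, and 2 give tapers that round to 2.72, 3.59, and 4.32, which is consistent with three of the four panel values; 3.10 corresponds to a ratio of roughly 0.4.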
$k_{\min}(0) = 4$.
Otherwise, as (26) and Table III reveal, the answer to the buffer
question depends on both $C_L/C_y$ and $C_x/C_y$. As a rule, a
buffer should be used only when $C_L/C_y$ is larger than four.
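The buffer-or-no-buffer comparison can be sketched in a few lines. The code assumes a single-stage buffer delay of (C_x + βC_y)/g + (βC_x + kC_y)/(βg) with β = √k, which reproduces the minimum delay 2(C_x + √k C_y)/g of (21), and a direct-hookup delay of (C_x + kC_y)/g. The break-even expression k = (1 + √(1 + C_x/C_y))² is obtained here by solving the inequality and is not quoted from the paper. All names are illustrative, and delays are expressed in units of C_y/g.

```python
import math


def buffered_delay(k, cx_over_cy, beta=None):
    """Delay of inverter plus single-stage buffer, in units of Cy/g.

    If beta is omitted, the taper sqrt(k) is used (assumed optimum).
    """
    if beta is None:
        beta = math.sqrt(k)
    first_stage = cx_over_cy + beta                 # (Cx + beta*Cy) / g
    second_stage = (beta * cx_over_cy + k) / beta   # (beta*Cx + k*Cy) / (beta*g)
    return first_stage + second_stage


def direct_delay(k, cx_over_cy):
    """Delay of the unbuffered inverter driving fan-out k, in units of Cy/g."""
    return cx_over_cy + k


def break_even_fanout(cx_over_cy):
    """Fan-out above which the buffered connection is faster (derived here)."""
    return (1.0 + math.sqrt(1.0 + cx_over_cy)) ** 2


def use_buffer(k, cx_over_cy):
    """True when the buffered connection is strictly faster."""
    return buffered_delay(k, cx_over_cy) < direct_delay(k, cx_over_cy)


if __name__ == "__main__":
    for ratio in (0.0, 1.0, 3.0):
        print(f"Cx/Cy={ratio:3.1f}  break-even fan-out={break_even_fanout(ratio):.2f}")
```

With C_x/C_y = 0 the break-even fan-out evaluates to 4, in line with the rule of thumb in the text; a larger C_x/C_y pushes the threshold higher.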
Fig. 5. The fan-out decision: (a) direct hookup and (b) buffered connection. (Figure label: $C_L = kC_y$.)

V. CONCLUSION

The split-capacitor model leads to the conclusion that the
taper is a function of $C_x/C_y$ and, therefore, a matter of
technology, i.e., it depends on feature size, gate-oxide thickness,
junction capacitances, etc. For any particular load capacitance,
there exists a best taper and a corresponding best number of
stages, but the law relating the delay penalty to the taper of the
buffer is not very strong. At on-chip distribution points, buffers
are justified only where fan-out exceeds a factor of 4. However,
the chip-crossing penalty of MOS implementations is severe
even in the best case; therein lies a strong argument in favor of
BiCMOS I/O.
Abstract - A CMOS implementation of a D-type double-edge-triggered
flip-flop (DET-FF) is presented. A DET-FF changes its state at
both the positive and the negative clock edge transitions. It has
advantages with respect to both system speed and power dissipation. The
design presented requires little overhead in circuit complexity. This
CMOS D-type DET-FF is capable of operating at more than 50 MHz,
which gives an equivalent system frequency of 100 MHz.

I. INTRODUCTION

Conventional single-edge-triggered flip-flops (SET-FF’s)
change states at the time when the clock signal goes from 0 to 1
or at the time when the clock goes from 1 to 0. The former are
called positive-edge-triggered flip-flops (PET-FF’s) or
rising-edge-triggered flip-flops (RET-FF’s) and the latter are called
negative-edge-triggered flip-flops (NET-FF’s) or
trailing-edge-triggered flip-flops (TET-FF’s). The advantage of edge
triggering is that the setup time for data input is independent of the
clock pulse width. This makes system design simpler. It is also
less sensitive to noise. However, these flip-flops respond only
once per clock pulse cycle. Energy and time are wasted. Unger
proposed in [1] a class of flip-flops (FF’s) that will respond to
both the positive and the negative edge of the clock pulse. These

Manuscript received November 14, 1989; revised December 18, 1989.
S.-L. Lu is with the Department of Computer Science, University of
California, Los Angeles, CA 90024 and with MOSIS, Marina del Rey, CA
90292-6695.
M. Ercegovac is with the Department of Computer Science, University of
California, Los Angeles, CA 90024.
IEEE Log Number 9036484.

II. CIRCUIT DESIGN OF A D-TYPE DET-FF

A D-type DET-FF consists of two cross-coupled latches with
input gating devices and some simple pass-transistor logic. A
circuit diagram is illustrated in Fig. 1. Its operation principle is
similar to the one used by Mead and Wawrzynek [4]. The two
cross-coupled latches are enabled/disabled by the clock signal.
When the clock is low, latch 1 is disabled and latch 2 is enabled.
With clock high, latch 1 is enabled and latch 2 disabled. A
disabled latch 1 has both its output and its complement set
to high (Vdd). A disabled latch 2 has both its output and its
complement set to low (GND). During the rising edge of the
clock signal, latch 1 is being enabled. Depending on the D input
value, either transistor M7 or M8 is conducting just before M9
switches off. Either output Q1 or its complement will remain
charged to high (Vdd) while the other is discharged to low
(GND). The set value will stay unchanged throughout the half of
the clock period during which the clock is high. Similarly, on the
trailing edge of the clock signal, latch 2 is being enabled. According
to the value of the D input, either Q2 or its complement will remain
low (GND) while the other will be set to high. The output value
remains stable for the duration of the low clock signal. Thus, this
DET-FF is a static flip-flop. It consumes no static power. Table
I gives the logic required to obtain the final output value. We
observe that when the clock is low, both Q1 and its complement
are high. The final value is the value of Q2. When the clock is
high, both Q2 and its complement are low. The final value is the
value of Q1. Pass-transistor logic, shown in Fig. 1(c), is used to
implement the logic function.
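The output-selection rule described in the text can be mimicked with a small behavioral model. This is only a logic-level sketch under stated assumptions, not the transistor circuit of Fig. 1: latch 1 is taken to capture D on the rising clock edge and to rest at (high, high) while the clock is low, latch 2 is taken to capture D on the falling edge and to rest at (low, low) while the clock is high, and the output follows Q1 when the clock is high and Q2 when it is low. The class name, method name, output polarity (Q equal to D), and initial stored value are all hypothetical.

```python
class DetFlipFlop:
    """Behavioral sketch of a double-edge-triggered D flip-flop."""

    def __init__(self):
        self.clk = 0
        # Clock low: latch 1 is disabled, so both of its outputs rest high.
        self.q1, self.q1_bar = 1, 1
        # Latch 2 is enabled while the clock is low; assume it stores a 0.
        self.q2, self.q2_bar = 0, 1

    def step(self, clk, d):
        """Apply clock level clk and data d (each 0 or 1); return the output."""
        if clk and not self.clk:
            # Rising edge: latch 1 captures D, latch 2 is forced to (low, low).
            self.q1, self.q1_bar = d, 1 - d
            self.q2, self.q2_bar = 0, 0
        elif self.clk and not clk:
            # Falling edge: latch 2 captures D, latch 1 is forced to (high, high).
            self.q2, self.q2_bar = d, 1 - d
            self.q1, self.q1_bar = 1, 1
        self.clk = clk
        # The final output is Q1 while the clock is high and Q2 while it is low.
        return self.q1 if clk else self.q2


if __name__ == "__main__":
    ff = DetFlipFlop()
    # (clock, data) pairs: the data is sampled on every clock transition.
    for clk, d in [(1, 1), (1, 0), (0, 0), (0, 1), (1, 1)]:
        print(f"clk={clk} d={d} -> q={ff.step(clk, d)}")
```

Because the model samples D on both edges, one clock period carries two data values, which is the sense in which a 50 MHz clock corresponds to an equivalent system frequency of 100 MHz.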