MCML

IEEE CIRCUITS AND SYSTEMS MAGAZINE, FOURTH QUARTER 2006, p. 40
1531-6364/06/$20.00 ©2006 IEEE

In recent years, MOS Current-Mode Logic (MCML) circuits have gained remarkable interest in several VLSI applications, ranging from high-accuracy mixed-signal circuits to high-speed circuits for channel (de)multiplexing in optical-fiber and Radio Frequency (RF) telecommunication systems. However, their advantages over traditional CMOS logic are achieved at the cost of a static power consumption, which must be kept as low as possible. Accordingly, a conscious management of the power-delay trade-off is essential in the design of such circuits.
This paper presents several recent ideas on the design of digital MCML circuits, organized in a comprehensive framework. The treatment reviews and extends previous results by incorporating Deep-Sub-Micron (DSM) effects from the beginning, with a strongly simplified analytical formulation that improves both understanding and design. Interesting properties and design criteria are derived from simple analytical models. These models provide a deep insight into the design of MCML circuits, which is essential for both the efficient design of MCML cells and the development of an automated design flow. Numerical examples are presented for a 90-nm CMOS process.
Massimo Alioto and Gaetano Palumbo
Abstract
I. Introduction
In the last decade, we have witnessed an increasing interest in MOS Current-Mode Logic (also named Source-Coupled Logic, SCL) circuits, which represent an alternative to traditional CMOS logic styles in several applications. Despite their recent adoption, MCML circuits actually have quite old ancestors in their family tree, as they directly descend from the bipolar Current-Mode Logic (CML), which has the same topology despite the different technology adopted [1].
The fundamental structure of an n-input MCML gate is depicted in Figure 1, where an NMOS network (consisting of properly stacked source-coupled pairs) steers the bias current $I_{SS}$ to one of the two output branches, according to the value of the differential inputs $v_{i1} = v_{i1,1} - v_{i1,2}, \ldots, v_{in} = v_{in,1} - v_{in,2}$. The steered current is then converted into a differential output voltage $v_o = v_{o,1} - v_{o,2}$ by the two resistances $R_D$ (in red line), which can often be implemented by physical resistors, or alternatively by PMOS transistors (working in the triode region) acting as active loads. In contrast to previous works dealing with the management of the power-delay trade-off in MCML gates [2]–[5], in the following a physical resistor will be assumed.
The current source $I_{SS}$ in Figure 1 is usually implemented by a simple current mirror, which is not shown for the sake of simplicity. The load capacitance $C_L$ represents the external capacitance due to the input capacitance of the following gates and the wiring capacitance.
Massimo Alioto is with the DII (Dipartimento di Ingegneria dell'Informazione), Università di Siena, v. Roma, 56, I-53100 Siena, Italy, E-mail: malioto@dii.unisi.it. Gaetano Palumbo is with the DIEES (Dipartimento di Ingegneria Elettrica Elettronica e dei Sistemi), Università di Catania, Viale Andrea Doria 6, I-95125 Catania, Italy, E-mail: gpalumbo@diees.unict.it.
Figure 1. Topology of a generic MCML gate.
Figure 2. Topology of an MCML inverter gate.
The general topology in Figure 1 allows the implementation of both combinational and sequential gates, whose logic function only depends on the connection of the source-coupled pairs. The implemented function can also be modified by negating the inputs and the output, i.e., by simply swapping the corresponding pairs of differential signals. As an example of the simplest logic gate, the topology of an MCML inverter is depicted in Figure 2, where the NMOS network consists of only one source-coupled pair. As other examples, the NMOS network topology of a 2-input Multiplexer
(MUX), XOR and D-latch gate are shown in Figures 3–5, respectively (for an overview of the design techniques to derive the topology of arbitrary MCML gates, the reader is directed to [2]). Their static operation is easily understood by considering the simplest case of the inverter gate in Figure 2: when the input voltage $v_i = v_{i,1} - v_{i,2}$ is high (low), the source-coupled pair M1-M2 completely steers the current $I_{SS}$ to the drain of M1 (M2), thus the output voltage is equal to the low (high) value $V_{OL} = (V_{DD} - R_D I_{SS}) - V_{DD} = -R_D I_{SS}$ ($V_{OH} = R_D I_{SS}$).
The obtained logic swing is
$$ V_{SWING} = V_{OH} - V_{OL} = 2 R_D I_{SS} \qquad (1) $$
which is rather small, typically in the order of a few hundreds of millivolts. Due to the symmetry of the I-V transfer characteristics of the source-coupled pair and of the circuit, the logic threshold $V_{LT}$ is equal to zero (i.e., $v_o = 0$ when $v_i = 0$).
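As a quick numerical illustration of the static levels and of the logic swing in (1), the following Python sketch evaluates the differential output levels for a given bias current and load resistance. The function name and the example values (50 µA, 4 kΩ) are assumptions chosen only for illustration; they are not taken from the text.

```python
def mcml_static_levels(i_ss, r_d):
    """Differential output levels of an MCML gate with resistive loads.

    i_ss: bias current in amperes, r_d: load resistance in ohms.
    Returns (V_OL, V_OH, V_SWING) in volts, following V_OL = -R_D*I_SS,
    V_OH = +R_D*I_SS and V_SWING = V_OH - V_OL = 2*R_D*I_SS.
    """
    v_ol = -r_d * i_ss
    v_oh = r_d * i_ss
    return v_ol, v_oh, v_oh - v_ol


# Illustrative (hypothetical) operating point: 50 uA through 4 kOhm loads.
v_ol, v_oh, v_swing = mcml_static_levels(i_ss=50e-6, r_d=4e3)
print(f"V_OL = {v_ol * 1e3:.0f} mV, V_OH = {v_oh * 1e3:.0f} mV, "
      f"V_SWING = {v_swing * 1e3:.0f} mV")
```

With these placeholder values the swing is a few hundred millivolts, in line with the order of magnitude quoted above.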
Unfortunately, the power dissipated by the MCML gate is dominated by the static power consumption $V_{DD} I_{SS}$ due to the bias current source, since the dynamic contribution (associated with the charging of capacitances during gate switching) is rather small because of the reduced logic swing. For this reason, various techniques have been adopted to dynamically reduce the static power consumption [1]. The static power consumption is the fundamental weak point of MCML gates; hence, in their design it must be kept as small as possible for a given required performance by consciously managing the power-delay trade-off, both to design MCML cells efficiently and to develop an automated design flow. In the following sections, power-aware design strategies will be derived to address this problem.
Compared to traditional CMOS logic, MCML gates
exhibit various interesting features that make them suit-
able for an increasingly wide range of applications:
1) MCML gates are faster. The higher speed allows for implementing circuits for fast communication systems (e.g., multiplexing/demultiplexing ICs in the range of 10 Gb/s for SONET/SDH optical-fiber links and high-speed crosspoint switches) and RF circuits (e.g., PLLs, prescalers, circuits for clock recovery and VCOs), as well as high-speed current-mode buffers [6]–[13]. This speed improvement is due to the tremendous CMOS technology scaling, and allows for replacing previous bipolar CML logic [14]. However, in Section V it will be shown that the high speed performance is not due to the small logic swing, contrary to common belief.
2) MCML gates have a better power efficiency at high frequencies. This reinforces the suitability of MCML gates for high-frequency applications, since over the last decade a low power consumption has also been required in high-speed circuits for reasons related to heat removal, as well as to battery lifetime in portable devices [15], [16]. This has extended the range of applications of MCML gates to the implementation of high-speed low-power arithmetic and signal processing cores [15].
3) MCML gates generate much lower switching noise. Indeed, the power supply must provide a static power and thus a constant current to each gate. This avoids the typical current spikes of CMOS logic that cause large variations of the supply voltage $V_{DD}$ [17]–[22], which in turn couple with any analog circuits sharing the same substrate (as occurs in current Systems-on-Chip) and degrade their resolution. In particular, the almost constant supply current leads to an almost zero voltage drop across the bonding-wire/supply-rail inductance due to current variations $di/dt$ [6], which will be increasingly important in the next technology nodes, in which both this inductance and the supply current variations are expected to grow dramatically because of the increased clock frequency [23], [24].
The switching noise generated by MCML circuits is typically reduced by two orders of magnitude, thus this logic style is currently adopted in most high-speed high-resolution mixed-signal ICs for digital audio and video signal processing (such as sigma-delta A/D and D/A converters) [25]–[32].
Figure 3. Topology of a 2:1 Multiplexer.
4) MCML gates have a better signal integrity and a lower delay noise. This is due to the much lower supply voltage noise (discussed in point 3) and the differential operation of MCML gates, which are insensitive to common-mode signals, including the supply noise. This greatly simplifies the design of the supply distribution network and reduces the size (and area) of the decoupling capacitors needed to ensure low $V_{DD}$ variations.
Interestingly, MCML circuits can also be made insensitive to the noise arising from the (capacitive) coupling with other switching circuits. Indeed, this coupling noise becomes a common-mode signal if the cells and the interconnects are carefully designed with a symmetric layout. This also avoids the delay variations due to the capacitive coupling with other switching gates (often named delay noise [33]), which is a major source of delay uncertainty in current CMOS logic circuits.
5) MCML gates potentially have a lower sensitivity to process, supply and environmental variations. Simple techniques to significantly lower the effect of process tolerances have been developed for MCML circuits [1], [16], even though this aspect has not been completely understood and is currently under investigation [34]. As an example, the variation in the logic threshold $V_{LT}$ due to process tolerances determines an uncertainty on the input switching time (at which $v_i = V_{LT}$) and thus on the delay. However, the $V_{LT}$ variations in MCML gates are mainly due to the mismatch of source-coupled NMOS transistors (or load resistances), whereas the CMOS variations are due to the poorer matching between a PMOS and an NMOS transistor [35]. Thus, a lower uncertainty in $V_{LT}$ and in the delay is expected in MCML circuits.
Due to the essentially static power consumption, the chip temperature and the supply voltage also tend to be constant (according to point 3), thereby minimizing the delay variations associated with supply and environmental variations.
The potentially lower delay uncertainty is an appealing property, since delay uncertainty is becoming an increasing fraction of the clock period [36], and thus a major limit to the speed improvement in current Deep-Sub-Micron (DSM) technologies [33].
6) MCML gates suffer from a lower degradation of the electrical transistor properties due to DSM effects. This is due to the lower logic swing, which reduces the voltages across the transistor terminals, and thus the electric field in the transistor channel. This reduces DSM effects, such as carrier mobility degradation and velocity saturation, when compared to standard CMOS logic.
Moreover, the worse speed performance of PMOS transistors does not impose a speed limit in MCML gates, since the switching source-coupled pairs are made up of only NMOS transistors [7].
Figure 4. Topology of a XOR gate.
Figure 5. Topology of a D Latch.
According to points 1–3, the range of applications in which MCML gates exhibit significant advantages has continuously broadened. This trend is expected to continue according to points 3–6, since MCML gates are less sensitive than CMOS logic to the limitations arising in sub-100 nm technologies.
II. Static Analysis Through the Alpha-Power Law
The noise margin $NM$ is the fundamental requirement on the static behavior of any logic style, and for a single-input gate (i.e., the inverter in Figure 2) it is defined as $V_{OH,min} - V_{IH,min}$ (or equivalently as $V_{IL,max} - V_{OL,max}$, due to the symmetry of the transfer characteristics) from the critical points of the DC voltage transfer characteristics $(V_{IL,max}, V_{OH,min})$ and $(V_{IH,min}, V_{OL,max})$ in Figure 6.
In the more general case with multiple inputs, the noise margin is evaluated from the DC characteristics associated with a given input $v_i$ driving a source-coupled pair M1-M2, with the other inputs being preliminarily assigned [6]. The transistors driven by the latter constant inputs do not affect the static behavior of the gate, since they are either switched off or can be regarded as short circuits; thus the DC behavior of multiple-input gates is equal to that of a simple inverter made up of the source-coupled pair M1-M2, according to Figure 7.
In general, the noise margin $NM$ might depend on the considered input $v_i$, but in practical MCML gates all source-coupled pairs are made identical to have the same noise margin for all inputs, hence the noise margin expression of the inverter is immediately extended to arbitrary logic gates.
II-A. Evaluation of the Noise Margin:
A Simple Approach.
Figure 6. Typical DC transfer characteristics of an MCML gate.
Figure 7. Noise margin: equivalence of multiple-input gates to an inverter gate.
In this subsection, a novel simplified approach is adopted to evaluate the noise margin of nanometer MCML circuits by assuming an I-V relationship of MOS transistors given by the well-known Alpha-Power law [37]. The latter expresses the NMOS drain current $i_D$ as a function of the gate-source voltage $v_{GS}$ in the saturation region
$$ i_D = K W (v_{GS} - V_{TH})^{\alpha} \qquad (2) $$
$W$ being the effective channel width, $V_{TH}$ the transistor threshold voltage, and $K$ and $\alpha$ technology-dependent coefficients (the channel length modulation effect was neglected, as usual). In particular, old long-channel technologies have a square I-V law, thus parameter $\alpha$ is equal to 2 and $K$ is equal to $\mu_n C_{OX}/2L$ [6], where $\mu_n$ is the electron mobility, $C_{OX}$ is the gate oxide capacitance per unit area and $L$ is the effective channel length. In this case, the MCML noise margin has been previously found to be given by [2], [38]
$$ NM = \frac{V_{SWING}}{2}\left(1 - \frac{\gamma}{A_V}\right) \qquad (3) $$
where $\gamma$ is a constant coefficient equal to $\sqrt{2}$, and $A_V$ is the magnitude of the small-signal voltage gain around the logic threshold, given by [2]
$$ A_V = g_m R_D \qquad (4) $$
$g_m$ being the transistor transconductance around the logic threshold.
In the limit case of a very short-channel device with a completely saturated carrier velocity, the I-V relationship is linear, $\alpha = 1$, and $K = v_{sat} C_{OX}$ [6], [37]. In this case, by performing the simple calculations reported in Appendix I, the noise margin turns out to be still given by (3) but with a different value of the constant coefficient $\gamma$ in its numerator, which in this case is equal to 1. In actual nanometer devices, as shown for example by the data in Table 1 referring to a 90-nm technology, $\alpha$ is somewhat intermediate between 2 and 1, and thus $\gamma$ is expected to range from $\sqrt{2} \approx 1.4$ to 1. As a reasonable approximation, $\gamma$ can be set to the intermediate value 1.2, which leads to
$$ NM = \frac{V_{SWING}}{2}\left(1 - \frac{1.2}{A_V}\right) \qquad (5) $$
Extensive simulations were performed by varying the logic swing from 240 mV to 800 mV, with $A_V$ ranging from 1.6 to 2.5, adopting a 90 nm technology whose main parameters are reported in Table 1. The error of the analytical model (5) was found to be always lower than 14% and typically in the order of a few percent. Typical values of $NM$ are in the order of 100 mV in current nanometer technologies.
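The noise margin expressions (3) and (5) differ only in the constant coefficient that divides into $A_V$. The short Python sketch below compares the three cases discussed above (long-channel, nanometer approximation, fully velocity-saturated limit). The parameter name `coeff` and the chosen operating point ($V_{SWING}$ = 700 mV, $A_V$ = 2.2, the values used later in Table 2) are choices made for this illustration.

```python
import math


def noise_margin(v_swing, a_v, coeff=1.2):
    """Noise margin NM = (V_SWING / 2) * (1 - coeff / A_V).

    coeff is the constant coefficient of (3): sqrt(2) for a square-law
    (long-channel) device, 1 for a fully velocity-saturated device, and
    1.2 as the intermediate value adopted in (5).
    """
    return 0.5 * v_swing * (1.0 - coeff / a_v)


v_swing, a_v = 0.7, 2.2  # volts, dimensionless gain
nm_long = noise_margin(v_swing, a_v, math.sqrt(2.0))
nm_nano = noise_margin(v_swing, a_v, 1.2)
nm_short = noise_margin(v_swing, a_v, 1.0)

for label, nm in (("long-channel", nm_long),
                  ("nanometer (5)", nm_nano),
                  ("short-channel limit", nm_short)):
    print(f"{label:>20s}: NM = {nm * 1e3:.1f} mV")
```

The ordering of the three results shows why the long-channel expression is the most pessimistic of the three for given $V_{SWING}$ and $A_V$.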
II-B. Considerations on the Technology Scaling
and Circuit Design.
From (5), the noise margin is proportional to the logic swing, and roughly equal to half of it if $A_V$ is sufficiently high. Next, the comparison of (5) with (3) shows that the noise margin achieved with nanometer devices is greater than that with old long-channel transistors, for assigned values of the logic swing and the voltage gain. This is good news, since it means that DSM effects are beneficial in terms of the noise margin in MCML gates, and that the long-channel model in (3) is pessimistic for current technologies. However, the maximum logic swing which ensures the transistor operation in the saturation region is equal to $2V_{TH}$ [2], which slightly decreases when scaling the technology, thus the maximum noise margin tends to decrease slowly.
Now, let us derive simple design equations to size $R_D$ and the transistors in order to obtain assigned values of $V_{SWING}$ and $A_V$ satisfying the noise margin requirement (more detailed design guidelines to preliminarily assign these two parameters will be discussed in Section V). Solving (1), a given logic swing is achieved by properly setting the resistance $R_D$ to $V_{SWING}/2I_{SS}$, whereas from (4) an assigned value of $A_V$ is achieved by setting the NMOS transconductance $g_m$ in (6) to $A_V/R_D$
$$ g_m = \left.\frac{d i_D}{d v_{GS}}\right|_{i_D = I_{SS}/2} = \alpha (K W)^{1/\alpha} \left(\frac{I_{SS}}{2}\right)^{1 - 1/\alpha} \qquad (6) $$
where $V_{GS}$ under the drain current $I_{SS}/2$ was evaluated from (2). By substituting (6) into (4) and solving for $W$, the transistor channel width needed to achieve a given $A_V$ is
$$ W = \frac{2^{2\alpha - 1}}{K} \left(\frac{A_V}{\alpha V_{SWING}}\right)^{\alpha} I_{SS}. \qquad (7) $$
From (7), the channel width of NMOS transistors must be set to a value which is proportional to the bias current, and increases with the ratio $A_V/V_{SWING}$. Of course, it is set to the minimum value allowed by the technology in the cases where (7) is lower than it.

Table 1. Alpha-Power law coefficients and main process parameters in a 90-nm technology.
Parameter                                                     Value
α                                                             1.45
K                                                             0.83 E-3 A/(µm V^2)
V_TH                                                          0.35 V
W_min                                                         120 nm
L_min                                                         90 nm
effective L_min                                               65 nm
C_OX                                                          18 fF/µm^2
maximum V_DD                                                  1 V
resistance per unit length r (unsilicided p+ POLY)            1.23 kΩ/µm
capacitance per unit length c (unsilicided p+ POLY)           0.07 fF/µm
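The sizing procedure of this subsection ($R_D$ from (1), $W$ from (7), clamped to the minimum width) can be sketched in Python as follows. Equation (7) is used here in the form obtained by substituting (6) into (4), i.e., $W = (2^{2\alpha-1}/K)\,(A_V/(\alpha V_{SWING}))^{\alpha} I_{SS}$. The default values of `alpha`, `k` and `w_min_um` are this sketch's reading of Table 1, with $K$ assumed to be expressed per micrometre of channel width; the function names are hypothetical.

```python
def size_mcml_gate(i_ss, v_swing, a_v, alpha=1.45, k=0.83e-3, w_min_um=0.12):
    """Return (R_D in ohms, W in micrometres) for an assigned swing and gain.

    Assumption: k is given in A/(um * V^alpha), so that W comes out in um.
    """
    r_d = v_swing / (2.0 * i_ss)  # from the logic swing (1)
    w_um = (2.0 ** (2.0 * alpha - 1.0) / k) \
        * (a_v / (alpha * v_swing)) ** alpha * i_ss  # from (7)
    return r_d, max(w_um, w_min_um)  # never below the minimum width


def small_signal_gm(i_ss, w_um, alpha=1.45, k=0.83e-3):
    """Transconductance at i_D = I_SS / 2, following (6)."""
    return alpha * (k * w_um) ** (1.0 / alpha) \
        * (i_ss / 2.0) ** (1.0 - 1.0 / alpha)


# Example: 100 uA bias, 700 mV swing, gain of 2.2 (illustrative values).
r_d, w_um = size_mcml_gate(i_ss=100e-6, v_swing=0.7, a_v=2.2)
gain = small_signal_gm(100e-6, w_um) * r_d  # should reproduce A_V via (4)
print(f"R_D = {r_d:.0f} ohm, W = {w_um:.2f} um, A_V check = {gain:.3f}")
```

Feeding the computed width back into (6) and (4) is a simple self-consistency check: whenever the width is not clamped, the product $g_m R_D$ returns the requested gain.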
III. Gate Delay Modeling Methodology
The delay in MCML gates can be evaluated by resorting to the general approach in [2], [3], [37], where the circuit is first properly linearized around the logic threshold and, where possible, simplified by resorting to the half-circuit concept, exploiting the symmetry. The linearized (half) circuit is then approximated by a first-order circuit with a pole time constant $\tau$ and a zero time constant $\tau_z$ (respectively equal to the negative of the reciprocal of the pole and the zero). The propagation delay $\tau_{PD}$ of this first-order approximation is equal to [39]
$$ \tau_{PD} = 0.69\,(\tau - \tau_z) \qquad (8) $$
where parameters $\tau$ and $\tau_z$ can easily be evaluated by applying the well-known open-circuit time-constant method [40], [41].
APPENDIX I
In the limit case of a very short-channel device with a completely saturated carrier velocity, i.e., with $\alpha = 1$ [38] and $K = v_{sat} C_{OX}$ [6], the DC transfer characteristics of an MCML gate can easily be evaluated by solving the usual set of two equations encountered in the well-known analysis of a traditional source-coupled pair [46]
$$ \begin{cases} v_i = v_{GS1} - v_{GS2} & \text{(KVL at input loop)} \\ i_{D1} + i_{D2} = I_{SS} & \text{(KCL at the source node)} \end{cases} \qquad (A1.1) $$
By expressing $v_{GS}$ as a function of $i_D$ from (2) and substituting it into the first equation in (A1.1), the solution of the set of two equations easily gives the expression of the transistor currents as a function of the input voltage $v_i$
$$ i_{D1}(v_i) = \begin{cases} 0 & \text{if } v_i < -\dfrac{I_{SS}}{KW} \\ \dfrac{I_{SS}}{2} + K W \dfrac{v_i}{2} & \text{if } |v_i| \le \dfrac{I_{SS}}{KW} \\ I_{SS} & \text{if } v_i > \dfrac{I_{SS}}{KW} \end{cases} \qquad (A1.2a) $$
$$ i_{D2}(v_i) = I_{SS} - i_{D1}(v_i) \qquad (A1.2b) $$
from which, considering that $v_{o1} = V_{DD} - R_D i_{D1}$ and $v_{o2} = V_{DD} - R_D i_{D2}$, as well as substituting the voltage gain expression $A_V = K W R_D$ (achieved from (4), with $g_m$ equal to $di_D/dv_{GS} = K W$ from (2) with $\alpha = 1$) and $V_{SWING}$ by solving (1), the differential output voltage is equal to
$$ v_o(v_i) = \begin{cases} \dfrac{V_{SWING}}{2} & \text{if } v_i < -\dfrac{V_{SWING}}{2A_V} \\ -A_V v_i & \text{if } |v_i| \le \dfrac{V_{SWING}}{2A_V} \\ -\dfrac{V_{SWING}}{2} & \text{if } v_i > \dfrac{V_{SWING}}{2A_V} \end{cases} \qquad (A1.3) $$
which according to Figure 20 is a piece-wise linear curve, as expected due to the linear I-V relationship. From this figure, the critical points that define the noise margin are
$$ (V_{IL,max}, V_{OH,min}) = \left(-\frac{V_{SWING}}{2A_V}, \frac{V_{SWING}}{2}\right), \qquad (V_{IH,min}, V_{OL,max}) = \left(\frac{V_{SWING}}{2A_V}, -\frac{V_{SWING}}{2}\right) \qquad (A1.4) $$
Thus the noise margin is equal to
$$ NM = V_{OH,min} - V_{IH,min} = \frac{V_{SWING}}{2}\left(1 - \frac{1}{A_V}\right). \qquad (A1.5) $$

When linearizing the circuit, NMOS transistors cannot be rigorously modeled with the well-known small-signal MOS model in Figure 8 due to the strong non-linearity involved in logic gates. However, it is well known [2] that
the same topology in Figure 8 can be used to model NMOS transistors since they work in the saturation region most of the time, even though the linearized parameters (i.e., the transistor transconductance and capacitances) must be evaluated in a large-signal condition to account for the wide voltage variations during a switching transient. The large-signal transconductance $G_M$ can be evaluated as the ratio of the drain current variation $\Delta i_D$ and the gate-source voltage variation $\Delta v_{GS}$ during the gate switching (in place of the small-signal transconductance $di_D/dv_{GS}$). In a complete switching of the transistor, the current changes from 0 to $I_{SS}$ or vice versa (i.e., $\Delta i_D = I_{SS}$), and this change is determined by a gate-source voltage variation from $V_{TH}$ to $[V_{TH} + (I_{SS}/KW)^{1/\alpha}]$ by solving (2) (thus $\Delta v_{GS} = (I_{SS}/KW)^{1/\alpha}$), hence $G_M$ is equal to
$$ G_M = \frac{\Delta i_D}{\Delta v_{GS}} = \frac{I_{SS}}{(I_{SS}/KW)^{1/\alpha}} = \frac{g_m}{\alpha / 2^{\,1 - 1/\alpha}} \approx \frac{g_m}{0.6 + 0.4\alpha} \qquad (9) $$
where the small-signal transconductance $g_m$ around the logic threshold (6) was substituted. In (9), $G_M$ is lower than $g_m$ by a factor equal to $\alpha/2^{(1 - 1/\alpha)}$, which is very well approximated by $(0.6 + 0.4\alpha)$, with an error of about 1% for $\alpha$ ranging from 1 to 2. For scaled processes having values of $\alpha$ closer to unity, the large-signal transconductance is only slightly lower than the small-signal value.
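The quality of the linear approximation used in (9) can be checked numerically. The Python sketch below compares the factor $\alpha/2^{(1-1/\alpha)}$, which follows from dividing (6) by the large-signal ratio $I_{SS}/(I_{SS}/KW)^{1/\alpha}$, with the fit $0.6 + 0.4\alpha$ over the range of $\alpha$ from 1 to 2. The function names and the sampling step are choices of this sketch.

```python
def gm_ratio_exact(alpha):
    """Exact ratio g_m / G_M = alpha / 2**(1 - 1/alpha)."""
    return alpha / 2.0 ** (1.0 - 1.0 / alpha)


def gm_ratio_approx(alpha):
    """Linear fit of the same ratio: 0.6 + 0.4 * alpha."""
    return 0.6 + 0.4 * alpha


# Sample alpha from 1.00 to 2.00 in steps of 0.01.
alphas = [1.0 + 0.01 * i for i in range(101)]
max_err = max(abs(gm_ratio_approx(a) / gm_ratio_exact(a) - 1.0)
              for a in alphas)

print(f"ratio at alpha = 1.45: exact {gm_ratio_exact(1.45):.4f}, "
      f"fit {gm_ratio_approx(1.45):.4f}")
print(f"largest relative error for alpha in [1, 2]: {max_err * 100:.2f} %")
```

The fit is exact at $\alpha = 1$, and the deviation over the rest of the range stays at roughly the one-percent level.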
In Figure 8, the source-bulk and drain-bulk capacitances $C_{sb}$ and $C_{db}$ can be linearized by multiplying their zero-bias value by a factor which depends on the junction built-in potential, the grading coefficient and the minimum/maximum direct voltage across the junction [2], [6]. The gate-drain and the gate-source capacitances $C_{gd}$ and $C_{gs}$ in the saturation region are approximately linear, thus no linearization is needed. It is worth noting that all these NMOS parasitic capacitances are proportional to the channel width $W$.
To model the load resistance $R_D$, observe that it is actually implemented by a strip of a highly-resistive layer (to reduce its area occupation) with length $L$ according to Figure 9, which also has a distributed parasitic capacitance to ground, with an overall value $C_{R,TOT}$. By following the analysis in Appendix II, this RC strip can be represented by a lumped RC circuit consisting of the resistance $R_D$ with a parallel capacitance $C_{RD}$, as shown in Figure 9. According to Appendix II, the capacitance $C_{RD}$ is equal to one third of the total parasitic capacitance, and is also proportional to $1/I_{SS}$
$$ C_{RD} = \frac{C_{R,TOT}}{3} = \frac{C_{R,unit}}{I_{SS}} \qquad (10a) $$
$$ C_{R,unit} = \frac{V_{SWING}}{6}\,\frac{c}{r} \qquad (10b) $$
$C_{R,unit}$ being the load parasitic capacitance for a unit bias current ($c$ and $r$ are the capacitance and resistance per unit length of the layer implementing the resistance, which are provided in the technology design kit). To validate this approximate first-order RC circuit, several physical resistances were simulated by extracting parasitics from the layout and applying a step current, in order to evaluate the equivalent time constant $\tau_{eq}$ of the corresponding voltage waveform. In particular, by considering an unsilicided p-doped polysilicon layer with the resistance $r$ and capacitance $c$ per unit length reported in Table 1 for the adopted 90-nm technology, results showed that (10) agrees very well with simulations, with an error always lower than 4%.
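The chain of substitutions behind (10) (strip length from $R_D/r$, total capacitance from $c L$, one third of it as the lumped capacitance) can be followed step by step in the Python sketch below. The function name is hypothetical, and the per-unit-length values passed in the example are placeholders in the spirit of Table 1 (taken here as 1.23 kΩ/µm and 0.07 fF/µm); the result is meant to show the $1/I_{SS}$ dependence rather than to reproduce any tabulated coefficient.

```python
def load_parasitic_capacitance(i_ss, v_swing, r_per_um, c_per_um):
    """Lumped parasitic capacitance of the resistive load strip.

    r_per_um: strip resistance per unit length (ohm/um, assumed units).
    c_per_um: strip capacitance per unit length (F/um, assumed units).
    Returns (C_RD, C_R_unit), with C_RD = C_R_unit / I_SS as in (10a).
    """
    r_d = v_swing / (2.0 * i_ss)          # load resistance from (1)
    length_um = r_d / r_per_um            # strip length needed for R_D
    c_total = c_per_um * length_um        # total distributed capacitance
    c_rd = c_total / 3.0                  # lumped equivalent, (10a)
    c_r_unit = v_swing * c_per_um / (6.0 * r_per_um)  # (10b)
    return c_rd, c_r_unit


for i_ss in (1e-6, 10e-6, 100e-6):
    c_rd, c_r_unit = load_parasitic_capacitance(
        i_ss, v_swing=0.7, r_per_um=1.23e3, c_per_um=0.07e-15)
    print(f"I_SS = {i_ss * 1e6:5.0f} uA -> C_RD = {c_rd * 1e15:.3f} fF")
```

Because the strip gets shorter as the bias current grows, the load parasitic shrinks as $1/I_{SS}$, which is what produces the $1/I_{SS}^2$ delay term once it is multiplied by $R_D$.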
IV. Delay Versus Bias Current
in Nanometer MCML Gates
In this section, the methodology and the circuit models of transistors and load resistances discussed in Section III are applied to an inverter gate (Subsection A) and to more complex MCML gates (Subsection B). Compared to [2], [3], a strongly simplified procedure is adopted to express the power-delay trade-off in a very simple manner.
IV-A. MCML Inverter Gate
Figure 8. Equivalent linear model of NMOS transistors.
Let us consider the inverter gate in Figure 2, in which transistors M1-M2 work in the saturation region most of the time, and their source voltage is the same for both input logic values (it is fixed by the NMOS transistor in the ON state). Thus, the circuit can be linearized around the logic threshold $v_i = 0$, and the half-circuit concept applies due to the symmetry and the differential signaling. As shown in Figure 10, where the transistor model in Figure 8 is substituted, the linearized half-circuit is a simple common-source circuit. By applying the time-constant method to this circuit (i.e., by evaluating the time constants associated with each capacitance when the others are open-circuited), the time constants $\tau$ and $\tau_z$ in (8) are easily found to be
$$ \tau = R_D \left[(C_{db} + C_{gd}) + C_{RD} + C_L\right] = R_D\,(C_{drain} + C_{RD} + C_L) \qquad (11a) $$
$$ \tau_z = -\frac{C_{gd}}{G_M} = -\frac{C_{gd}}{g_m}\,(0.6 + 0.4\alpha) \qquad (11b) $$
Figure 9. Physical implementation of the load resistance: derivation of its lumped circuit model.
where it was observed that all capacitances see the same resistance $R_D$ in the evaluation of $\tau$, and the sum of $C_{gd}$ and $C_{db}$ was interpreted as the transistor capacitive contribution $C_{drain}$ at the drain node. The (negative) zero time constant in (11b) is that of the well-known common-source circuit, and from (8) tends to increase the delay more significantly in down-scaled technologies (see footnote 1).
From (8) and (11a)–(11b), the delay $\tau_{PD}$ is equal to
$$ \tau_{PD} = 0.69\,R_D \left[ C_{drain} + \frac{C_{gd}\,(0.6 + 0.4\alpha)}{A_V} + C_{RD} + C_L \right]. \qquad (12) $$
Now, let us consider the explicit dependence of the delay (12) on the bias current $I_{SS}$, considering that in practical designs $R_D = V_{SWING}/2I_{SS}$ as discussed in Section II-B, and NMOS transistors are sized according to (7). Since all NMOS capacitances are proportional to $W$, as pointed out in Section III, from (7) the transistor capacitance $C_{drain}$ ($C_{gd}$) turns out to be proportional to $I_{SS}$ by a constant $C_{drain,N}$ ($C_{gd,N}$) which represents its value per unit current (i.e., $C_{drain} = C_{drain,N} \cdot I_{SS}$ and $C_{gd} = C_{gd,N} \cdot I_{SS}$). By substituting (10), the MCML inverter delay in (12) is equal to
$$ \tau_{PD} = 0.35\,V_{SWING} \left[ C_{MOSnet,N} + \frac{C_{R,unit}}{I_{SS}^2} + \frac{C_L}{I_{SS}} \right] \qquad (13) $$
where the NMOS network capacitive contributions per unit current were lumped into a single contribution $C_{MOSnet,N}$
$$ C_{MOSnet,N} = C_{drain,N} + \frac{C_{gd,N}\,(0.6 + 0.4\alpha)}{A_V}. \qquad (14) $$
Footnote 1: This is because the (overlap) gate-drain capacitance scales more slowly than the other parasitic capacitances, since the direct overlap size cannot scale linearly with the minimum feature size. As another important aspect, the recent adoption of high-κ dielectrics tends to further increase this capacitance [35].
Footnote 2: When $v_i$ is applied to the upper transistors, the capacitances of the lower transistors (that have already switched) do not contribute to the overall delay.
APPENDIX II
To model the effect of the distributed resistance and capacitance associated with the physical layer of the load resistance, we develop an equivalent lumped RC circuit which has approximately the same dynamic behavior. To this aim, divide the strip in Figure 9 into a high number $n$ of small sections, each of which is represented by a lumped resistance $R_D/n$ and a capacitance $C_{R,TOT}/n$ (split into two symmetric contributions $C_{R,TOT}/2n$, according to Figure 9). Thus the distributed RC strip can be described by the ladder network in Figure 9 with $C_0 = C_n = C_{R,TOT}/2n$ and $C_1 = C_2 = \ldots = C_{n-1} = C_{R,TOT}/n$, with $C_n$ being short-circuited to ground.
The equivalent impedance $Z_D$ of the RC ladder circuit in Figure 9 can be approximated by a first-order RC circuit with an equivalent time constant $\tau_{eq}$ [47], [48]
$$ Z_D(s) = R_D\,\frac{1 + b_1 s + b_2 s^2 + \ldots}{1 + a_1 s + a_2 s^2 + \ldots} \approx R_D\,\frac{1}{1 + s\,\tau_{eq}} \qquad (A2.1) $$
which evidently corresponds to a resistance $R_D$ with a parallel equivalent capacitance $C_R$ such that $\tau_{eq} = R_D C_R$. In (A2.1), the equivalent time constant $\tau_{eq}$ is equal to $a_1 - b_1$ [39], which in turn is easily evaluated through the time-constant method [40], [41]. After simple but tedious calculations, for $n \to \infty$ we obtain
$$ a_1 = \frac{R_D C_{R,TOT}}{2} \qquad (A2.2) $$
$$ b_1 = \lim_{n \to \infty} \frac{R_D C_{R,TOT}}{n^2} \sum_{i=1}^{n-1} \left( i - \frac{i^2}{n} \right) = \lim_{n \to \infty} \frac{R_D C_{R,TOT}}{n^2} \left\{ \frac{n(n-1)}{2} - \frac{1}{n} \left[ \frac{(n-1)^3}{3} + \frac{(n-1)^2}{2} + \frac{(n-1)}{6} \right] \right\} = \frac{R_D C_{R,TOT}}{6} \qquad (A2.3) $$
therefore the equivalent time constant $\tau_{eq}$ is equal to $R_D C_{R,TOT}/3$ (thereby yielding $C_{RD} = C_{R,TOT}/3$).
The equivalent capacitance $C_R$ can be expressed as an explicit function of the bias current by observing that the resistance $R_D$ is equal to $r L$, $r$ being the resistance per unit length of the considered physical layer and $L$ the strip length. The same observation holds for $C_{R,TOT}$, equal to $c L$, $c$ being the capacitance per unit length of the considered layer. Accordingly, by expressing the strip length $L$ as $R_D/r$ and substituting the expression $R_D = V_{SWING}/2I_{SS}$, we get the relationships (10).
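The limit in (A2.3) can be checked numerically. The Python sketch below evaluates the normalized coefficient $b_1/(R_D C_{R,TOT})$ for a finite number of ladder sections, reading the summand of (A2.3) as $(i - i^2/n)$, which is the form consistent with the closed-form expression that follows it in the same equation. The function names are chosen for this sketch.

```python
def b1_normalized(n):
    """b_1 / (R_D * C_R,TOT) for an n-section ladder, from the sum in (A2.3)."""
    return sum(i - i * i / n for i in range(1, n)) / n ** 2


def tau_eq_normalized(n):
    """tau_eq / (R_D * C_R,TOT) = a_1 - b_1, with a_1 = 1/2 from (A2.2)."""
    return 0.5 - b1_normalized(n)


for n in (10, 100, 1000):
    print(f"n = {n:5d}: b1 -> {b1_normalized(n):.6f}, "
          f"tau_eq -> {tau_eq_normalized(n):.6f}")
```

As the number of sections grows, the normalized $b_1$ approaches 1/6 and the normalized time constant approaches 1/3, which is the factor used in (10a).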
It is worth noting that relationship (13) analytically expresses the power-delay trade-off, since the delay is an explicit function of $I_{SS}$ (which defines the static power consumption $P = V_{DD} \cdot I_{SS}$).
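To make the power-delay trade-off expressed by (13) concrete, the following Python sketch evaluates the delay and the static power over a sweep of bias currents. The coefficient values are illustrative: $C_{R,unit}$ and $C_{drain,N}$ are taken from the figures quoted in Table 2, $C_{MOSnet,N}$ is approximated by $C_{drain,N}$ alone (an assumption of this sketch, since $C_{gd,N}$ is not tabulated), and the 10 fF load and the function names are hypothetical.

```python
def mcml_delay(i_ss, v_swing, c_mosnet_n, c_r_unit, c_load):
    """Propagation delay from (13), in seconds."""
    return 0.35 * v_swing * (c_mosnet_n
                             + c_r_unit / i_ss ** 2
                             + c_load / i_ss)


def static_power(v_dd, i_ss):
    """Static power P = V_DD * I_SS, in watts."""
    return v_dd * i_ss


V_DD = 1.0            # V
V_SWING = 0.7         # V
C_MOSNET_N = 1.38e-11  # F/A, assumption: only the C_drain,N term is kept
C_R_UNIT = 1.88e-20   # F*A
C_LOAD = 10e-15       # F, hypothetical external load

currents = [1e-6, 3e-6, 10e-6, 30e-6, 100e-6]
delays = [mcml_delay(i, V_SWING, C_MOSNET_N, C_R_UNIT, C_LOAD)
          for i in currents]
floor = 0.35 * V_SWING * C_MOSNET_N  # delay reached for very large I_SS

for i, d in zip(currents, delays):
    print(f"I_SS = {i * 1e6:5.0f} uA  P = {static_power(V_DD, i) * 1e6:5.0f} uW"
          f"  delay = {d * 1e12:9.1f} ps")
print(f"asymptotic delay for large I_SS: {floor * 1e12:.2f} ps")
```

The sweep shows the diminishing return of spending more bias current: the $1/I_{SS}^2$ and $1/I_{SS}$ terms fade, and the delay approaches the floor set by the NMOS network capacitance per unit current.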
IV-B. Complex MCML Gates and Input Capacitance
In [2], it was shown that the power-delay interdependence (13) actually holds for arbitrary MCML gates, as illustrated in the following for various gates. First, let us consider the MCML MUX in Figure 3, whose worst-case delay (see footnote 2) $\tau_{PD,MUX}$ is obtained by applying the switching input $v_i$ to transistors M1-M2 and keeping inputs A and B constant. Without loss of generality, A and B can respectively be assumed to be at the low and high level, thus M3 and M6 are in the saturation region, while M4 and M5 are in cut-off. Observe that the XOR gate has the same delay as the MUX, since its topology is obtained from the latter by setting $B = \bar{A}$, hence in the following only the MUX gate will be considered. By applying the adopted modeling methodology, the MUX/XOR linearized half-circuit is depicted in Figure 11, whose delay (8) is easily found to be
$$ \tau_{PD,MUX} = 0.69 \left\{ R_D \left[ C_{drain,3} + C_{drain,5} + \frac{C_{gd}\,(0.6 + 0.4\alpha)}{A_V} + C_{RD} + C_L \right] + \frac{1}{G_M}\left( C_{drain,1} + C_{source,3} + C_{source,4} \right) \right\} $$
$$ \approx 0.69\,R_D \left[ 2C_{drain} + \frac{C_{drain} + 2C_{source}}{A_V}\,(0.6 + 0.4\alpha) + C_{RD} + C_L \right] \qquad (15) $$
where the sum of $C_{gs}$ and $C_{sb}$ is interpreted as the transistor capacitive contribution $C_{source}$ at the source node. It has been observed that all transistors have the same $C_{drain}$ ($C_{source}$), and the zero time constant $\tau_z$ (given by (11b)) is negligible when compared to the sum of the other contributions, since the latter is much greater than in the case of the inverter. By following the same approach as for the inverter, and remembering that NMOS parasitic capacitances are proportional to $I_{SS}$ (i.e., $C_{drain} = C_{drain,N} \cdot I_{SS}$, $C_{source} = C_{source,N} \cdot I_{SS}$), the delay is still given by (13) with an overall NMOS capacitance per unit current equal to
$$ C_{MOSnet,N} = 2C_{drain,N} + \frac{C_{drain,N} + 2C_{source,N}}{A_V}\,(0.6 + 0.4\alpha). \qquad (16) $$
In regard to the D latch, whose worst-case delay is the clock-to-output delay occurring when input CLK switches, this gate differs from the MUX/XOR gate only in the source-coupled pair M5-M6, which stores the previous output value for CLK = 0 thanks to its positive-feedback connection. Thus, the capacitive contributions of the D latch are the same as those of the MUX/XOR gate, except for the additional capacitance $C_{input}$ in (15) seen from the gate of M5 (M6). As a consequence, the D latch clock-to-output delay is still given by (13a), with an overall NMOS capacitance equal to

$$
C_{MOSnet,N} = 2C_{drain,N} + \frac{C_{drain,N}+2C_{source,N}}{A_V}\,(0.6+0.4\alpha) + C_{input,N} \tag{17}
$$
where $C_{input,N}$ is obtained from the gate-source capacitance expression and (7):

$$
C_{input,N} = \frac{C_{input}}{I_{SS}} = \frac{2}{3}\,\frac{W}{I_{SS}}\,L\,C_{OX} \;\propto\; \frac{1}{K}\left(\frac{A_V}{V_{SWING}}\right)^{\alpha} L\,C_{OX} \tag{18}
$$
Observe that the generalization of (13) to arbitrary gates is easily justified by considering that in arbitrary MCML gates the parasitic capacitance $C_{RD}$ of the load resistance is always responsible for the delay term inversely proportional to $I_{SS}^2$, and the external load capacitance $C_L$ determines the term inversely proportional to $I_{SS}$. Analogously, the NMOS transistor capacitances have the same dependence on $I_{SS}$ and are responsible for the delay term independent of $I_{SS}$ in (13), which is given by the sum of the capacitances at the output node and the other capacitances multiplied by $(0.6+0.4\alpha)/A_V$.
IV-C. Simulation Results and Numerical Examples
The delay model was compared to Cadence Spectre simulations with $I_{SS}$ ranging widely from 1 µA to 100 µA and with each gate loaded by a number of equal gates (i.e., the fan-out FO) ranging from 0 to 4, using the 90-nm CMOS technology previously described. The delay obtained for FO equal to 0 and 4 is plotted in Figure 12 versus $I_{SS}$ on a logarithmic scale (because of the wide range of bias current considered), assuming $V_{SWING}$ equal to 700 mV and $A_V$ equal to 2.2. In the same figure, the predicted delay (13) with $C_L = C_{input} \cdot FO$ (with $C_{input}$ given by (18)) is plotted versus $I_{SS}$, where the numerical data reported in Table 2 were used.

Table 2.
Delay coefficients for MCML gates in a 90-nm technology (with $V_{SWING}$ = 700 mV, $A_V$ = 2.2).

  Coefficient                                      Value
  $C_{drain,N}$                                    1.38E-11 F/A
  $C_{input,N}$                                    1.67E-11 F/A
  $C_{R,unit}$                                     1.88E-20 F·A
  $C_{source,N} = C_{drain,N} + C_{input,N}$       3.05E-11 F/A
The error, which is plotted versus $I_{SS}$ in Figure 13, is always within 10% and is typically in the order of a few percent, with an average value of 4.7%. It is worth noting that the maximum error almost doubles (19%) when the zero effect $C_{gd,unit}(0.6+0.4\alpha)/A_V$ in (14) is neglected, thereby confirming that it is an increasingly important contribution in nanometer technologies.
In Figure 12, as expected, the delay does not depend on the fan-out for very low values of the bias current, since the dominant capacitive contribution is due to the parasitic capacitance associated with the load resistance. This confirms that the widely adopted assumption of an ideal load resistor is far from being realistic, since its parasitic capacitance in (10) must be accounted for. Similar curves are obtained for the other MCML gates considered; they are omitted for the sake of compactness, and the resulting numerical value of $C_{MOSnet,N}$ in (13) is reported in Table 3.
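As a hedged illustration of how (16) and (17) follow from the Table 2 coefficients, the short Python sketch below writes the overall NMOS capacitance by counting drain and source contributions. The helper name `c_mosnet_n` and its arguments are hypothetical. The value of $\alpha$ is not stated in this section, so `ALPHA = 1.44` is an assumption, chosen only because it makes the factor $(0.6+0.4\alpha)$ agree with the MUX/XOR and D latch entries of Table 3.

```python
# Table 2 coefficients (90-nm process, V_SWING = 700 mV, A_V = 2.2), in F/A.
C_DRAIN_N = 1.38e-11
C_INPUT_N = 1.67e-11
C_SOURCE_N = C_DRAIN_N + C_INPUT_N  # 3.05e-11 F/A, as listed in Table 2

A_V = 2.2
ALPHA = 1.44  # ASSUMPTION: not given in this section; picked to match Table 3


def c_mosnet_n(out_drains, inner_drains, inner_sources, extra=0.0,
               a_v=A_V, alpha=ALPHA):
    """Overall NMOS capacitance per unit current (F/A), written by inspection.

    out_drains    -- number of drain capacitances tied to the output node
    inner_drains  -- number of drain capacitances at internal nodes
    inner_sources -- number of source capacitances at internal nodes
    extra         -- additional per-unit-current capacitance at the output node
    """
    scale = (0.6 + 0.4 * alpha) / a_v  # weight of the internal-node capacitances
    inner = inner_drains * C_DRAIN_N + inner_sources * C_SOURCE_N
    return out_drains * C_DRAIN_N + inner * scale + extra


c_mux = c_mosnet_n(2, 1, 2)                     # equation (16)
c_latch = c_mosnet_n(2, 1, 2, extra=C_INPUT_N)  # equation (17)

print(f"MUX/XOR: {c_mux:.3e} F/A")
print(f"D latch: {c_latch:.3e} F/A")
```

Under this assumed $\alpha$, both results land within about one percent of the corresponding Table 3 entries.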
V. Power-Delay Trade-Offs and
Design Guidelines
From the general relationship (13), different power-delay trade-offs and several interesting properties of MCML gates can be derived, with the efficiency of the power-delay trade-off measured by the Power-Delay Product PDP (i.e., the product of $P = V_{DD} I_{SS}$ and (13)) [6]. In any MCML gate with assigned values of $V_{SWING}$ and $A_V$, three different regions can be identified when varying the power consumption; see Figure 14, which plots the trend of (13) versus $I_{SS}$:
1) LOW POWER REGION: for low values of $I_{SS}$ such that the term $C_{R,unit}/I_{SS}^2$ dominates over the other two in (13), the parasitic capacitance associated with the load resistance dominates over the others, thus $\tau_{PD}$ is inversely proportional to $I_{SS}^2$. Accordingly, PDP is inversely proportional to $I_{SS}$, i.e., it greatly increases when reducing the power consumption. Thus, in low-power designs, a power saving is achieved at the cost of a much greater speed penalty. Moreover, the delay (13) does not depend on the NMOS network, thus it is the same for all MCML gates, regardless of the implemented logic function.
Figure 10. Equivalent circuit of a MCML inverter.
Figure 11. Equivalent circuit of an MCML MUX gate.
Figure 12. Inverter delay versus $I_{SS}$ with a fan-out of 0 and 4 (horizontal axis: $I_{SS}$ in µA, logarithmic; vertical axis: $\tau_{PD}$ in ps, logarithmic; simulated and predicted curves).
2) POWER-EFFICIENT REGION: for moderate values of $I_{SS}$ such that the term $C_L/I_{SS}$ dominates over the other two, $\tau_{PD}$ is inversely proportional to $I_{SS}$, hence PDP is roughly constant. A power saving is achieved at the cost of an equal speed penalty. In this case, the delay mainly depends on the load.
3) INEFFICIENT DESIGN REGION: for high values of $I_{SS}$, the MCML delay can no longer be lowered despite a power increase, since it asymptotically tends to a minimum value (obtained from (13) with $I_{SS} \to \infty$) set by the NMOS capacitances: the gate tends to be self-loaded due to the large transistor size (7), which determines large NMOS capacitances. In this case, MCML gates are very inefficient in terms of the power-delay trade-off, and the delay is mainly determined by the considered gate through its NMOS network.
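The three regions above can be sketched numerically. Equation (13) is not reproduced in this section, so the Python example below assumes the form $\tau_{PD} = 0.35\,V_{SWING}\,(C_{MOSnet,N} + C_L/I_{SS} + C_{R,unit}/I_{SS}^2)$, which is consistent with the limiting expressions (22) and (24). The function names and the 5 fF load are hypothetical choices made only for illustration.

```python
def delay_terms(i_ss, c_mosnet_n, c_l, c_r_unit):
    """Three capacitive terms of the assumed form of (13), each in F/A."""
    return {
        "low-power": c_r_unit / i_ss**2,  # load-resistor parasitic capacitance
        "power-efficient": c_l / i_ss,    # external load capacitance
        "inefficient": c_mosnet_n,        # NMOS network (self-loading)
    }


def delay(i_ss, c_mosnet_n, c_l, c_r_unit, v_swing=0.7):
    """Propagation delay in seconds: 0.35 * V_SWING * (sum of the three terms)."""
    terms = delay_terms(i_ss, c_mosnet_n, c_l, c_r_unit)
    return 0.35 * v_swing * sum(terms.values())


def region(i_ss, c_mosnet_n, c_l, c_r_unit):
    """Name the operating region after the largest of the three terms."""
    terms = delay_terms(i_ss, c_mosnet_n, c_l, c_r_unit)
    return max(terms, key=terms.get)


C_R_UNIT = 1.88e-20  # F*A, Table 2
C_MUX = 6.76e-11     # F/A, Table 3 (MUX/XOR)
C_L = 5e-15          # F, hypothetical load chosen for illustration

for i_ss in (1e-6, 20e-6, 100e-6):
    tau_ps = delay(i_ss, C_MUX, C_L, C_R_UNIT) * 1e12
    print(f"I_SS = {i_ss * 1e6:5.1f} uA: {region(i_ss, C_MUX, C_L, C_R_UNIT)}, "
          f"tau_PD = {tau_ps:.1f} ps")
```

Classifying by the single largest term is a simplification: near the boundaries two terms are comparable, which is exactly where the design points discussed in the following subsections lie.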
It is worth noting that all MCML gates with the same $V_{SWING}$ and $A_V$ have the same power-delay interdependence, the only difference being the value of the constant term $C_{MOSnet,N}$ in (13). Hence more complex gates have a greater $C_{MOSnet,N}$ and thus a greater asymptotic minimum delay $\tau_{PD,min}$. Therefore the delay curves versus $I_{SS}$, which analytically describe the power-delay trade-off of two different MCML gates, differ only by an up/down shift equal to the difference between the two minimum delay values, as graphically reported in Figure 15.
According to the previous considerations on the power-delay trade-off, MCML gates will usually be designed in the power-efficient region, where power and speed performance are reasonably balanced, whereas the low-power region will be used only for non-critical paths. In the following subsections, simple design criteria will be derived from (13) in the three typical cases (power-efficient, high-speed and low-power design), and design considerations on the power supply voltage will be made.
V-A. Power-Efficient Design
To achieve an optimum power-delay balance, it is necessary to minimize the power-delay product $PDP = V_{DD} I_{SS} \tau_{PD}$ (with $\tau_{PD}$ given by (13)), which is done by setting its derivative with respect to $I_{SS}$ to zero and solving for $I_{SS}$. The obtained bias current which minimizes PDP is

$$
I_{SS,opt\_PDP} = \sqrt{\frac{C_{R,unit}}{C_{MOSnet,N}}}. \tag{19}
$$

Figure 13. Error of the delay model versus $I_{SS}$ for an inverter gate with a fan-out of 0 and 4 (horizontal axis: $I_{SS}$ in µA; vertical axis: error in %).
Figure 14. General delay dependence on the bias current (or equivalently the power consumption) in MCML gates. Three regions are marked: low-power (delay proportional to $1/I_{SS}^2$, gate-independent), power-efficient (delay proportional to $1/I_{SS}$, load-dependent) and inefficient design (constant, gate-dependent delay approaching $\tau_{PD,min}$).
First, (19) yields $C_{MOSnet,N} I_{SS,opt\_PDP} = C_{R,unit}/I_{SS,opt\_PDP}$, which means that a power-efficient design leads to equal capacitive contributions of the NMOS network and the load resistance, as reported in Figure 16. Moreover, the optimum bias current (19) is independent of the load, and the minimum power-delay product (obtained by substituting (19) into PDP) turns out to be

$$
PDP_{opt,MCML} \approx 0.35\,V_{DD}\,V_{SWING}\left( 2\sqrt{C_{MOSnet,N}\,C_{R,unit}} + C_L \right) \tag{20}
$$

from which a PDP increase (i.e., a worse power efficiency) is observed when increasing the load capacitance $C_L$, as well as $C_{R,unit}$ and $C_{MOSnet,N}$. Observe that $C_{R,unit}$ is proportional to $V_{SWING}$ (according to Appendix II) and does not depend on $A_V$, whereas the NMOS contribution $C_{MOSnet,N}$ is proportional to W, which in turn is proportional to $(A_V/V_{SWING})^{\alpha}$ according to (7); hence PDP in (20) is proportional to $V_{SWING}^{(3-\alpha)/2}$ and $A_V^{\alpha/2}$. As a general result, in MCML gates designed for power efficiency the logic swing and the voltage gain should be kept as low as possible within the range allowed by the noise margin requirement. These considerations are summarized in Figure 16, where it is considered that for $I_{SS} = I_{SS,opt\_PDP}$ the term proportional to $1/I_{SS}^2$ and the constant one are equal, thus this design point lies at the boundary of the low-power and the power-efficient regions.
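A quick numerical check of (19) and (20) can be sketched as follows. The example again assumes the form $\tau_{PD} = 0.35\,V_{SWING}\,(C_{MOSnet,N} + C_L/I_{SS} + C_{R,unit}/I_{SS}^2)$ for (13). The supply voltage `V_DD = 1.0` and the 2 fF load are assumptions introduced only for illustration, since neither is fixed by this subsection.

```python
import math

V_DD = 1.0             # V, ASSUMPTION: supply voltage is not given in this section
V_SWING = 0.7          # V
C_R_UNIT = 1.88e-20    # F*A, Table 2
C_MOSNET_N = 6.76e-11  # F/A, Table 3 (MUX/XOR)
C_L = 2e-15            # F, hypothetical load


def pdp(i_ss):
    """Power-delay product V_DD * I_SS * tau_PD, using the assumed form of (13)."""
    tau = 0.35 * V_SWING * (C_MOSNET_N + C_L / i_ss + C_R_UNIT / i_ss**2)
    return V_DD * i_ss * tau


# Equation (19): bias current that minimizes PDP (independent of C_L).
i_opt = math.sqrt(C_R_UNIT / C_MOSNET_N)

# Equation (20): closed-form minimum PDP.
pdp_closed = 0.35 * V_DD * V_SWING * (2 * math.sqrt(C_MOSNET_N * C_R_UNIT) + C_L)

print(f"I_SS,opt_PDP = {i_opt * 1e6:.1f} uA")
print(f"PDP at optimum = {pdp(i_opt):.3e} J, closed form (20) = {pdp_closed:.3e} J")
print(f"PDP at 0.5x and 2x the optimum: {pdp(0.5 * i_opt):.3e} J, {pdp(2 * i_opt):.3e} J")
```

At the optimum the NMOS term $C_{MOSnet,N} I_{SS}$ and the load-resistor term $C_{R,unit}/I_{SS}$ coincide, which is the equal-contribution property stated above.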
V-B. High-Speed Design
When high speed performance is the principal goal, two situations may occur. In the first one, a delay constraint $\tau_{PD}$ derived from considerations at the gate level has to be met by properly setting $I_{SS}$ to

$$
I_{SS} = 0.17\,\frac{V_{SWING}\,C_L}{\tau_{PD}-\tau_{PD,min}}\left[ 1 + \sqrt{1 + 11.4\,\frac{C_{R,unit}}{C_L^{2}}\,\frac{\tau_{PD}-\tau_{PD,min}}{V_{SWING}}} \right] \tag{21}
$$

that was obtained by solving (13) for $I_{SS}$ and substituting its asymptotic minimum expression

$$
\tau_{PD,min} = \lim_{I_{SS}\to\infty} \tau_{PD} = 0.35\,V_{SWING}\,C_{MOSnet,N} \tag{22}
$$
In the second case the speed potential must be exploited as much as possible, thus $\tau_{PD}$ has to be close to (22) while keeping $I_{SS}$ within reasonable values, i.e., $I_{SS}$ should only be increased as long as a significant speed improvement is achieved. To this aim, observe that for sufficiently high values of $I_{SS}$ such that $C_{MOSnet,N} > (C_{R,unit}/I_{SS}^2 + C_L/I_{SS})$, the constant term in (13) dominates over the other two (i.e., the gate is self-loaded), thus a high speed is achieved, but a further increase in the bias current does not lead to a significant speed advantage. In contrast, for lower values of $I_{SS}$ such that $C_{MOSnet,N} < (C_{R,unit}/I_{SS}^2 + C_L/I_{SS})$, the terms depending on $I_{SS}$ dominate over the constant one, thus a worse speed performance is achieved, but $\tau_{PD}$ is highly sensitive to a bias current increase. As a compromise, a reasonable choice of $I_{SS}$ is the intermediate case $C_{MOSnet,N} = (C_{R,unit}/I_{SS}^2 + C_L/I_{SS})$, which evidently makes $\tau_{PD}$ only twice the minimum achievable (i.e., $\tau_{PD} = 2\tau_{PD,min}$), as reported in Figure 16. Thus, the current $I_{SS,opt\_delay}$ required by this high-speed criterion is at the boundary of the power-efficient and the inefficient regions in Figure 16. Moreover, at this current $C_{MOSnet,N} I_{SS,opt\_delay}$ is equal to $(C_{R,unit}/I_{SS,opt\_delay} + C_L)$, thus under this design criterion the NMOS capacitance contribution equals the sum of $C_L$ and that of the load resistance.
The bias current $I_{SS,opt\_delay}$ is easily found from (21) by substituting $\tau_{PD} = 2\tau_{PD,min}$:

$$
I_{SS,opt\_delay} = 0.17\,\frac{V_{SWING}\,C_L}{\tau_{PD,min}}\left[ 1 + \sqrt{1 + 11.4\,\frac{C_{R,unit}}{C_L^{2}}\,\frac{\tau_{PD,min}}{V_{SWING}}} \right] \tag{23}
$$

which is easily found to be always greater than $I_{SS,opt\_PDP}$ in (19) (or equal to it, in the limit case $C_L = 0$). This means that a high speed is achieved at the cost of a worse power efficiency, when compared to the case discussed in the previous subsection.
By reiterating the reasoning in Subsection A, $C_{MOSnet,N}$ is proportional to $(A_V/V_{SWING})^{\alpha}$, thus the delay $\tau_{PD} = 2\tau_{PD,min}$ is proportional to $A_V^{\alpha}/V_{SWING}^{\alpha-1}$ from (22); therefore in high-speed designs the voltage gain should be kept low, whereas the logic swing should be set as high as possible, cf. Figure 16. Surprisingly, this is in contrast with the usual belief that the high-speed feature of MCML gates is due to the small logic swing [16], a belief that probably stems from a superficial extension of well-known properties of CML bipolar gates [2]. This consideration can be intuitively justified by observing that an increase in the logic swing reduces the transistor size (7) needed to achieve a given $A_V$, thereby reducing the NMOS capacitances, which are the dominant contribution in the high-speed region.
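The sizing rule behind (21) and (23) can be sketched in Python as follows, assuming once more that (13) has the form $\tau_{PD} = 0.35\,V_{SWING}\,(C_{MOSnet,N} + C_L/I_{SS} + C_{R,unit}/I_{SS}^2)$. The code solves the resulting quadratic exactly rather than using the rounded coefficients 0.17 and 11.4, so its results differ slightly from a hand evaluation of (21). The function names and the 2 fF load are hypothetical.

```python
import math

V_SWING = 0.7          # V
C_R_UNIT = 1.88e-20    # F*A, Table 2
C_MOSNET_N = 6.76e-11  # F/A, Table 3 (MUX/XOR)
C_L = 2e-15            # F, hypothetical load

TAU_MIN = 0.35 * V_SWING * C_MOSNET_N  # equation (22)


def tau_pd(i_ss):
    """Delay from the assumed form of (13)."""
    return 0.35 * V_SWING * (C_MOSNET_N + C_L / i_ss + C_R_UNIT / i_ss**2)


def i_ss_for_delay(tau_target):
    """Bias current meeting a delay target: positive root of the quadratic behind (21)."""
    if tau_target <= TAU_MIN:
        raise ValueError("target is below the asymptotic minimum delay (22)")
    # d = C_L / I_SS + C_R_UNIT / I_SS**2 must equal this value:
    d = (tau_target - TAU_MIN) / (0.35 * V_SWING)
    return (C_L + math.sqrt(C_L**2 + 4 * d * C_R_UNIT)) / (2 * d)


i_fast = i_ss_for_delay(2 * TAU_MIN)          # high-speed criterion, as in (23)
i_pdp = math.sqrt(C_R_UNIT / C_MOSNET_N)      # power-efficient criterion (19)

print(f"tau_PD,min       = {TAU_MIN * 1e12:.1f} ps")
print(f"I_SS,opt_delay   = {i_fast * 1e6:.1f} uA (delay {tau_pd(i_fast) * 1e12:.1f} ps)")
print(f"I_SS,opt_PDP     = {i_pdp * 1e6:.1f} uA (delay {tau_pd(i_pdp) * 1e12:.1f} ps)")
```

Tightening the target toward $\tau_{PD,min}$ makes the required current grow without bound, which is the self-loading behavior of the inefficient region.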
V-C. Low-Power Design
In low-power design, e.g., the design of non-critical paths, the power consumption allowed per gate is usually an assigned parameter that is derived from the requirements at the system level. Therefore, the only design parameter is the logic swing, whereas $I_{SS}$ is set to a very low value chosen from system considerations; thus the gate works in the low-power region, where the dominant term is $C_{R,unit}/I_{SS}^2$, and the delay is approximately

$$
\tau_{PD} \cong 0.35\,V_{SWING}\,\frac{C_{R,unit}}{I_{SS}^{2}} \tag{24}
$$

which shows that in low-power design the logic swing has to be set as low as possible, as in the case of power-efficient design, whereas the voltage gain does not affect the speed performance.
V-D. Remarks on the Power Supply
Voltage Sizing
Since the NMOS network in MCML gates consists of stacked source-coupled pairs associated with different levels, according to Figure 17, only the transistors at the first (upper) level can be directly driven by the output of an MCML gate, whereas the input voltages of transistor pairs at lower levels are progressively reduced through level shifter stages to ensure operation in the saturation region (for the reader interested in the design of level shifter stages, the subject is thoroughly addressed in [2]). Each level shifter stage is implemented with a common-drain stage as in Figure 17 [2], [18]. The minimum $V_{DD}$ is found by considering the input $v_{i,n}$ at the n-th (lowest) level in Figure 17. This input is set by the output voltage of the preceding gate and by the gate-source voltage drop $(n-1)V_{GS,shift}$ of the $(n-1)$ level shifters, and is therefore equal to $V_{DD} - (n-1)V_{GS,shift}$ in the case of a high input. According to Figure 17, this voltage must accommodate the gate-source voltage drop of the lowest transistor driven by $v_{i,n}$ and the minimum voltage drop across the bias current source $V_{ISS,min}$ (equal to a small $V_{DS,sat} \approx 100$ mV in the case of a simple current mirror implementation), thus

$$
V_{DD,min} = V_{GS} + (n-1)\,V_{GS,shift} + V_{ISS,min} \tag{25}
$$

Figure 15. Delay curves versus $I_{SS}$ for two different MCML gates with the same logic swing and load. The curve of the more complex gate (Gate 2) is shifted upward with respect to that of the simpler gate (Gate 1) by $\tau_{PD,min2} - \tau_{PD,min1}$.
Figure 16. Summary of design criteria of MCML gates. Low-power design (low $V_{SWING}$, any $A_V$): $C_{RD} \gg C_{MOSnet} + C_L$. Power-efficient design (low $V_{SWING}$, low $A_V$): $C_{MOSnet} = C_{RD}$ at $I_{SS,opt\_PDP}$. High-speed design (high $V_{SWING}$, low $A_V$): $C_{MOSnet} = C_{RD} + C_L$ at $I_{SS,opt\_delay}$, where $\tau_{PD} = 2\tau_{PD,min}$.
Equivalently, (25) sets the maximum number of levels n for a given $V_{DD}$ (typically 2–3 [2], [18]). The level shifter voltage drop $V_{GS,shift}$ is usually kept very close to the transistor threshold voltage $V_{TH}$ by setting the bias current to a rather low value. In regard to $V_{GS}$, it is the gate-source voltage of a transistor in the ON state, i.e., with a current $I_{SS}$, thus it is obtained by solving (2)

$$
V_{GS} = V_{TH} + \left( \frac{I_{SS}}{K\,W} \right)^{1/\alpha} \tag{26}
$$

where, once (7) is substituted, the second term is proportional to $V_{SWING}/A_V$. From (26), in order to reduce the supply voltage, the voltage swing should be kept as low as possible, and the voltage gain should not be too low.
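The headroom budget in (25) can be turned into a small calculator, sketched below in Python. The function names are hypothetical, and so is the value `V_GS = 0.55`, which is chosen only for illustration. `V_TH = 0.35` follows from the statement that the maximum swing $2V_{TH}$ equals 700 mV, and $V_{GS,shift}$ is taken equal to $V_{TH}$ as the text suggests.

```python
import math


def v_dd_min(n_levels, v_gs, v_gs_shift, v_iss_min):
    """Equation (25): minimum supply for n stacked source-coupled levels."""
    return v_gs + (n_levels - 1) * v_gs_shift + v_iss_min


def max_levels(v_dd, v_gs, v_gs_shift, v_iss_min):
    """Largest n with v_dd_min(n) <= v_dd (0 if even a single level does not fit)."""
    headroom = v_dd - v_gs - v_iss_min
    if headroom < 0:
        return 0
    return 1 + math.floor(headroom / v_gs_shift)


V_TH = 0.35        # V, since 2 * V_TH = 700 mV
V_GS_SHIFT = V_TH  # level-shifter drop kept close to the threshold voltage
V_ISS_MIN = 0.1    # V, V_DS,sat of a simple current mirror
V_GS = 0.55        # V, HYPOTHETICAL gate-source voltage of the ON transistor

for v_dd in (1.2, 1.8):
    n = max_levels(v_dd, V_GS, V_GS_SHIFT, V_ISS_MIN)
    print(f"V_DD = {v_dd} V: up to {n} levels "
          f"(needs {v_dd_min(n, V_GS, V_GS_SHIFT, V_ISS_MIN):.2f} V)")
```

Each additional level costs one $V_{GS,shift}$ of supply voltage, which is why lowering $V_{GS}$ through a smaller swing directly helps at low $V_{DD}$.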
VI. A Design Example
Let us apply the concepts presented so far to the carry logic of a Full Adder, which evaluates the carry output $C_{out} = A \cdot B + C_{in} \cdot (A \oplus B)$ as a function of the carry input $C_{in}$ and the two digit inputs A and B [6]. This block is of utmost importance in arithmetic blocks such as adders and multipliers, and its MCML topology is reported in Figure 18 [15]. Its worst-case delay corresponds to the case in which the maximum number of capacitances switch. From Figure 18, this occurs when the lowest-level input B switches and the resulting current is steered to the source-coupled pairs M5-M6 and M9-M10 (or equivalently to M3-M4 and M7-M8), which occurs when A = 1 and $C_{in}$ = 0 (or A = 0 and $C_{in}$ = 1). The current path that defines the worst-case delay is depicted with a dashed line in Figure 18.
The delay of the circuit in Figure 18 is given by (13), where $C_{MOSnet,N}$ is easily found by inspection of the worst-case current path. Indeed, the capacitance at the output node (due to the drain connection of transistors M8, M5 and M10) is equal to $3C_{drain}$, the capacitance at node X (due to the source contribution of M9-M10 and the drain contribution of M6) is equal to $2C_{source} + C_{drain}$, and the capacitance at node Y (due to the source contribution of M5-M6 and the drain contribution of M2) is likewise equal to $2C_{source} + C_{drain}$. As discussed in Section IV-B, by multiplying the capacitances not connected to the output node by $(0.6+0.4\alpha)/A_V$, $C_{MOSnet,N}$ is equal to

$$
C_{MOSnet,N} = 3C_{drain,N} + \frac{4C_{source,N}+2C_{drain,N}}{A_V}\,(0.6+0.4\alpha) \tag{27}
$$

whose numerical value, computed with the data of Table 2, is reported in Table 3.

Figure 17. Level shifter stages to interface MCML gates.
If a high speed is targeted, according to Figure 16 the logic swing must be set to the maximum value $2V_{TH}$ (as discussed in Section II-A), which from Table 1 is equal to 700 mV. Moreover, assuming that a noise margin of 160 mV is required, from (5) a voltage gain $A_V$ equal to 2.2 is needed. Assuming a load capacitance $C_L$ equal to 2 fF (i.e., the rather high input capacitance of a gate with $I_{SS}$ = 120 µA), the optimum bias current obtained from (23) is 22 µA, and the delay is equal to about 60 ps. If an optimum power-delay balance is desired, under the same noise margin specification, the optimum bias current (19) must be 12 µA, and the delay is 100 ps. In a more practical case where the Full Adder is loaded by an equal one (i.e., $C_L = C_{input,unit} I_{SS}$), the predicted and simulated delay are plotted versus $I_{SS}$ in Figure 19; the error is always lower than 13% and its average value is 5.5%. The results show that for a very high speed the optimum bias current is equal to 11.6 µA, giving a delay of about 67 ps.
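The first two design points of this example (2 fF load) can be retraced with the short Python sketch below. It relies on two assumptions that are not stated in this section: the form $\tau_{PD} = 0.35\,V_{SWING}\,(C_{MOSnet,N} + C_L/I_{SS} + C_{R,unit}/I_{SS}^2)$ for (13), and $\alpha = 1.44$, a value picked because it reproduces the Full Adder entry of Table 3. The high-speed current is computed from the exact quadratic rather than the rounded coefficients of (23), so it comes out slightly above the quoted 22 µA.

```python
import math

# Table 2 coefficients and operating point of the design example.
C_DRAIN_N = 1.38e-11                 # F/A
C_INPUT_N = 1.67e-11                 # F/A
C_SOURCE_N = C_DRAIN_N + C_INPUT_N   # F/A
C_R_UNIT = 1.88e-20                  # F*A
V_SWING, A_V = 0.7, 2.2
ALPHA = 1.44   # ASSUMPTION: not given here; chosen to reproduce Table 3
C_L = 2e-15    # F, load used in the text

# Equation (27): carry-logic NMOS capacitance per unit current.
scale = (0.6 + 0.4 * ALPHA) / A_V
c_fa = 3 * C_DRAIN_N + (4 * C_SOURCE_N + 2 * C_DRAIN_N) * scale


def tau_pd(i_ss):
    """Delay from the assumed form of (13)."""
    return 0.35 * V_SWING * (c_fa + C_L / i_ss + C_R_UNIT / i_ss**2)


# Power-efficient design point, equation (19).
i_pdp = math.sqrt(C_R_UNIT / c_fa)

# High-speed design point: root of c_fa = C_R_UNIT / I**2 + C_L / I, as in (23).
i_fast = (C_L + math.sqrt(C_L**2 + 4 * c_fa * C_R_UNIT)) / (2 * c_fa)

print(f"C_MOSnet,N       = {c_fa:.3e} F/A")
print(f"power-efficient  : {i_pdp * 1e6:.1f} uA, {tau_pd(i_pdp) * 1e12:.0f} ps")
print(f"high-speed       : {i_fast * 1e6:.1f} uA, {tau_pd(i_fast) * 1e12:.0f} ps")
```

Both results land close to the figures quoted in the text (about 12 µA and 100 ps, and about 22 µA and 60 ps), which supports the assumed form of (13) without proving it.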
Finally, let us observe that other input transitions lead to lower values of the delay, such as in the case of the carry input to carry output delay (with A = 0 and B = 1 or vice versa), which is particularly important when defining the speed performance of adder circuits [6]. This delay $\tau_{CARRY}$ is given by (13) with $C_{MOSnet,N}$ given by the contribution at the output node, $3C_{drain}$, because the other capacitances have already switched during the carry input transition. In a high-speed design under the above conditions, the obtained optimum current and delay are respectively 18 µA and 28 ps.
VIII. CONCLUSIONS
In this paper, an overview of techniques to manage the power-delay trade-off in nanometer MCML circuits has been presented. Compared to previous works, a strongly simplified and comprehensive approach was adopted, which also accounts for Deep-Submicron effects. As opposed to the previous works of the same authors, a physical resistance load was assumed, whose distributed parasitic capacitance was simply modeled as a lumped circuit. It was also shown that the usual assumption made in the previous papers of an ideal resistor (i.e., without parasitic capacitance) is strongly unrealistic, especially in low-power designs.
To better understand the design trade-offs, simple models of the noise margin and the delay have been discussed. Furthermore, a simple approach to write the delay by inspection of the gate topology was extrapolated by generalizing the results of a few gates. Interesting properties on trade-offs and the effect of scaling have been derived from these analytical models: for example, it is shown that the DSM effects are beneficial in terms of the noise margin. In particular, three design targets have been discussed (i.e., low-power, power-efficient and high-speed), and simple design criteria to size the bias current, the logic swing and the voltage gain have been found. These results, which are summarized in Figures 14 and 16, provide powerful information for decision making in the design process. Interestingly, it was shown that a high speed is achieved by increasing the logic swing, as opposed to the incorrect traditional belief that low logic swings make MCML circuits faster. The practical design of the carry logic of a Full Adder has been discussed, presenting numerical examples for a 90-nm CMOS process.

Figure 18. Topology of the carry logic in an MCML Full Adder, with worst-case current path in dashed line.

Table 3.
Overall NMOS capacitance per unit current of different MCML logic gates (with $V_{SWING}$ = 700 mV, $A_V$ = 2.2).

  Logic gate                    $C_{MOSnet,N}$ (F/A)
  Inverter                      1.48E-11
  MUX/XOR                       6.76E-11
  D latch                       8.44E-11
  Full Adder (carry logic)      1.21E-10
Several challenges must still be faced in the understanding of MCML circuits, which are a less mature approach than traditional CMOS logic. First, the understanding of the interdependence of design parameters and the design criteria derived here should be exploited to implement automated design flows that effectively optimize complex MCML circuits with a reasonable computational effort.
Secondly, although MCML circuits were shown to be less sensitive than traditional CMOS logic to the problems related to technology downscaling, further problems will arise due to the continuous reduction of the supply voltage. Indeed, the latter will increasingly limit the number of logic levels within a gate (according to (25)–(26)), and thus the complexity that can be implemented in a single gate. This will translate into a greater number of bias current sources (and thus a greater overall power consumption) and interconnects (which degrade the speed performance). To overcome this limit, novel circuit approaches will be needed, such as the low-voltage triple-tail cell approach that was previously adopted in bipolar integrated circuits [42]–[44]. Moreover, the logic swing reduction that will be forced by the supply voltage scaling will decrease the available noise margin, which will have to be recovered by increasing the voltage gain (according to (5)) by means of novel circuit techniques such as the introduction of positive feedback [45].
Third, efficient power-down techniques will be needed to reduce the static power consumption in MCML blocks that do not perform useful computations, while still keeping supply current variations within reasonable bounds, in order to maintain the advantages due to the almost constant supply current of MCML gates.
Figure 19. Carry logic delay versus $I_{SS}$ with a unity fan-out (horizontal axis: $I_{SS}$ in µA, logarithmic; vertical axis: delay in ps, logarithmic).
Figure 20. DC transfer characteristics of a MCML gate with a completely saturated carrier velocity (axes: $v_o$ versus $v_i$; output levels $\pm V_{SWING}/2$, input transition points $\pm V_{SWING}/(2A_V)$).
References
[1] M. Mizuno et al., "A GHz MOS adaptive pipeline techniques using MOS current-mode logic," IEEE Journal of Solid-State Circuits, vol. 31, no. 6, pp. 784–791, June 1996.
[2] M. Alioto and G. Palumbo, Model and Design of Bipolar and MOS Current-Mode Logic (CML, ECL and SCL Digital Circuits), Springer, 2005.
[3] M. Alioto and G. Palumbo, "Design strategies for source coupled logic gates," IEEE Trans. on CAS Part I, vol. 50, no. 5, pp. 640–654, May 2003.
[4] M. Alioto and G. Palumbo, "Power-delay optimization of D-Latch/MUX source coupled logic gates," International Journal of Circuit Theory and Applications, vol. 33, no. 1, pp. 65–86, Jan./Feb. 2005.
[5] M. Alioto and G. Palumbo, "Oscillation frequency in CML and ESCL ring oscillators," IEEE Trans. on CAS Part I, vol. 48, no. 2, pp. 210–214, Feb. 2001.
[6] J. Rabaey, A. Chandrakasan, and B. Nikolic, Digital Integrated Circuits (A Design Perspective), Prentice Hall, 2003.
[7] B. Razavi, "Prospect of CMOS technology for high-speed optical communication circuits," IEEE Jour. of Solid-State Circ., vol. 37, no. 9, pp. 1135–1145, Sep. 2002.
[8] B. Razavi (Ed.), Monolithic Phase-Locked Loops and Clock Recovery Circuits (Theory and Design), IEEE Press, 1996.
[9] C. Hung, B. Floyd, B. Park, and K. O, "Fully integrated 5.35-GHz CMOS VCOs and prescalers," IEEE Trans. on Microwave Theory and Techniques, vol. 49, no. 1, Jan. 2001.
[10] C. Lam and B. Razavi, "A 2.6-GHz/5.2-GHz Frequency Synthesizer in 0.4-µm CMOS Technology," IEEE Jour. of Solid-State Circ., vol. 35, no. 5, pp. 788–794, May 2000.
[11] H. Nosaka, K. Isshii, T. Enoki, and T. Shibata, "A 10-Gb/s data-pattern independent clock and data recovery with a two-mode phase comparator," IEEE Jour. of Solid-State Circuits, vol. 38, no. 2, pp. 192–197, Feb. 2003.
[12] S.-T. Yan and H. Luong, "A 3-V 1.3-to-1.8-GHz CMOS voltage-controlled oscillator with 0.3-ps Jitter," IEEE Trans. on Circuits and Systems Part II, vol. 45, no. 7, pp. 876–880, July 1998.
[13] B. Razavi, Design of Integrated Circuits for Optical Communications, McGraw-Hill, 2003.
[14] T.H. Lee, The Design of CMOS Radio Frequency Integrated Circuits, Cambridge University Press, 2nd edition, 2003.
[15] J. Musicer and J. Rabaey, "MOS current mode logic for low power, low noise CORDIC computation in mixed-signal environments," Proc. of ISLPED 2000, pp. 102–107, 2000.
[16] A. Tanabe, M. Umetani, I. Fujiwara, T. Ogura, K. Kataoka, M. Okiara, H. Sakuraba, T. Endoh, and F. Masuoka, "0.18-µm CMOS 10-Gb/s Multiplexer/Demultiplexer ICs Using Current Mode Logic with Tolerance to Threshold Voltage Fluctuation," IEEE J. of Solid-State Circuits, vol. 36, no. 6, June 2001.
[17] R. Senthinatan and J. Prince, "Application specific CMOS output driver circuit design techniques to reduce simultaneous switching noise," IEEE Jour. of Solid-State Circuits, vol. 28, no. 12, pp. 1383–1388, Dec. 1993.
[18] S. Maskai, S. Kiaei, and D. Allstot, "Synthesis techniques for CMOS folded source-coupled logic circuits," IEEE J. of Solid-State Circuits, vol. 27, no. 8, pp. 1157–1167, Aug. 1992.
[19] D. Allstot, S. Chee, S. Kiaei, and M. Shristawa, "Folded source-coupled logic vs. CMOS static logic for low-noise mixed-signal ICs," IEEE Trans. on CAS Part I, vol. 40, no. 9, pp. 553–563, Sep. 1993.
[20] S. Kiaei, S. Chee, and D. Allstot, "CMOS source-coupled logic for mixed-mode VLSI," Proc. Int. Symp. Circuits Systems, pp. 1608–1611, 1990.
[21] H. Ng and D. Allstot, "CMOS current steering logic for low-voltage mixed-signal integrated circuits," IEEE Trans. on VLSI Systems, vol. 5, no. 3, pp. 301–308, Sep. 1997.
[22] B. Stanistic, N. Verghese, R. Rutenbar, L. Carley, and D. Allstot, "Addressing substrate coupling in mixed-mode ICs: simulation and power distribution synthesis," IEEE Jour. of Solid-State Circuits, vol. 29, pp. 226–238, Mar. 1994.
[23] International Technology Roadmap for Semiconductors, Available: http://public.itrs.net.
[24] R. Singh (Ed.), Signal Integrity Effects in Custom IC and ASIC Design, IEEE Press, 2002.
[25] B. Del Signore, D. Kerth, N. Sooch, and E. Swanson, "A monolithic 20-b delta-sigma A/D converter," IEEE J. Solid-State Circuits, vol. 25, pp. 1311–1317, Dec. 1990.
[26] H. Leopold, G. Winkler, P. O'Leary, K. Ilzer, and J. Jernej, "A monolithic CMOS 20-b analog-to-digital converter," IEEE J. Solid-State Circuits, vol. 26, pp. 910–916, July 1991.
[27] I. Fujimori et al., "A 5-V single chip delta-sigma audio A/D converter with 111 dB dynamic range," IEEE J. of Solid-State Circuits, vol. 32, pp. 329–336, Mar. 1997.
[28] S. Jantzi, K. Martin, and A. Sedra, "Quadrature bandpass ΔΣ modulator for digital radio," IEEE J. Solid-State Circuits, vol. 32, pp. 1935–1949, 1997.
[29] B. Kup, E. Dijkmans, P. Naus, and J. Sneep, "A bit-stream digital-to-analog converter with 18-b resolution," IEEE J. Solid-State Circuits, vol. 26, pp. 1757–1763, Dec. 1991.
[30] J. Kundan and S. Hasan, "Enhanced folded source-coupled logic technique for low-voltage mixed-signal integrated circuits," IEEE Trans. on CAS Part II, vol. 47, no. 8, pp. 810–817, Aug. 2000.
[31] H. Lee, D. Hodges, and P. Gray, "A self-calibrating 15-bit CMOS A/D converter," IEEE Jour. of Solid-State Circuits, vol. 19, pp. 813–819, Dec. 1984.
[32] D. Su, M. Loinaz, S. Masui, and B. Wooley, "Experimental results and modeling techniques for substrate noise in mixed-signal integrated circuits," IEEE Jour. of Solid-State Circuits, vol. 28, pp. 420–430, Apr. 1993.
[33] K. Bernstein et al., High Speed CMOS Design Styles, Kluwer Academic Publishers, 1999.
[34] S. Bruma, "Impact of on-chip process variations on MCML performance," Proc. IEEE International Systems-on-Chip Conference (SOCC'03), pp. 135–140, 2003.
[35] B.P. Wong, A. Mittal, U. Cao, and G. Starr, Nano-CMOS Circuit and Physical Design, John Wiley & Sons, 2005.
[36] D. Chinnery and K. Keutzer, Closing the Gap between ASIC & Custom, Kluwer Academic Publishers, 2002.
[37] T. Sakurai and A.R. Newton, "Alpha-Power law MOSFET model and its applications to CMOS inverter delay and other formulas," IEEE Jour. of Solid-State Circuits, vol. 25, no. 2, pp. 584–594, Apr. 1990.
[38] M. Alioto, G. Palumbo, and S. Pennisi, "Modeling of Source Coupled Logic Gates," International Journal of Circuit Theory and Applications, vol. 30, no. 4, pp. 459–477, 2002.
[39] W. Elmore, "The transient response of damped linear networks," J. Appl. Phys., vol. 19, pp. 55–63, Jan. 1948.
[40] B. Cochrun and A. Grabel, "A Method for the Determination of the Transfer Function of Electronic Circuits," IEEE Trans. on Circuit Theory, vol. CT-20, no. 1, pp. 16–20, Jan. 1973.
[41] G. Palumbo and S. Pennisi, Feedback Amplifiers: Theory and Design, Kluwer Academic Publishers, 2002.
[42] B. Razavi, Y. Ota, and R. Swartz, "Design techniques for low-voltage high speed digital bipolar circuits," IEEE Jour. of Solid-State Circ., vol. 29, no. 2, pp. 332–339, Mar. 1994.
[43] G. Schuppener, C. Pala, and M. Mokhtari, "Investigation on low-voltage low-power silicon bipolar design topology for high-speed digital circuits," IEEE Jour. of Solid-State Circ., vol. 35, no. 7, pp. 1051–1054, July 2000.
[44] M. Alioto, R. Mita, and G. Palumbo, "Performance evaluation of the low-voltage CML D-Latch topology," Integration, the VLSI Journal, Special Issue in Analog and Mixed-Signal IC Design and Design Methodologies (edited by Francisco V. Fernandez), vol. 36, no. 4, pp. 191–209, Nov. 2003.
[45] M. Alioto, L. Pancioni, S. Rocchi, and V. Vignoli, "Modeling and Evaluation of Positive-Feedback Source-Coupled Logic," IEEE Trans. on CAS Part I, vol. 51, no. 12, pp. 2345–2355, Dec. 2004.
[46] P. R. Gray and R. G. Meyer, Analysis and Design of Analog Integrated Circuits, John Wiley & Sons, 1977.
[47] J. L. Wyatt, Jr., "Signal propagation delay in RC models for interconnect," Circuit Analysis, Simulation and Design, Part II: VLSI Circuit Analysis and Simulation, A. Ruehli (Ed.), vol. 3 in the series Advances in CAD for VLSI, North-Holland, 1987.
[48] M. Alioto, G. Palumbo, and M. Poli, "Evaluation of energy consumption in RC ladder circuits driven by a ramp input," IEEE Trans. on VLSI Systems, vol. 12, no. 10, pp. 1094–1107, Oct. 2004.
Massimo Alioto (M'01) was born in Brescia, Italy, in 1972. He received the laurea degree in Electronics Engineering and the Ph.D. degree in Electrical Engineering from the University of Catania (Italy) in 1997 and 2001, respectively. In 2002, he joined the Engineering faculty of the University of Siena as a Research Associate and in the same year as an Assistant Professor. In 2006, he became Associate Professor in the same faculty.
Since 2001 he has been teaching undergraduate and graduate courses on basic electronics, microelectronics and advanced VLSI digital design. He has authored or co-authored over 80 journal and conference papers. He is co-author of the book Model and Design of Bipolar and MOS Current-Mode Logic: CML, ECL and SCL Digital Circuits (Springer, 2005). His primary research interests include: modeling and optimized design of CMOS high-performance digital circuits in terms of high speed or low power dissipation, transistor- and gate-level design of arithmetic circuits, design of circuits for cryptographic applications (e.g., random number generators, circuits resistant to Differential Power Analysis), and design for variability. His research was previously also focused on the modeling and the design of bipolar CML/ECL circuits, as well as adiabatic logic.
Gaetano Palumbo was born in Catania, Italy, in 1964. He received the
laurea degree in Electrical Engineering in 1988 and the Ph.D. degree
from the University of Catania in 1993. Since 1993 he has conducted
courses on Electronic Devices, Electronics for Digital Systems and
basic Electronics. In 1994 he joined the DEES (Dipartimento Elettrico
Elettronico e Sistemistico), now DIEES (Dipartimento di Ingegneria
Elettrica Elettronica e dei Sistemi), at the University of Catania as a
researcher, subsequently becoming associate professor in 1998. Since
2000 he has been a full professor in the same department.
His primary research interest has been analog circuits, with particular
emphasis on feedback circuits, compensation techniques, the
current-mode approach, and low-voltage circuits. His research has since
also embraced digital circuits, with emphasis on bipolar and MOS
current-mode digital circuits, adiabatic circuits, and high-performance
building blocks focused on achieving optimum speed within the
constraint of low-power operation. In all these fields he is developing
some of the research activities in collaboration with
STMicroelectronics of Catania.
He was the co-author of three books, CMOS Current Amplifiers, Feedback
Amplifiers: theory and design, and Model and Design of Bipolar and MOS
Current-Mode Logic (CML, ECL and SCL Digital Circuits), all by Kluwer
Academic Publishers, in 1999, 2001 and 2005, respectively, and a
textbook on electronic devices in 2005. He is a contributor to the
Wiley Encyclopedia of Electrical and Electronics Engineering. He is the
author of almost 300 scientific papers in refereed international
journals (over 110) and in conferences. Moreover, he is co-author of
several patents.
From June 1999 to the end of 2001 and from 2004 to 2005 he served as an
Associate Editor of the IEEE Transactions on Circuits and Systems part
I for the topics "Analog Circuits and Filters" and "Digital Circuits
and Systems," respectively. Since 2006 he has been serving as an
Associate Editor of the IEEE Transactions on Circuits and Systems part
II.
In 2005 he was a panelist in the scientific-disciplinary area 09
(industrial and information engineering) of the CIVR (Committee for
Evaluation of Italian Research), which aims to evaluate Italian
research in the above area for the period 2001-2003.
In 2003 he received the Darlington Award. Prof. Palumbo is an IEEE
Senior Member.
