
ASIC Implementation of Pipelined ALU

Omkar Dave, Deepak Singh Yadav, Jay Kothari, Jayakrishnan.P
VLSI Department, VIT University Vellore, Tamil Nadu
dave.omkar@hotmail.com, deepaksingh.yadav123@gmail.com, jaykothari24@yahoo.com

Abstract: This paper presents the design of a 4-bit pipelined arithmetic logic unit (ALU). The novelty of the pipelined ALU is that it achieves higher performance through pipelining than a non-pipelined ALU. Pipelining is a technique in which the executions of multiple instructions are overlapped. The pipeline modules are independent of each other. All the modules in the ALU design are realized in Verilog HDL. Design functionality is validated through simulation and compilation. Test vectors are created to verify the outputs against the calculated results. Design simulation is done with the ModelSim simulator, and RTL synthesis is done with the RTL Compiler tool. Physical design of this architecture is done with the Cadence Encounter tool in 180 nm technology.

Keywords: pipelined ALU, instruction memory, data memory, operand generator.


INTRODUCTION

Pipelining is an important methodology to speed up a design. It allows many operations to occur in parallel. The ALU is a fundamental block of the central processing unit of a computer that performs integer arithmetic operations like addition, subtraction, multiplication, and division, and logical operations like AND, OR, NOT, XOR, and shifting. To execute a single instruction, the ALU takes N clock cycles. So to execute M instructions it will take (M*N) clock cycles.

To decrease the number of clock cycles, a pipelined ALU is proposed. Generally, in pipelining, if N instructions are to be executed and the total number of stages required to execute a particular instruction is M, then the total number of clock cycles required to execute the instructions is given by (N+M-1), which is fewer than the simpler non-pipelined ALU needs.

CONVENTIONAL ALU
In a conventional or non-pipelined ALU, one instruction is executed in one clock cycle or in multiple clock cycles, for a single-clock ALU or a multi-clock ALU respectively. Execution of a whole instruction can be divided into Instruction Fetch (IF), Decode, Execution, Memory operation, and Write-Back operation. In a multi-clock ALU, in any clock cycle any one of these operations is in progress. In a non-pipelined ALU, loading of the next instruction starts only after the previous instruction has been executed. Because of this, throughput is low. To achieve higher throughput, the operations of the next instruction can be pipelined into the same clock cycle, as shown in Fig. 1 [4].

Fig. 1 Stages in Pipelining [4]


NON-PIPELINED ALU

Short for Arithmetic Logic Unit, the ALU is one of the many components within a computer processor. The ALU performs mathematical, logical, and decision operations in a computer and is the final processing performed by the processor. After the information has been processed by the ALU, it is sent to the computer memory. In some computer processors, the ALU is divided into two distinct parts, the AU and the LU. The AU performs the arithmetic operations and the LU performs the logical operations.


Fig. 2 Arithmetic and Logic Unit schematic symbol [4]
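
The cycle counts quoted in the Introduction, (M*N) for a non-pipelined ALU and (N+M-1) for a pipelined one, can be checked with a small sketch. The function names are illustrative and not from the paper, and the shift-register model of the stages is a simplifying assumption (no stalls or hazards).

```python
def non_pipelined_cycles(num_instructions: int, cycles_per_instruction: int) -> int:
    """Each instruction finishes before the next one starts."""
    return num_instructions * cycles_per_instruction


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """The first instruction needs num_stages cycles; each later one adds one."""
    if num_instructions == 0:
        return 0
    return num_instructions + num_stages - 1


def simulate_pipeline(num_instructions: int, num_stages: int) -> int:
    """Count cycles by shifting instructions through the stage registers."""
    stages = [None] * num_stages      # index 0 is the fetch stage
    next_instr = 0
    retired = 0
    cycles = 0
    while retired < num_instructions:
        # On each clock edge a new instruction enters and the rest move on.
        issued = next_instr if next_instr < num_instructions else None
        stages = [issued] + stages[:-1]
        if issued is not None:
            next_instr += 1
        cycles += 1
        # An instruction sitting in the last stage completes in this cycle.
        if stages[-1] is not None:
            retired += 1
    return cycles
```

For example, with 8 instructions and 4 stages the non-pipelined count is 32 cycles, while both the closed-form expression and the simulation give 11 cycles.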

978-1-4673-6126-2/13/$31.00 © 2013 IEEE


OUTPUT WAVEFORM


Fig. 3 Waveform for Non-pipelined ALU



PROPOSED PIPELINED ALU

Architecture:
In this paper we have implemented a 4-bit pipelined ALU in the following way.

Pipelining generally consists of 5 stages: Instruction Fetch, Instruction Decode, Execution, Memory, and Write back. Each of these units takes a single clock cycle to perform its specific task [6]. In the pipelining process, if the first instruction is fetched in the first clock cycle, the next instruction is fetched in the second clock cycle. Generally, pipelining consumes [(N-1)+K] clock cycles, where N is the number of instructions to be carried out and K is the number of stages involved in the pipelining procedure. Here, the number of stages is 4.

In this paper, we mainly focused on the Instruction Memory, Data Memory, Operand Generator [5], and Multiplexer blocks. The block diagram of the pipelined architecture is given in Fig. 4.

WORKING
The Instruction Memory fetches the 16-bit instruction stored in the memory at the corresponding address [7]. A program counter is used to increment the address pointer from 00000 to 11111, which is a 5-bit address. This 16-bit instruction is fed to the Data Memory. So, the Instruction Fetch unit consists of the Instruction Memory block [1].
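
A minimal behavioral sketch of the Instruction Fetch unit described above: a 5-bit program counter stepping through a 32-entry memory of 16-bit instructions. The class name, the zero filler for unused slots, and the wrap-around after address 11111 are assumptions made for illustration; the paper does not specify them.

```python
ADDR_BITS = 5            # program counter width stated in the text
INSTR_BITS = 16          # instruction width stated in the text
DEPTH = 1 << ADDR_BITS   # 32 instruction slots, addresses 00000 to 11111


class InstructionFetch:
    """Program counter plus instruction memory (behavioral model)."""

    def __init__(self, program):
        if len(program) > DEPTH:
            raise ValueError("program does not fit in instruction memory")
        mask = (1 << INSTR_BITS) - 1
        # Unused slots hold 0; that filler value is an assumption.
        self.memory = [word & mask for word in program]
        self.memory += [0] * (DEPTH - len(program))
        self.pc = 0

    def clock(self):
        """Return the instruction at the current address, then advance the PC."""
        instruction = self.memory[self.pc]
        self.pc = (self.pc + 1) % DEPTH   # wrap-around after 11111 is assumed
        return instruction
```

Calling `clock()` once per cycle yields one instruction per cycle, which is what allows the later units to work on earlier instructions in parallel.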
Fig. 4 Proposed block diagram of pipelined ALU
[Block diagram; labeled blocks include Instruction Memory, Data Memory, Operand Generator, and the D flip-flop stages Instruction Delay1 and Instruction Delay2.]

The Data Memory and Operand Generator blocks are included in the second unit, which is Instruction Decode. The Data Memory is a dual-port-read, single-port-write memory, which has two addresses, 4-bit input data, and two output data words D1 and D2 of 4 bits each. The address for the data memory is taken from the LSB 8 bits of the 16-bit output instruction. The Operand Generator generates two operands according to the control signal opcode, which is the MSB 4 bits of the instruction. The operands can be 0, data D1, data D2, or data consisting of the LSBs of the instruction.

In the Instruction Decode unit, various arithmetic operations take place. It includes ten operations, which are addition, subtraction, left shift, right shift, logical AND, logical OR, logical EX-OR, invert, and comparison of two variables [2]. It is also possible to include more operators, but for simplicity we consider only these instructions. The results of these arithmetic operations are fed to the multiplexer unit. The operation is selected by the first four MSBs of the instruction.

The third unit is the execution unit, which is used for getting the ALU output through the Multiplexer. This ALU output is clocked through a D flip-flop, and the clocked output is stored in the Data Memory.

The last unit is the write back unit. It stores the ALU output to the data memory in each clock cycle at the corresponding 4-bit address.
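
The decode, execute, and write back behavior described above can be sketched as a small behavioral model. Several details here are assumptions, not taken from the paper: the opcode numbering, the use of bits 11:8 as the write address (suggested by the block diagram labels only), the one-bit shift amount, the rule for which operations take a single operand, and the hypothetical `LDI` immediate-load operation used to show the "LSB of the instruction" operand.

```python
MASK4 = 0xF  # all data words are 4 bits wide

# Hypothetical opcode numbering; the paper does not list the encoding.
ADD, SUB, SHL, SHR, AND, OR, XOR, NOT, GT, LT, LDI = range(11)


def decode_fields(instruction):
    """Split a 16-bit instruction into (opcode, write_addr, addr2, addr1)."""
    return ((instruction >> 12) & MASK4, (instruction >> 8) & MASK4,
            (instruction >> 4) & MASK4, instruction & MASK4)


def operand_generator(opcode, d1, d2, instruction):
    """Pick the two operands from 0, D1, D2 or the instruction's low bits."""
    if opcode == LDI:                   # assumed immediate-load form
        return instruction & MASK4, 0
    if opcode in (NOT, SHL, SHR):       # assumed single-operand forms
        return d1, 0
    return d1, d2


def alu(opcode, a, b):
    """Compute every operation, then select one, like a multiplexer."""
    results = {
        ADD: a + b, SUB: a - b, SHL: a << 1, SHR: a >> 1,
        AND: a & b, OR: a | b, XOR: a ^ b, NOT: ~a,
        GT: int(a > b), LT: int(a < b), LDI: a,
    }
    return results[opcode] & MASK4      # truncate the result to 4 bits


def step(data_memory, instruction):
    """Decode, execute and write back one instruction (no pipelining here)."""
    opcode, write_addr, addr2, addr1 = decode_fields(instruction)
    d1, d2 = data_memory[addr1], data_memory[addr2]   # dual-port read
    op1, op2 = operand_generator(opcode, d1, d2, instruction)
    alu_out = alu(opcode, op1, op2)
    data_memory[write_addr] = alu_out                 # single-port write
    return alu_out
```

With a 16-entry data memory holding 9 at address 1 and 5 at address 2, the instruction `(ADD << 12) | (3 << 8) | (2 << 4) | 1` writes 14 to address 3. The model processes one instruction at a time; in the pipelined design the same steps are separated by D flip-flops so that they overlap across instructions.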
192 2013 International Conference on Green Computing, Communication and Conservation of Energy (ICGCE)


Here, for pipelining, we use a D flip-flop between each unit. Because of the D flip-flops the critical path is reduced and the ALU operation becomes faster, which improves the system performance. But due to the use of D flip-flops, the area and power requirement of the design are increased.

RESULTS AND DISCUSSION

From the comparison shown in Table 1, it is evident that the area of the pipelined ALU is increased by around 18.25% compared with the conventional ALU. Power is also increased by 54.08% for the pipelined ALU compared to the conventional ALU. More important, however, is the reduction in the critical path of the pipelined ALU. For the pipelined ALU the reduction in delay is around 22% compared to the conventional ALU (1662 ps down to 1292 ps). When timing is a constraint, it is better to go with the pipelined ALU.

Simulation of this architecture is done with the ModelSim simulator, and the corresponding output waveform is given in Fig. 5. The RTL code of this architecture has been synthesized using the Cadence RTL Compiler tool. The RTL schematic and the values for power, area, and delay of this architecture are given in Fig. 6 and Table 1 respectively, which are obtained from RTL Compiler. Physical design is carried out in 180 nm technology using the Cadence Encounter tool. The chip layout of this architecture is given in Fig. 7. The synthesized result of the pipelined ALU, implemented in RTL Compiler, is shown in Fig. 6.


Table 1. Comparison between conventional ALU and pipelined ALU

Parameters     Pipelined ALU     Non-pipelined ALU
Area (µm²)     19845             16782
Power (mW)     8.467             5.495
Delay (ps)     1292              1662
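
The percentages discussed in this section can be recomputed directly from the Table 1 values. The sketch below is only an arithmetic check; the function and dictionary names are illustrative.

```python
def percent_change(pipelined, non_pipelined):
    """Change of the pipelined value relative to the non-pipelined one, in %."""
    return (pipelined - non_pipelined) / non_pipelined * 100.0


# Values copied from Table 1 as (pipelined, non-pipelined).
TABLE_1 = {
    "area_um2": (19845, 16782),
    "power_mW": (8.467, 5.495),
    "delay_ps": (1292, 1662),
}

changes = {name: percent_change(p, n) for name, (p, n) in TABLE_1.items()}
for name, value in changes.items():
    print(f"{name}: {value:+.2f}%")
```

For the Table 1 values this gives roughly +18.25% area, +54.09% power (54.08% if truncated rather than rounded), and -22.26% delay.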
OUTPUT WAVEFORM

Fig. 5 Waveform for pipelined ALU



RTL SCHEMATIC VIEW


Fig. 6 Synthesized result of pipelined ALU

PHYSICAL IMPLEMENTATION
The design of the pipelined ALU is laid out in a 180 nm CMOS process using the Cadence SOC Encounter tool.


Fig.7 Layout for pipelined ALU

CONCLUSIONS
In this paper, we have implemented a 4-bit pipelined ALU in 180 nm technology and compared the performance of the pipelined ALU with a conventional ALU. By simulating with various test vectors, the proposed pipelined ALU design in Verilog HDL is successfully designed, implemented, synthesized, and tested.

REFERENCES
[1] An article on Digital System Design Lab 2: Basic building blocks of a MIPS CPU (II), 16-bit ALU design.

[2] Beom Seon Ryu, Jung Sok Yi, Kie Young Lee, Tae Won Cho, "A design of low power 16-B ALU," 1999 IEEE TENCON.

[3] Beom Seon Ryu, Jung Sok Yi, Kie Young Lee, Tae Won Cho, "Multi-level approaches to low power 16 bits ALU design," 0-7803-5728-0/99, 1999 IEEE.

[4] A lecture pdf on Instruction pipelining.

[5] Junkai Sun, Anping Jiang, "The Power Dissipation Comparison of Different ALU Architectures," 2010 International Conference on Mechanical and Electrical Technology (ICMET 2010).

[6] An article on Basic pipelining by B. Ramamurthy, CS506.

[7] An article on Enhancing performance with pipelining.
