Datapath
Control
EECC550 - Shaaban
#1 Lec # 5 Winter 2012 12-18-2012
[Figure: single-cycle MIPS datapath with main control and ALU control. Instruction fields: Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>. Control signals: RegDst, RegWr, ExtOp, ALUSrc, ALUop (2 bits), MemWr, MemtoReg, Branch, PCSrc.]
T = I x CPI x C
[Figure: single-cycle MIPS datapath, textbook version. Instruction fields: opcode [31-26], rs [25-21], rt [20-16], rd [15-11], imm16 [15-0], function field [5-0]. Control signals: RegDst, Jump, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite.]
In this book version, ORI is not supported, so no zero extension of the immediate is needed.
ALUOp (2-bits)
00 = add
01 = subtract
10 = R-Type
CPI = 1
Cycle time for load is longer than needed for all other instructions.
[Figure: single-cycle instruction processing stages: Instruction Fetch, Register Fetch, Ext, ALU, Mem Access (Data Mem), Result Store (Reg. Wrt), plus Next PC logic with inputs Equal, Branch, Jump. Each stage is annotated with its delay (1 ns or 2 ns). Control signals: RegDst, RegWr, MemRd, MemWr, ExtOp, ALUSrc, ALUctr, derived from the op and fun fields.]
[Figure: single-cycle timing diagrams of the paths through PC, Inst Memory, Reg File, ALU (cmp for Branch), Data Mem, mux, and setup for several instruction classes, including Store and Branch, with the critical path marked against Clk.]
Critical path: the slowest path between any two storage devices (LW in this case).
Clock cycle time is a function of the critical path, and must be greater than:
Clock-to-Q + Longest Delay Path through the Combinational Logic + Setup + Clock Skew
Assuming the following datapath/control hardware component delays:
Memory Units: 2 ns
ALU and adders: 2 ns
Register File: 1 ns
Control Unit: < 1 ns
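The critical-path argument can be sketched numerically. This is a minimal sketch: the component delays are the ones listed above, but the unit sequences in `PATHS` are an assumption about which major units each instruction class uses, and mux, setup, and control delays are ignored.

```python
# Component delays in ns, taken from the list above.
DELAY_NS = {"mem": 2, "alu": 2, "regfile": 1}

# Assumed sequence of major units used by each instruction class in the
# single-cycle datapath (illustrative; mux/setup/control delays ignored).
PATHS = {
    "load":   ["mem", "regfile", "alu", "mem", "regfile"],
    "store":  ["mem", "regfile", "alu", "mem"],
    "r_type": ["mem", "regfile", "alu", "regfile"],
    "branch": ["mem", "regfile", "alu"],
}


def path_delay(units):
    """Sum the delays of the units along one path, in ns."""
    return sum(DELAY_NS[u] for u in units)


def critical_path(paths):
    """Return (name, delay_ns) of the slowest path."""
    name = max(paths, key=lambda k: path_delay(paths[k]))
    return name, path_delay(paths[name])


if __name__ == "__main__":
    for name, units in PATHS.items():
        print(f"{name}: {path_delay(units)} ns")
    print("critical path:", critical_path(PATHS))
```

Under these assumptions the load path is the slowest, which is why the single-cycle clock has to be long enough for LW even though every other class finishes earlier.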
[Figure: one long cycle (e.g. CPI = 1) through a single block of acyclic combinational logic between two storage elements, versus two shorter cycles (e.g. CPI = 2) through Acyclic Combinational Logic (A) in Cycle 1 and Acyclic Combinational Logic (B) in Cycle 2, with a storage element (register or memory) placed between them.]
Place registers to:
Get a balanced clock cycle length
Save any results needed for the remaining cycles
Instruction processing steps:
Instruction Fetch: Instruction ← Mem[PC]
Next Instruction: PC ← PC + 4 (update program counter to address of next instruction)
Instruction Decode: done by the Control Unit
Execute
Result Store
Instruction fetch and the program counter update are common steps for all instructions.
T = I x CPI x C
[Figure: instruction processing partitioned into cycles: 1. Instruction Fetch (IF), 2. Instruction Decode (ID), 3. Execution (EX), 4. Data Memory Access (MEM), 5. Write Back (WB). Each cycle is annotated with a delay of 1 ns or 2 ns; control signals shown: RegDst, RegWr, MemRd, MemWr, ExtOp, ALUSrc, ALUctr.]
Thus: C = 2 ns, f = 500 MHz
[Figure: multicycle datapath with registers added between stages: 1. Instruction Fetch (IF), 2 ns; 2. Instruction Decode (ID), 1 ns; 3. Execution (EX), 2 ns; 4. Memory (MEM), 2 ns; 5. Write Back (WB), 1 ns. Control signals shown: RegDst, RegWr, MemRd, MemWr, MemToReg, ExtOp, ALUSrc, ALUctr; Next PC logic with inputs Equal, Branch, Jump.]
Registers added (all clock-edge triggered; register write enable control lines not shown):
IR: Instruction register
A, B: Two registers to hold operands read from the register file, i.e. R[rs], R[rt]
R: or ALUOut, holds the output of the main ALU (the ALU result)
M: or Memory data register (MDR), holds data read from data memory
CPU Clock Cycle Time: Worst cycle delay = C = 2 ns
Assuming the following datapath/control hardware component delays:
Memory Units: 2 ns
ALU and adders: 2 ns
Register File: 1 ns
Control Unit: < 1 ns
Multicycle register transfer operations per instruction:
IF (all instructions): IR ← Mem[PC]
ID (all instructions): A ← R[rs], B ← R[rt]
EX:
R-Type: R ← A funct B
ORI: R ← A OR ZeroExt[imm16]
Load: R ← A + SignEx(Im16)
Store: R ← A + SignEx(Im16)
Branch: Zero ← A - B; if Zero = 1: PC ← PC + 4 + (SignExt(imm16) x 4), else (i.e. Zero = 0): PC ← PC + 4
MEM:
Load: M ← Mem[R]
Store: Mem[R] ← B, PC ← PC + 4
WB:
R-Type: R[rd] ← R, PC ← PC + 4
ORI: R[rt] ← R, PC ← PC + 4
Load: R[rt] ← M, PC ← PC + 4
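Load: IF, ID, EX, MEM, WB (CPI = 5)
C = 2 ns, f = 500 MHz
[Figure: single-cycle versus multicycle timing for Load, Store, and R-type instructions. Single-cycle: each instruction takes one 8 ns cycle (Cycle 1, Cycle 2), with wasted time in the Store cycle. Multicycle: each instruction proceeds through its own IF, ID, EX, MEM, WB cycles.]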
1 CPI 5
Single-Cycle CPU:
CPI = 1, C = 8 ns, f = 125 MHz
One million instructions take:
I x CPI x C = 10^6 x 1 x 8 x 10^-9 s = 8 msec
T = I x CPI x C
Assuming the following datapath/control hardware component delays:
Memory Units: 2 ns
ALU and adders: 2 ns
Register File: 1 ns
Control Unit: < 1 ns
Multi-Cycle CPU:
CPI = 3 to 5, C = 2 ns, f = 500 MHz
One million instructions take from
10^6 x 3 x 2 x 10^-9 s = 6 msec
to 10^6 x 5 x 2 x 10^-9 s = 10 msec
depending on the instruction mix used.
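The comparison above can be reproduced with a few lines of arithmetic. This is a minimal sketch of T = I x CPI x C; the function names are made up for illustration, and integer nanoseconds are used to avoid floating-point noise.

```python
def exec_time_ms(instr_count, cpi, cycle_ns):
    """T = I x CPI x C, with C given in ns and the result in milliseconds."""
    total_ns = instr_count * cpi * cycle_ns
    return total_ns / 1_000_000


def clock_mhz(cycle_ns):
    """Clock frequency in MHz for a cycle time given in ns."""
    return 1000 / cycle_ns


if __name__ == "__main__":
    I = 1_000_000
    print("single-cycle:", exec_time_ms(I, 1, 8), "ms at", clock_mhz(8), "MHz")
    print("multicycle best case:", exec_time_ms(I, 3, 2), "ms at", clock_mhz(2), "MHz")
    print("multicycle worst case:", exec_time_ms(I, 5, 2), "ms")
```

The multicycle design only wins when the instruction mix keeps the average CPI below 4, since 4 x 2 ns equals the 8 ns single-cycle clock.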
State specifies control points (outputs) for Register Transfer. AKA Hardwired Control
Control points (outputs) are assumed to depend only on the current state
and not inputs (i.e. Moore finite state machine)
Transfer (register/memory writes) and state transition occur upon exiting
the state on the falling edge of the clock.
[Figure: Moore finite state machine (vs. Mealy?). The inputs (opcode, conditions) and the current state, held in e.g. flip-flops, feed the Next State Logic. The Current State Output Logic drives the outputs (control points) to the datapath. Each control state X, entered from the last state and followed by the next state, specifies the control points for its register transfer.]
Control FSM states:
Instruction fetch (start state): IR ← MEM[PC]
Decode: A ← R[rs], B ← R[rt]
R-type: Execute: R ← A fun B; Write-back: R[rd] ← R, PC ← PC + 4; to instruction fetch
ORi: Execute: R ← A or ZX; Write-back: R[rt] ← R, PC ← PC + 4; to instruction fetch
LW: Execute: R ← A + SX; Memory: M ← MEM[R]; Write-back: R[rt] ← M, PC ← PC + 4; to instruction fetch
SW: Execute: R ← A + SX; Memory: MEM[R] ← B, PC ← PC + 4; to instruction fetch
Branch: PC ← PC + 4 + SX || 00; to instruction fetch
13 states: 4 State Flip-Flops needed
[Figure: FSM control implementation. A state register (4 flip-flops) holds the current state. The Next State Logic takes the 6-bit opcode (op), Equal from the datapath, and the current state, and produces the next state. The Output Logic generates the control points sent to the datapath.]
Control FSM with state encodings (active control signals in parentheses where shown):
0000 Instruction fetch (imem_rd, IRen)
0001 Decode: A ← R[rs], B ← R[rt] (Aen, Ben)
0100 R-type Execute: R ← A fun B (ALUfun, Sen)
0101 R-type Write-back: R[rd] ← R, PC ← PC + 4 (RegDst, RegWr, PCen)
0110 ORi Execute: R ← A or ZX
0111 ORi Write-back: R[rt] ← R, PC ← PC + 4
1000 LW Execute: R ← A + SX
1001 LW Memory: M ← MEM[R]
1010 LW Write-back: R[rt] ← M, PC ← PC + 4
1011 SW Execute: R ← A + SX
1100 SW Memory: MEM[R] ← B, PC ← PC + 4
0011 Branch not taken: PC ← PC + 4
0010 Branch taken: PC ← PC + 4 + SX || 00
Every final state returns to instruction fetch (state 0000).
13 states: 4 State Flip-Flops needed
Next-state table (x = don't care):
State   Current  Op field  Z  Next
IF      0000     ??????    ?  0001
ID      0001     BEQ       0  0011
ID      0001     BEQ       1  0010
ID      0001     R-type    x  0100
ID      0001     orI       x  0110
ID      0001     LW        x  1000
ID      0001     SW        x  1011
BEQ     0010     xxxxxx    x  0000
BEQ     0011     xxxxxx    x  0000
R-type  0100     xxxxxx    x  0101
R-type  0101     xxxxxx    x  0000
ORI     0110     xxxxxx    x  0111
ORI     0111     xxxxxx    x  0000
LW      1000     xxxxxx    x  1001
LW      1001     xxxxxx    x  1010
LW      1010     xxxxxx    x  0000
SW      1011     xxxxxx    x  1100
SW      1100     xxxxxx    x  0000
[Table columns for control outputs: PC (en, sel), IR (en), Ops (A, B), Exec (Ex, Sr, ALU, S), Mem (R, W, M), Write-Back (M-R, Wr, Dst).]
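The next-state logic tabulated for this FSM can be sketched as a lookup. This is a minimal sketch, assuming symbolic opcode names ("R-type", "ORI", "LW", "SW", "BEQ") in place of the real 6-bit op field; the function names are hypothetical.

```python
# States whose next state does not depend on the opcode or Z.
SEQUENTIAL = {
    "0000": "0001",                                   # fetch -> decode
    "0010": "0000", "0011": "0000",                   # BEQ states
    "0100": "0101", "0101": "0000",                   # R-type
    "0110": "0111", "0111": "0000",                   # ORI
    "1000": "1001", "1001": "1010", "1010": "0000",   # LW
    "1011": "1100", "1100": "0000",                   # SW
}

# Decode-state (0001) dispatch on the symbolic opcode (an assumption;
# hardware would decode the 6-bit op field instead).
DISPATCH = {"R-type": "0100", "ORI": "0110", "LW": "1000", "SW": "1011"}


def next_state(state, op=None, z=0):
    """Return the next state for the current state, opcode class, and Z."""
    if state == "0001":
        if op == "BEQ":
            return "0010" if z else "0011"
        return DISPATCH[op]
    return SEQUENTIAL[state]


def cycles(op, z=0):
    """Count the states visited from fetch until control returns to 0000."""
    state, n = "0000", 0
    while True:
        n += 1
        state = next_state(state, op, z)
        if state == "0000":
            return n


def flip_flops(num_states):
    """Minimum number of state bits for a binary-encoded FSM."""
    return (num_states - 1).bit_length()


if __name__ == "__main__":
    for op in ("R-type", "ORI", "LW", "SW", "BEQ"):
        print(op, cycles(op), "cycles")
    print("state bits:", flip_flops(len(SEQUENTIAL) + 1))
```

Walking the table this way reproduces the CPI range of 3 to 5 and the four state flip-flops needed for 13 states.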
[Figure: multicycle datapath with a single ideal memory, Instruction Reg, Reg File, Extend, ALU, and ALU Out register. Control signals: PCWr, PCSrc, MemRd, RegDst, RegWr, ALUSrcA, ALUSrcB, ALUOp, MemtoReg; Zero output from the ALU.]
[Figure: textbook multicycle MIPS datapath with a single memory, instruction register, memory data register (i.e. MDR), registers, sign extend, shift left 2, ALU, and ALUOut. Control signals: MemRead, MemWrite, IRWrite, RegDst, RegWrite, MemtoReg, ALUSrcA, ALUSrcB, ALUOp.]
One-bit control signals: RegDst, RegWrite, ALUSrcA, MemRead, MemWrite, MemtoReg, IorD, IRWrite, PCWrite, PCWriteCond (i.e. Branch).
Effect when deasserted is None for: RegWrite, MemRead, MemWrite, IRWrite, PCWrite, PCWriteCond.
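Multi-bit control signals:
Signal    Value (Binary)  Effect
ALUOp     00              The ALU performs an add
ALUOp     01              The ALU performs a subtract
ALUOp     10              The ALU operation is determined by the function field (R-Type)
ALUSrcB   00              The second input of the ALU comes from register B (i.e. R[rt])
ALUSrcB   01              The second input of the ALU is the constant 4
ALUSrcB   10              The second input of the ALU is SignExt(imm16)
ALUSrcB   11              The second input of the ALU is SignExt(imm16) x 4
PCSource  00              PC + 4 (the ALU output) is written to PC
PCSource  01              ALUout (the branch target) is written to PC
PCSource  10              The jump address is written to PC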
Multicycle register transfer operations per instruction (textbook datapath):
IF (all instructions): IR ← Mem[PC], PC ← PC + 4
ID (all instructions): A ← R[rs], B ← R[rt], ALUout ← PC + (SignExt(imm16) x 4)
EX:
R-Type: ALUout ← A funct B
Load: ALUout ← A + SignEx(Imm16)
Store: ALUout ← A + SignEx(Imm16)
Branch: Zero ← A - B; Zero: PC ← ALUout
Jump: PC ← Jump Address
MEM:
Load: MDR ← Mem[ALUout]
Store: Mem[ALUout] ← B
WB:
R-Type: R[rd] ← ALUout
Load: R[rt] ← MDR
6-7
(Figure 5.33)
(Figure 5.34)
0-1
9
(Figure 5.35)
(Figure 5.36)
Multicycle control FSM (total 10 states):
IF: IR ← Mem[PC], PC ← PC + 4
ID: A ← R[rs], B ← R[rt], ALUout ← PC + (SignExt(imm16) x 4)
EX (Load/Store): ALUout ← A + SignEx(Imm16)
MEM (Load): MDR ← Mem[ALUout]
WB (Load): R[rt] ← MDR
MEM (Store): Mem[ALUout] ← B
EX (R-Type): ALUout ← A func B
WB (R-Type): R[rd] ← ALUout
EX (Branch): Zero ← A - B; Zero: PC ← ALUout
EX (Jump): PC ← Jump Address
Instruction Fetch (IF) state: IR ← Mem[PC], PC ← PC + 4
MemRead = 1
ALUSrcA = 0
ALUSrcB = 01
ALUOp = 00 (add)
IorD = 0
PCWrite = 1
IRWrite = 1
PCSource = 00
Instruction Decode (ID) state: A ← R[rs], B ← R[rt], ALUout ← PC + (SignExt(imm16) x 4)
ALUSrcA = 0
ALUSrcB = 11
ALUOp = 00 (add)
Load/Store Instructions FSM States:
EX: ALUout ← A + SignEx(Imm16)
MEM (Load): MDR ← Mem[ALUout]
WB (Load): R[rt] ← MDR
MEM (Store): Mem[ALUout] ← B
Then to Instruction Fetch (Figure 5.32)
Load/Store EX state: ALUout ← A + SignEx(Imm16)
ALUSrcA = 1
ALUSrcB = 10
ALUOp = 00 (add)
Load MEM state: MDR ← Mem[ALUout]
MemRead = 1
IorD = 1
Load WB state: R[rt] ← MDR
RegWrite = 1
MemtoReg = 1
RegDst = 0
Store MEM state: Mem[ALUout] ← B
MemWrite = 1
IorD = 1
R-Type Instructions FSM States:
EX: ALUout ← A funct B
WB: R[rd] ← ALUout
R-Type EX state: ALUout ← A funct B
ALUSrcA = 1
ALUSrcB = 00
ALUOp = 10 (R-Type)
R-Type WB state: R[rd] ← ALUout
RegWrite = 1
MemtoReg = 0
RegDst = 1
Jump Instruction, Single EX State:
EX: PC ← Jump Address
Branch Instruction, Single EX State:
EX: Zero ← A - B; Zero: PC ← ALUout
Branch EX state: Zero ← A - B; Zero: PC ← ALUout
ALUSrcA = 1
ALUSrcB = 00
ALUOp = 01 (Subtract)
PCWriteCond = 1
PCSource = 01
Jump EX state: PC ← Jump Address
PCWrite = 1
PCSource = 10
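The per-state control signal values listed for the ten FSM states can be collected into one table. This is a minimal sketch: the signal values are the ones given for each state in the text, while the state names (`MemAddr`, `LoadMem`, and so on) and helper functions are hypothetical labels chosen for illustration.

```python
# Control word per FSM state; signals not listed for a state are omitted.
CONTROL = {
    "IF":       {"MemRead": 1, "ALUSrcA": 0, "ALUSrcB": 0b01, "ALUOp": 0b00,
                 "IorD": 0, "PCWrite": 1, "IRWrite": 1, "PCSource": 0b00},
    "ID":       {"ALUSrcA": 0, "ALUSrcB": 0b11, "ALUOp": 0b00},
    "MemAddr":  {"ALUSrcA": 1, "ALUSrcB": 0b10, "ALUOp": 0b00},
    "LoadMem":  {"MemRead": 1, "IorD": 1},
    "LoadWB":   {"RegWrite": 1, "MemtoReg": 1, "RegDst": 0},
    "StoreMem": {"MemWrite": 1, "IorD": 1},
    "RExec":    {"ALUSrcA": 1, "ALUSrcB": 0b00, "ALUOp": 0b10},
    "RWB":      {"RegWrite": 1, "MemtoReg": 0, "RegDst": 1},
    "Branch":   {"ALUSrcA": 1, "ALUSrcB": 0b00, "ALUOp": 0b01,
                 "PCWriteCond": 1, "PCSource": 0b01},
    "Jump":     {"PCWrite": 1, "PCSource": 0b10},
}

# State sequence followed by each instruction class.
PATHS = {
    "load":   ["IF", "ID", "MemAddr", "LoadMem", "LoadWB"],
    "store":  ["IF", "ID", "MemAddr", "StoreMem"],
    "r_type": ["IF", "ID", "RExec", "RWB"],
    "branch": ["IF", "ID", "Branch"],
    "jump":   ["IF", "ID", "Jump"],
}

# Signals that enable a read or a state update when asserted.
ENABLES = ("MemRead", "MemWrite", "IRWrite", "RegWrite", "PCWrite", "PCWriteCond")


def asserted_enables(state):
    """Return the enable signals set to 1 in the given state."""
    return [s for s in ENABLES if CONTROL[state].get(s) == 1]


def trace(instr):
    """Return [(state, asserted enables)] along an instruction's path."""
    return [(state, asserted_enables(state)) for state in PATHS[instr]]


if __name__ == "__main__":
    for state, enables in trace("load"):
        print(state, enables)
```

Keeping the control words in one table makes it easy to check that only the write-back and memory states of an instruction assert RegWrite or MemWrite.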
1 CPI 5
Instruction class  CPIi  Frequency  CPIi x freqIi
Arith/Logic        4     40%        1.6
Load               5     30%        1.5
Store              4     10%        0.4
branch             3     20%        0.6
Average CPI: 4.1
T = I x CPI x C
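The average CPI is a frequency-weighted sum over instruction classes. The sketch below uses the frequencies from the table; the per-class cycle counts (4, 5, 4, 3) are inferred here by dividing each CPIi x freqIi product by its frequency, so treat them as an assumption.

```python
# (cycles per instruction, frequency in percent) per instruction class.
MIX = {
    "Arith/Logic": (4, 40),
    "Load":        (5, 30),
    "Store":       (4, 10),
    "branch":      (3, 20),
}


def average_cpi(mix):
    """Weighted CPI: sum of CPI_i x freq_i, with freq_i given in percent."""
    return sum(cpi * pct for cpi, pct in mix.values()) / 100


def exec_time_ms(instr_count, cpi, cycle_ns):
    """T = I x CPI x C, with C in ns and the result in milliseconds."""
    return instr_count * cpi * cycle_ns / 1_000_000


if __name__ == "__main__":
    cpi = average_cpi(MIX)
    print("average CPI:", cpi)
    print("T for 10^6 instructions at C = 2 ns:", exec_time_ms(1_000_000, cpi, 2), "ms")
```

With this mix the multicycle CPU, at C = 2 ns, needs slightly more time per million instructions than the 8 ms of the single-cycle CPU, because the average CPI is just above 4.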
Swap instruction: R[rt] ← R[rs], R[rs] ← R[rt]
[Figure: swap instruction format with fields op [31-26], rs [25-21], rt [20-16], rd, and the modified multicycle datapath, in which A (R[rs]) and B (R[rt]) feed the register write data multiplexor.]
The outputs of A and B should be connected to the multiplexor controlled by MemtoReg if one of the two fields
(rs and rd) contains the name of one of the registers being swapped. The other register is specified by rt.
The MemtoReg control signal becomes two bits.
Swap FSM states, added after IF and ID (rd = rs; A has R[rs]):
WB1: R[rd] ← B
WB2: R[rt] ← A
Swap takes 4 cycles
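The swap register transfers can be checked with a small simulation. This is a minimal sketch: the register file is modeled as a Python dict, instruction fetch is not modeled beyond counting its cycle, and the function name is hypothetical.

```python
def swap(regs, rs, rt):
    """Apply the swap register transfers (with rd = rs) and return the cycle count."""
    rd = rs                      # the encoding sets rd = rs
    cycles = 0

    cycles += 1                  # IF: IR <- Mem[PC], PC <- PC + 4 (not modeled)

    a, b = regs[rs], regs[rt]    # ID: A <- R[rs], B <- R[rt]
    cycles += 1

    regs[rd] = b                 # WB1: R[rd] <- B
    cycles += 1

    regs[rt] = a                 # WB2: R[rt] <- A
    cycles += 1

    return cycles


if __name__ == "__main__":
    regs = {"$s1": 5, "$s2": 9}
    print(swap(regs, "$s1", "$s2"), "cycles ->", regs)
```

Because A and B latch both operands in ID, the first write in WB1 cannot corrupt the value needed in WB2.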
You are to add support for a new instruction, add3, that adds the values of
three registers, to the MIPS multicycle datapath of Figure 5.28 on page 232.
For example:
add3 $s0, $s1, $s2, $s3
Register $s0 gets the sum of $s1, $s2 and $s3.
The instruction encoding uses a modified R-format, with an additional register
specifier rx replacing the five low bits of the funct field:
Field     Bits     Size    Example
OP        [31-26]  6 bits  add3
rs        [25-21]  5 bits  $s1
rt        [20-16]  5 bits  $s2
rd        [15-11]  5 bits  $s0
Not used  [10-5]   6 bits
rx        [4-0]    5 bits  $s3
Add necessary datapath components, connections, and control signals to the multicycle
datapath without modifying the register bank or adding additional ALUs. Find a solution
that minimizes the number of clock cycles required for the new instruction. Justify the
need for the modifications, if any.
Show the necessary modifications to the multicycle control finite state machine of Figure
5.38 on page 339 when adding the add3 instruction. For each new state added, provide
the dependent RTN and active control signal values.
[Figure: modified multicycle datapath for add3. Modified R-Format: op [31-26], rs [25-21], rt [20-16], rd, rx [4-0]. New control signals: ReadSrc, WriteB.]
1. ALUout is added as an extra input to the first ALU operand MUX, so that the previous ALU result can be used as an input for the second addition.
2. A multiplexor should be added to select between rt and the new field rx, which contains the register number of the 3rd operand
(bits 4-0 of the instruction), as the input for Read Register 2.
This multiplexor will be controlled by a new one-bit control signal called ReadSrc.
3. A WriteB control line is added to enable writing R[rx] to B.
add3 FSM states, added after IF and ID:
EX1: ALUout ← A + B; B ← R[rx] (WriteB)
EX2: ALUout ← ALUout + B
WB: R[rd] ← ALUout
Add3 takes 5 cycles
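The add3 state sequence can be traced with a short register-transfer simulation. This is a minimal sketch: the register file is a Python dict, results wrap to 32 bits, instruction fetch is only counted, and the function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, as in the 32-bit ALU


def add3(regs, rd, rs, rt, rx):
    """Apply the add3 register transfers and return the cycle count."""
    cycles = 1                       # IF: IR <- Mem[PC], PC <- PC + 4 (not modeled)

    a, b = regs[rs], regs[rt]        # ID: A <- R[rs], B <- R[rt]
    cycles += 1

    alu_out = (a + b) & MASK32       # EX1: ALUout <- A + B
    b = regs[rx]                     #      B <- R[rx] (WriteB asserted)
    cycles += 1

    alu_out = (alu_out + b) & MASK32 # EX2: ALUout <- ALUout + B
    cycles += 1

    regs[rd] = alu_out               # WB: R[rd] <- ALUout
    cycles += 1

    return cycles


if __name__ == "__main__":
    regs = {"$s0": 0, "$s1": 1, "$s2": 2, "$s3": 3}
    print(add3(regs, "$s0", "$s1", "$s2", "$s3"), "cycles ->", regs["$s0"])
```

Reloading B with R[rx] in the same cycle as the first addition is what keeps the instruction at five cycles without a second ALU or an extra register file port.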