
PROCESSOR:

DATAPATH & CONTROL - 2


Dr. Bill Yi
Santa Clara University
(Based on text: David A. Patterson & John L. Hennessy, Computer Organization and Design:
The Hardware/Software Interface, 3rd Ed., Morgan Kaufmann, 2007)
(Also based on presentation: Dr. Nam Ling, COEN210 Lecture Notes)
COURSE CONTENTS
• Introduction
• Instructions
• Computer Arithmetic
• Performance
→ Processor: Datapath
→ Processor: Control
• Pipelining Techniques
• Memory
• Input/Output Devices

PROCESSOR:
DATAPATH & CONTROL

• Multi-Cycle Datapath & Control
• Control: Finite State Machine (FSM)
• Control: Microprogramming

Multicycle Approach
• Break up an instruction into steps; each step takes one clock cycle:
  • balance the amount of work done in each step
  • restrict each cycle to use only one major functional unit
• Different instructions take different numbers of cycles to complete
• At the end of a cycle:
  • store values for use in later cycles (easiest thing to do)
  • introduce additional “internal” registers for this temporary storage
• Reuse functional units (reduces hardware cost):
  • use the ALU to compute the address/result and to increment the PC
  • use one memory for both instructions and data

Multi-Cycle Datapath:
Additional Registers
• Additional “internal registers”:
→ Instruction register (IR): holds the current instruction
→ Memory data register (MDR): holds data read from memory
→ A register (A) & B register (B): hold the register operand values read from the register file
→ ALUOut register (ALUOut): holds the output of the ALU; also serves as the memory address register (MAR)
• All of these registers except IR hold data only between a pair of adjacent cycles and thus do not need write control signals; IR holds the instruction until the end of that instruction's execution, hence it needs a write control signal

[Figure: multi-cycle datapath showing the PC, the shared instruction/data memory, the register file, the sign-extend and shift-left-2 units, the ALU, and the added registers IR, MDR, A, B, and ALUOut]

Note: the jump instruction is ignored here

Multicycle Datapath:
Additional Multiplexors
• Additional multiplexors:
→ Mux for the first ALU input: selects A or PC (since the ALU is used for both address/result computation & PC increment)
→ Bigger mux for the second ALU input: two additional inputs are needed, the constant 4 (for the normal PC increment) and the sign-extended & shifted offset field (for branch address computation)
→ Mux for the memory address input: selects the instruction address or the data address

[Figure: multi-cycle datapath with the additional multiplexors at the memory address input and at both ALU inputs]

Note: the jump instruction is ignored here

Multi-Cycle
Datapath & Control

[Figure: complete multi-cycle datapath together with its control unit and control signals]

Note the reason for each control signal; also note that we have included the jump instruction

Control Signals for
Multi-Cycle Datapath
• Note:
→ three possible sources for the value to be written into the PC (selected by PCSource): (1) regular increment of the PC, (2) conditional branch target from ALUOut, (3) unconditional jump target (lower 26 bits of the instruction in IR shifted left by 2 and concatenated with the upper 4 bits of the incremented PC)
→ two PC write control signals: (1) PCWrite (unconditional write, used for the regular increment and for jumps), & (2) PCWriteCond (lets the ALU “zero” signal cause a PC write when asserted during a beq instruction)
→ since one memory is used for both instructions & data, IorD is needed to select the appropriate address
→ IRWrite is needed so that the instruction is written into IR (IRWrite = 1) during the first cycle of the instruction, and so that IR is not overwritten by another instruction during the later cycles of the current instruction's execution (by keeping IRWrite = 0)
→ other control signals
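
The PC-source selection and the two PC write signals described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the slides: the function names `next_pc` and `pc_write_enable` and their parameters are hypothetical, and values are treated as 32-bit unsigned integers.

```python
MASK32 = 0xFFFFFFFF

def next_pc(incremented_pc: int, alu_out: int, ir: int, pc_source: int) -> int:
    """Model of the PCSource mux: pick the value offered to the PC."""
    if pc_source == 0b00:   # regular increment (ALU result PC + 4)
        return incremented_pc & MASK32
    if pc_source == 0b01:   # conditional branch target held in ALUOut
        return alu_out & MASK32
    if pc_source == 0b10:   # jump: upper 4 bits of PC || (IR[25:0] << 2)
        return (incremented_pc & 0xF0000000) | ((ir & 0x03FFFFFF) << 2)
    raise ValueError("PCSource = 11 is not used in this datapath")

def pc_write_enable(pc_write: int, pc_write_cond: int, zero: int) -> bool:
    """The PC is written if PCWrite is set, or if PCWriteCond and Zero are both set."""
    return bool(pc_write or (pc_write_cond and zero))

# A jump instruction word (opcode 000010) whose 26-bit field is 0x0100000
print(hex(next_pc(0x00400004, 0, 0x08100000, 0b10)))
```

With the sample values, the 26-bit field shifted left by 2 gives 0x00400000, and the upper four bits of the incremented PC are zero.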

Breaking the Instruction
into 3 - 5 Execution Steps
1. Instruction Fetch (all instructions)
2. Instruction Decode (all instructions), Register Fetch & Branch Address Computation (in advance, just in case)
3. ALU (R-type) Execution, Memory Address Computation, or Branch Completion (instruction dependent)
4. Memory Access or R-type Instruction Completion (instruction dependent)
5. Memory Read Completion (only for lw)

At the end of every clock cycle, data needed in later cycles must be stored into register(s) or memory location(s).
Each step (which can consist of several parallel operations) takes 1 clock cycle, so instructions take 3 to 5 cycles!
Events during a cycle, e.g.: data ready, operation performed, result clocked in at the clock edge.

Step 1: Instruction Fetch
• Use the PC to get the instruction (from memory) and put it in the Instruction Register
• Increment the PC by 4 and put the result back in the PC
• Can be described succinctly using RTL ("Register-Transfer Language"):

IR <= Memory[PC];
PC <= PC + 4;

• Which control signals need to be asserted?
→ IorD = 0, MemRead = 1, IRWrite = 1
→ ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCWrite = 1, PCSource = 00
• Why can the instruction read & the PC update be in the same step? Look at the state element timing
• What is the advantage of updating the PC now?

Step 2: Instruction Decode, Reg.
Fetch, & Branch Addr. Comp.
• In this step, we decode the instruction in IR (the opcode enters the control unit in order to generate control signals). In parallel, we can
  • read registers rs and rt, just in case we need them
  • compute the branch address, just in case the instruction is a branch (beq)
• RTL:

A <= Reg[IR[25:21]];
B <= Reg[IR[20:16]];
ALUOut <= PC + (sign-extend(IR[15:0]) << 2);

• Control signals:
→ ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00 (add)
→ Note: no explicit control signals are needed to write A, B, & ALUOut. They are written automatically by the clock transition at the end of the step
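
The branch-address RTL above can be checked with a small Python sketch. The helper names `sign_extend16` and `branch_target` are hypothetical, and the model assumes `pc` is the already-incremented PC from step 1 and that results wrap to 32 bits.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def branch_target(pc: int, ir: int) -> int:
    """ALUOut <= PC + (sign-extend(IR[15:0]) << 2), wrapped to 32 bits."""
    return (pc + (sign_extend16(ir) << 2)) & 0xFFFFFFFF

# Offset field 0x0003 means "3 instructions ahead" of the incremented PC
print(hex(branch_target(0x00400008, 0x11090003)))
# Offset field 0xFFFF is -1, i.e. one instruction back
print(hex(branch_target(0x00400008, 0x1000FFFF)))
```

The shift by 2 converts the word offset stored in the instruction into a byte offset.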

Step 3: Instruction
Dependent Operation
• One of four functions, based on instruction type:
• Memory address computation (for lw, sw):
ALUOut <= A + sign-extend(IR[15:0]);
→ Control signals: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
• ALU (R-type):
ALUOut <= A op B;
→ Control signals: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
• Conditional branch:
if (A==B) PC <= ALUOut;
→ Control signals: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01 (Sub), PCSource = 01, PCWriteCond = 1 (to enable Zero to write the PC if it is 1)
What is the content of ALUOut during this step? Immediately after this step?
• Jump:
PC <= PC[31:28] || (IR[25:0]<<2);
→ Control signals: PCSource = 10, PCWrite = 1

Note: conditional branch & jump instructions are completed at this step!

Step 4: Memory Access or ALU
(R-type) Instruction Completion
• For lw or sw instructions (access memory):

MDR <= Memory[ALUOut];
or
Memory[ALUOut] <= B;

→ Control signals (for lw): IorD = 1 (to select ALUOut as the address), MemRead = 1. Note that no write signal is needed for MDR; it is written automatically by the clock transition at the end of the step
→ Control signals (for sw): IorD = 1 (to select ALUOut as the address), MemWrite = 1

• For ALU (R-type) instructions (write result to register):

Reg[IR[15:11]] <= ALUOut;

→ Control signals: RegDst = 1 (to select the register address), MemtoReg = 0, RegWrite = 1

Note: the write actually takes place at the end of the cycle, on the clock edge!
Note: sw and ALU (R-type) instructions are completed at this step!

Step 5: Memory Read
Completion
• For the lw instruction only (write data from MDR to register):

Reg[IR[20:16]] <= MDR;

• Control signals: RegDst = 0 (to select the register address), MemtoReg = 1, RegWrite = 1

Note: the lw instruction is completed at this step!

Summary of Execution Steps

Step name                               | Action
Instruction fetch                       | All instructions: IR <= Memory[PC]; PC <= PC + 4
Instruction decode / register fetch /   | All instructions: A <= Reg[IR[25:21]]; B <= Reg[IR[20:16]];
branch address computation              |   ALUOut <= PC + (sign-extend(IR[15:0]) << 2)
Execution, address computation,         | R-type: ALUOut <= A op B
branch/jump completion                  | Memory reference: ALUOut <= A + sign-extend(IR[15:0])
                                        | Branch: if (A == B) then PC <= ALUOut
                                        | Jump: PC <= PC[31:28] || (IR[25:0] << 2)
Memory access or R-type completion      | R-type: Reg[IR[15:11]] <= ALUOut
                                        | Load: MDR <= Memory[ALUOut]
                                        | Store: Memory[ALUOut] <= B
Memory read completion                  | Load: Reg[IR[20:16]] <= MDR

Some instructions take fewer cycles, so the next instruction can start earlier.
Hence, compared to a single-cycle implementation, where all instructions take the same amount of time, a multi-cycle implementation can be faster.
The multi-cycle implementation also reduces hardware cost (fewer adders & memories, at the price of more registers & muxes).

Simple Questions
• How many cycles will it take to execute this code?

lw $t2, 0($t3)
lw $t3, 4($t3)
beq $t2, $t3, Label    # assume branch not taken
add $t5, $t2, $t3
sw $t5, 8($t3)
Label: ...

• What is going on during the 8th cycle of execution?
• In what cycle does the actual addition of $t2 and $t3 take place?
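
One way to work through these questions is to tabulate the per-instruction cycle counts from the execution steps (lw 5, sw 4, R-type 4, beq 3, j 3) and walk the program. The Python below is a sketch under that assumption; the names `CYCLES`, `total_cycles`, and `locate` are made up for this illustration.

```python
# Cycles per instruction in the multi-cycle design (from the execution steps)
CYCLES = {"lw": 5, "sw": 4, "add": 4, "beq": 3, "j": 3}

# The code sequence from the slide; the branch is assumed not taken
PROGRAM = ["lw", "lw", "beq", "add", "sw"]

def total_cycles(instructions):
    """Sum the cycle counts of instructions executed one after another."""
    return sum(CYCLES[op] for op in instructions)

def locate(instructions, cycle):
    """Return (instruction index, opcode, step number) active in a 1-based cycle."""
    start = 1
    for index, op in enumerate(instructions):
        end = start + CYCLES[op] - 1
        if start <= cycle <= end:
            return index, op, cycle - start + 1
        start = end + 1
    raise ValueError("cycle is past the end of the program")

print(total_cycles(PROGRAM))   # total cycle count
print(locate(PROGRAM, 8))      # what is happening in cycle 8
print(locate(PROGRAM, 16))     # step 3 of the add is where the ALU adds
```

Under these assumptions the sequence takes 21 cycles, cycle 8 is step 3 (address computation) of the second lw, and the addition of $t2 and $t3 happens in cycle 16.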

Defining the Control for
Multi-Cycle Datapath
• Multi-cycle vs single-cycle datapath:
  • for single-cycle, truth tables specify the setting of the control signals based on the instruction
  • for multi-cycle, control is more complex because each instruction is executed in steps; control must specify both the control signals in any step & the next step in the sequence
• The value of the control signals depends on:
  • what instruction is being executed
  • which step is being performed
• Two different control techniques:
→ Finite state machine (FSM)
→ Microprogramming
• The implementation can be derived from the specification

Finite State Machine
(FSM) Control
• Consists of a set of states & directions on how to change states
• Each state specifies a set of control signal outputs that are asserted when the machine is in that state
• Each state in the FSM takes 1 clock cycle
• The first two states (state 0 & state 1) are common to all instructions
• After state 1, the signals asserted depend on the instruction (this process is called instruction decoding)
• After the last step (state) of an instruction, the FSM returns to state 0 to begin fetching the next instruction
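
As a rough illustration of these rules, the next-state behavior can be written as a small Python function. The state numbers and transitions follow the state diagram in these slides; the names `next_state` and `state_sequence` and the opcode constants are assumptions of this sketch.

```python
# 6-bit MIPS opcodes used by this instruction subset
R_TYPE, J, BEQ, LW, SW = 0b000000, 0b000010, 0b000100, 0b100011, 0b101011

def next_state(state: int, opcode: int) -> int:
    """Next-state function of the 10-state controller (states 0 to 9)."""
    if state == 0:                       # fetch always goes to decode
        return 1
    if state == 1:                       # decode: branch on the opcode
        return {R_TYPE: 6, J: 9, BEQ: 8, LW: 2, SW: 2}[opcode]
    if state == 2:                       # address computed: load or store?
        return {LW: 3, SW: 5}[opcode]
    if state in (3, 6):                  # lw read -> write-back, R-type exec -> completion
        return state + 1
    if state in (4, 5, 7, 8, 9):         # last state of an instruction
        return 0
    raise ValueError(f"undefined state {state}")

def state_sequence(opcode: int) -> list:
    """States visited by one instruction, starting at state 0."""
    states = [0]
    while True:
        following = next_state(states[-1], opcode)
        if following == 0:
            return states
        states.append(following)

for name, op in [("lw", LW), ("sw", SW), ("R-type", R_TYPE), ("beq", BEQ), ("j", J)]:
    print(name, state_sequence(op))
```

The length of each sequence equals the instruction's cycle count, since every state takes one clock cycle.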

The Complete FSM Control
Graphical specification (state diagram), listed state by state:

State 0, Instruction fetch (Start): MemRead, ALUSrcA = 0, IorD = 0, IRWrite, ALUSrcB = 01, ALUOp = 00, PCWrite, PCSource = 00; next state 1
State 1, Instruction decode/register fetch: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00; next state 2 if Op = 'LW' or Op = 'SW', 6 if Op = R-type, 8 if Op = 'BEQ', 9 if Op = 'J'
State 2, Memory address computation: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00; next state 3 if Op = 'LW', 5 if Op = 'SW'
State 3, Memory access (load): MemRead, IorD = 1; next state 4
State 4, Write-back step: RegDst = 0, RegWrite, MemtoReg = 1; next state 0
State 5, Memory access (store): MemWrite, IorD = 1; next state 0
State 6, Execution: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10; next state 7
State 7, R-type completion: RegDst = 1, RegWrite, MemtoReg = 0; next state 0
State 8, Branch completion: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond, PCSource = 01; next state 0
State 9, Jump completion: PCWrite, PCSource = 10; next state 0

CPI in Multi-Cycle CPU
Example:

                      load   store   R-type   branch (cond.)   jump
gcc instruction mix   22%    11%     49%      16%              2%
#cycles               5      4       4        3                3

CPI = 0.22 x 5 + 0.11 x 4 + 0.49 x 4 + 0.16 x 3 + 0.02 x 3
    = 1.1 + 0.44 + 1.96 + 0.48 + 0.06 = 4.04
Better than the worst-case CPI of 5 (if all instructions took the same number of cycles)
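
The same weighted average can be reproduced in a couple of lines of Python; the dictionary name `MIX` is just a label chosen for this sketch, and the numbers are the ones from the table above.

```python
# (fraction of instructions, cycles per instruction) for the gcc mix above
MIX = {
    "load": (0.22, 5),
    "store": (0.11, 4),
    "R-type": (0.49, 4),
    "branch": (0.16, 3),
    "jump": (0.02, 3),
}

# CPI is the frequency-weighted average of the per-class cycle counts
cpi = sum(fraction * cycles for fraction, cycles in MIX.values())
print(round(cpi, 2))
```

Changing the mix (for example, a load-heavy workload) shifts the CPI toward 5, which is why the result depends on the program being measured.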

FSM Controller
Implementation
• Typically implemented by a block of combinational logic & a state register that holds the current state
• Total of 10 states (0 to 9) --> 4-bit state register
• Combinational control logic:
→ Inputs: current state (S3 - S0) & any input used to determine the next state (in this case the 6-bit opcode, Op5 - Op0, from the instruction register)
→ Outputs: next state number (NS3 - NS0) & the control signals to be asserted for the current state (PCWrite, PCWriteCond, IorD, MemRead, MemWrite, IRWrite, MemtoReg, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst)
• Note: here the control signal outputs depend only on the current state, not on the inputs (Moore machine)

[Figure: combinational control logic block with the instruction register opcode field and the state register as inputs, and the control signals and NS3 - NS0 as outputs]

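PLA Implementation of the
Combinational Control Logic
• If I picked a horizontal or a vertical line, could you explain it?
• Note: the upper half is the AND plane & the lower half is the OR plane

[Figure: PLA with inputs Op5 - Op0 and S3 - S0, and outputs PCWrite, PCWriteCond, IorD, MemRead, MemWrite, IRWrite, MemtoReg, PCSource1, PCSource0, ALUOp1, ALUOp0, ALUSrcB1, ALUSrcB0, ALUSrcA, RegWrite, RegDst, NS3 - NS0]

Example: PCWrite = 1 if (current state is state 0) or (current state is state 9), i.e.,

$$PCWrite = \overline{S_3} \cdot \overline{S_2} \cdot \overline{S_1} \cdot \overline{S_0} + S_3 \cdot \overline{S_2} \cdot \overline{S_1} \cdot S_0$$

Example: next-state bit 2, NS2 = 1 (i.e. the next state is 4, 5, 6, or 7) if (current state is 3) or (current state is 2 and op = 101011 (sw)) or (current state is 1 and op = 000000 (R-type)) or (current state is 6), i.e.,

$$NS_2 = \overline{S_3} \cdot \overline{S_2} \cdot S_1 \cdot S_0$$
$$\quad + \overline{S_3} \cdot \overline{S_2} \cdot S_1 \cdot \overline{S_0} \cdot Op_5 \cdot \overline{Op_4} \cdot Op_3 \cdot \overline{Op_2} \cdot Op_1 \cdot Op_0$$
$$\quad + \overline{S_3} \cdot \overline{S_2} \cdot \overline{S_1} \cdot S_0 \cdot \overline{Op_5} \cdot \overline{Op_4} \cdot \overline{Op_3} \cdot \overline{Op_2} \cdot \overline{Op_1} \cdot \overline{Op_0}$$
$$\quad + \overline{S_3} \cdot S_2 \cdot S_1 \cdot \overline{S_0}$$
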
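
The two sum-of-products examples on this slide (PCWrite and NS2) can be evaluated bit by bit to confirm they match the verbal description. The Python below is a sketch; the helper names `bits`, `inv`, `pc_write`, and `ns2` are invented for this check.

```python
def bits(value: int, width: int) -> list:
    """Bits of value, least significant first (index i is bit i)."""
    return [(value >> i) & 1 for i in range(width)]

def inv(bit: int) -> int:
    """Logical complement of a single bit."""
    return 1 - bit

def pc_write(state: int) -> int:
    """PCWrite = S3'S2'S1'S0' + S3 S2'S1'S0 (state 0 or state 9)."""
    s0, s1, s2, s3 = bits(state, 4)
    return (inv(s3) & inv(s2) & inv(s1) & inv(s0)) | (s3 & inv(s2) & inv(s1) & s0)

def ns2(state: int, opcode: int) -> int:
    """Next-state bit 2 as the four product terms given on the slide."""
    s0, s1, s2, s3 = bits(state, 4)
    o0, o1, o2, o3, o4, o5 = bits(opcode, 6)
    is_sw = o5 & inv(o4) & o3 & inv(o2) & o1 & o0                      # 101011
    is_rtype = inv(o5) & inv(o4) & inv(o3) & inv(o2) & inv(o1) & inv(o0)  # 000000
    return ((inv(s3) & inv(s2) & s1 & s0)                # state 3 -> 4
            | (inv(s3) & inv(s2) & s1 & inv(s0) & is_sw)  # state 2, sw -> 5
            | (inv(s3) & inv(s2) & inv(s1) & s0 & is_rtype)  # state 1, R-type -> 6
            | (inv(s3) & s2 & s1 & inv(s0)))              # state 6 -> 7

print([s for s in range(10) if pc_write(s)])
print(ns2(2, 0b101011), ns2(2, 0b100011))
```

Each product term corresponds to one horizontal line of the AND plane; the OR plane combines the terms that feed the same output.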
ROM Implementation of
Combinational Control Logic
• Combinational control logic can be expressed as a truth table: inputs are the current state values (S3 - S0) & opcode bits (Op5 - Op0); outputs are the control signals & next state values (NS3 - NS0)
• A ROM can be used to implement a truth table
  • if the address (input) is m bits wide, we can address 2^m entries in the ROM
  • the outputs are the n bits of data that the address points to

Example (m = 3 address bits, n = 4 data bits):

address   data
000       0011
001       1100
010       1100
011       1000
100       0000
101       0001
110       0110
111       0111

ROM Implementation of
Combinational Control Logic
• How many inputs are there?
6 bits for the opcode, 4 bits for the current state = 10 address lines
(i.e., 2^10 = 1024 different addresses)
• How many outputs are there?
16 datapath-control outputs, 4 next-state bits = 20 output bits
• ROM is 2^10 x 20 = 20K bits (and a rather unusual size)
• Rather wasteful, since lots of input combinations (addresses) will never occur, e.g. many opcodes are illegal and some states (e.g. states 10 to 15) are illegal

ROM vs. PLA
• Break up the table into two parts
  • 4 state bits tell you the 16 outputs: 2^4 x 16 bits of ROM
  • 10 bits tell you the 4 next-state bits: 2^10 x 4 bits of ROM
  • Total: 4.3K bits of ROM + small circuit
• PLA is much smaller
  • can share product terms
  • only needs entries that produce an active output
  • can take don't cares into account
• Size is (#inputs × #product-terms) + (#outputs × #product-terms)
For this example, the PLA size is proportional to (10 x 17) + (20 x 17) = 510 PLA cells
• A PLA cell is usually about the size of a ROM cell (bit), only slightly bigger
• PLA is a much more efficient implementation for this control unit
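
The size comparison is plain arithmetic and can be reproduced as below. The variable names are chosen for this sketch; the bit widths and the count of 17 product terms are the figures quoted on the slides.

```python
OPCODE_BITS = 6        # Op5 - Op0
STATE_BITS = 4         # S3 - S0
CONTROL_OUTPUTS = 16   # datapath control signals
NEXT_STATE_BITS = 4    # NS3 - NS0
PRODUCT_TERMS = 17     # product terms needed by the PLA in this example

inputs = OPCODE_BITS + STATE_BITS            # 10 address lines
outputs = CONTROL_OUTPUTS + NEXT_STATE_BITS  # 20 output bits

# One big ROM addressed by state and opcode together
single_rom_bits = 2 ** inputs * outputs

# Two ROMs: outputs depend on the state only, next state on state + opcode
split_rom_bits = 2 ** STATE_BITS * CONTROL_OUTPUTS + 2 ** inputs * NEXT_STATE_BITS

# PLA: AND plane (inputs x terms) plus OR plane (outputs x terms)
pla_cells = inputs * PRODUCT_TERMS + outputs * PRODUCT_TERMS

print(single_rom_bits, split_rom_bits, pla_cells)
```

This gives 20480 bits (20K) for the single ROM, 4352 bits (about 4.3K) for the split ROMs, and 510 cells for the PLA.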

Microprogramming Control
• If the assembly language instruction set becomes very large, the FSM could require hundreds to thousands of states & many arcs (sequences), which is very complex
→ Complex control is better managed by microprogramming
• Basic idea:
  • All control signals in a cycle form a microinstruction; each microinstruction defines:
    • the set of datapath control signals that must be asserted in a given state (cycle)
    • the next microinstruction
  • Executing a microinstruction = asserting the control signals it specifies
  • A sequence of microinstructions forms a microprogram
  • Each cycle, a microinstruction is fetched from the microprogram & executed
• Microprogramming: designing the control as a program that implements machine instructions using simpler microinstructions
• Each control state corresponds to a microinstruction
• Our basic FSM: 10 states → 10 microinstructions

Microinstruction Format
• A microinstruction contains several fields + 1 label
• Each field specifies a non-overlapping set of control signals
• Signals that are never asserted simultaneously may share the same field
• The last field specifies how to choose the next microinstruction
• Label: some microinstructions have a label that serves as a branch target
• In our example, we have 7 fields + 1 label
• 1st to 6th fields: control specification; 7th field: next microinstruction

Field name            Control signals
1. ALU control        Define the operation of the ALU
2. SRC1               Specify the source for the 1st ALU operand
3. SRC2               Specify the source for the 2nd ALU operand
4. Register control   Specify read or write for the register file, and the source of the value for a write
5. Memory             Specify read or write, and the source for memory. For a read, specify the destination register
6. PCWrite control    Specify the writing of the PC
7. Sequencing         Specify how to choose the next microinstruction

A Microprogram
Control Unit
• Microinstructions are placed in a ROM or PLA
• The state (in the state register) enters the PLA or ROM as an input or address to select the current microinstruction, which in turn asserts the relevant control signals
• The state changes at the clock edge
• Sequencing: ways to choose the next microinstruction (next state):
→ increment the current address/state (AddrCtl selects the +1 adder) (Seq)
→ branch to the microinstruction that begins execution of the next MIPS instruction (AddrCtl selects address 0) (Fetch)
→ choose the next microinstruction based on the opcode (AddrCtl selects a dispatch table) (Dispatch)

[Figure: control unit built from a PLA or ROM whose outputs are the datapath control signals and AddrCtl; the state register, a +1 adder, and address select logic fed by Op[5-0] from the instruction register opcode field determine the next state]

A Review of Our
State Diagram

[Figure: the complete FSM state diagram (states 0 to 9), repeated from "The Complete FSM Control"]

Sequencing:
Address Select Logic

[Figure: address select logic. A 4-input mux controlled by AddrCtl chooses the next state from the constant 0 (input 0), dispatch ROM 1 (input 1), dispatch ROM 2 (input 2), or the state register + 1 (input 3); both dispatch ROMs are indexed by the instruction register opcode field]

Dispatch ROM 1
Op       Opcode name   Value
000000   R-format      0110
000010   jmp           1001
000100   beq           1000
100011   lw            0010
101011   sw            0010

Dispatch ROM 2
Op       Opcode name   Value
100011   lw            0011
101011   sw            0101

State number   Address-control action      Value of AddrCtl
0              Use incremented state       3
1              Use dispatch ROM 1          1
2              Use dispatch ROM 2          2
3              Use incremented state       3
4              Replace state number by 0   0
5              Replace state number by 0   0
6              Use incremented state       3
7              Replace state number by 0   0
8              Replace state number by 0   0
9              Replace state number by 0   0

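
The address select logic can be modeled directly from the three tables on this slide. The Python below is a sketch, not the slides' implementation; the names `DISPATCH_ROM_1`, `DISPATCH_ROM_2`, `ADDR_CTL`, `select_next`, and `walk` are chosen for illustration.

```python
# Dispatch ROMs: opcode -> next state (values from the tables above)
DISPATCH_ROM_1 = {0b000000: 6, 0b000010: 9, 0b000100: 8, 0b100011: 2, 0b101011: 2}
DISPATCH_ROM_2 = {0b100011: 3, 0b101011: 5}

# AddrCtl value stored with each state (microinstruction)
ADDR_CTL = {0: 3, 1: 1, 2: 2, 3: 3, 4: 0, 5: 0, 6: 3, 7: 0, 8: 0, 9: 0}

def select_next(state: int, opcode: int) -> int:
    """Model of the 4-input mux driven by AddrCtl."""
    addr_ctl = ADDR_CTL[state]
    if addr_ctl == 0:
        return 0                        # start fetching the next instruction
    if addr_ctl == 1:
        return DISPATCH_ROM_1[opcode]   # dispatch on opcode after decode
    if addr_ctl == 2:
        return DISPATCH_ROM_2[opcode]   # dispatch between lw and sw
    return state + 1                    # addr_ctl == 3: incremented state

def walk(opcode: int) -> list:
    """States visited while executing one instruction."""
    states = [0]
    while (following := select_next(states[-1], opcode)) != 0:
        states.append(following)
    return states

print(walk(0b100011))   # lw
print(walk(0b000100))   # beq
```

Compared with a single next-state table indexed by state and opcode, only states 1 and 2 need to look at the opcode, which is what keeps the dispatch ROMs small.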
A Microprogram
Control Unit
• A microprogram control unit controlling the datapath
→ the ROM or PLA is now the microcode memory (control memory)
→ the state register is now the microprogram counter (µPC)

[Figure: control unit made of microcode storage, whose outputs are the datapath control signals and AddrCtl, and a sequencer consisting of the microprogram counter, a +1 adder, and address select logic fed by Op[5-0] from the instruction register opcode field]

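A Review of
Datapath & Control

[Figure: complete multi-cycle datapath with its control unit, repeated from "Multi-Cycle Datapath & Control"]

Note the reason for each control signal; also note that we have included the jump instruction
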
A Review of the Instruction
Execution Steps

1. Instruction fetch (all instructions):
   IR <= Memory[PC]; PC <= PC + 4; (State 0)

2. Instruction decode (all instructions):
   A <= Reg[IR[25:21]]; B <= Reg[IR[20:16]];
   ALUOut <= PC + (sign-extend(IR[15:0]) << 2); (State 1)

3. Memory address computation (for lw, sw): ALUOut <= A + sign-extend(IR[15:0]); (State 2)
   ALU (R-type): ALUOut <= A op B; (State 6)
   Conditional branch: if (A==B) then PC <= ALUOut; (State 8)
   Jump: PC <= PC[31:28] || (IR[25:0]<<2); (State 9)

4. For lw or sw instructions (access memory):
   MDR <= Memory[ALUOut]; (State 3) or Memory[ALUOut] <= B; (State 5)
   For ALU (R-type) instructions (write result to register): Reg[IR[15:11]] <= ALUOut; (State 7)

5. For the lw instruction only (write data from MDR to register): Reg[IR[20:16]] <= MDR; (State 4)

A Symbolic Microprogram
• A specification methodology
  • appropriate if there are hundreds of opcodes, modes, cycles, etc.
  • signals are specified symbolically using microinstructions
  • E.g. Read PC = read memory using the PC as address and write the result into IR (& MDR) (see next slide for details)
• Our symbolic microprogram with 10 microinstructions:

Label    | ALU control | SRC1 | SRC2    | Register control | Memory    | PCWrite control | Sequencing
Fetch    | Add         | PC   | 4       |                  | Read PC   | ALU             | Seq
         | Add         | PC   | Extshft | Read             |           |                 | Dispatch 1
Mem1     | Add         | A    | Extend  |                  |           |                 | Dispatch 2
LW2      |             |      |         |                  | Read ALU  |                 | Seq
         |             |      |         | Write MDR        |           |                 | Fetch
SW2      |             |      |         |                  | Write ALU |                 | Fetch
Rformat1 | Func code   | A    | B       |                  |           |                 | Seq
         |             |      |         | Write ALU        |           |                 | Fetch
BEQ1     | Subt        | A    | B       |                  |           | ALUOut-cond     | Fetch
JUMP1    |             |      |         |                  |           | Jump address    | Fetch

Microassembler: performs checks to remove combinations that cannot be supported in the datapath

Control Signals for Each Symbol
in Each Field in the Microprogram

Field name       | Value        | Signals active                           | Comment
ALU control      | Add          | ALUOp = 00                               | Cause the ALU to add.
ALU control      | Subt         | ALUOp = 01                               | Cause the ALU to subtract; this implements the compare for branches.
ALU control      | Func code    | ALUOp = 10                               | Use the instruction's function code to determine ALU control.
SRC1             | PC           | ALUSrcA = 0                              | Use the PC as the first ALU input.
SRC1             | A            | ALUSrcA = 1                              | Register A is the first ALU input.
SRC2             | B            | ALUSrcB = 00                             | Register B is the second ALU input.
SRC2             | 4            | ALUSrcB = 01                             | Use 4 as the second ALU input.
SRC2             | Extend       | ALUSrcB = 10                             | Use the output of the sign extension unit as the second ALU input.
SRC2             | Extshft      | ALUSrcB = 11                             | Use the output of the shift-by-two unit as the second ALU input.
Register control | Read         |                                          | Read two registers using the rs and rt fields of the IR as the register numbers, putting the data into registers A and B.
Register control | Write ALU    | RegWrite = 1, RegDst = 1, MemtoReg = 0   | Write a register using the rd field of the IR as the register number and the contents of ALUOut as the data.
Register control | Write MDR    | RegWrite = 1, RegDst = 0, MemtoReg = 1   | Write a register using the rt field of the IR as the register number and the contents of the MDR as the data.
Memory           | Read PC      | MemRead = 1, IorD = 0, IRWrite = 1       | Read memory using the PC as address; write the result into IR (and the MDR).
Memory           | Read ALU     | MemRead = 1, IorD = 1                    | Read memory using ALUOut as address; write the result into MDR.
Memory           | Write ALU    | MemWrite = 1, IorD = 1                   | Write memory using ALUOut as address and the contents of B as the data.
PC write control | ALU          | PCSource = 00, PCWrite = 1               | Write the output of the ALU into the PC.
PC write control | ALUOut-cond  | PCSource = 01, PCWriteCond = 1           | If the Zero output of the ALU is active, write the PC with the contents of the register ALUOut.
PC write control | Jump address | PCSource = 10, PCWrite = 1               | Write the PC with the jump address from the instruction.
Sequencing       | Seq          | AddrCtl = 11                             | Choose the next microinstruction sequentially.
Sequencing       | Fetch        | AddrCtl = 00                             | Go to the first microinstruction to begin a new instruction.
Sequencing       | Dispatch 1   | AddrCtl = 01                             | Dispatch using ROM 1.
Sequencing       | Dispatch 2   | AddrCtl = 10                             | Dispatch using ROM 2.

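
What a microassembler does with this table can be sketched as a lookup from symbolic field values to control signal settings. The Python below is a simplified illustration; the dictionary `FIELD_SIGNALS`, the short field keys, and the function `assemble` are hypothetical names, and a real microassembler would also check for unsupported combinations.

```python
# Symbol -> control signal settings, per field (values from the table above)
FIELD_SIGNALS = {
    "alu": {"Add": {"ALUOp": 0b00}, "Subt": {"ALUOp": 0b01}, "Func code": {"ALUOp": 0b10}},
    "src1": {"PC": {"ALUSrcA": 0}, "A": {"ALUSrcA": 1}},
    "src2": {"B": {"ALUSrcB": 0b00}, "4": {"ALUSrcB": 0b01},
             "Extend": {"ALUSrcB": 0b10}, "Extshft": {"ALUSrcB": 0b11}},
    "reg": {"Read": {},
            "Write ALU": {"RegWrite": 1, "RegDst": 1, "MemtoReg": 0},
            "Write MDR": {"RegWrite": 1, "RegDst": 0, "MemtoReg": 1}},
    "mem": {"Read PC": {"MemRead": 1, "IorD": 0, "IRWrite": 1},
            "Read ALU": {"MemRead": 1, "IorD": 1},
            "Write ALU": {"MemWrite": 1, "IorD": 1}},
    "pc": {"ALU": {"PCSource": 0b00, "PCWrite": 1},
           "ALUOut-cond": {"PCSource": 0b01, "PCWriteCond": 1},
           "Jump address": {"PCSource": 0b10, "PCWrite": 1}},
    "seq": {"Seq": {"AddrCtl": 0b11}, "Fetch": {"AddrCtl": 0b00},
            "Dispatch 1": {"AddrCtl": 0b01}, "Dispatch 2": {"AddrCtl": 0b10}},
}

def assemble(**fields) -> dict:
    """Translate one symbolic microinstruction into control signal values."""
    signals = {}
    for field, symbol in fields.items():
        signals.update(FIELD_SIGNALS[field][symbol])
    return signals

# The "Fetch" microinstruction: Add, PC, 4, Read PC, ALU, Seq
fetch = assemble(alu="Add", src1="PC", src2="4", mem="Read PC", pc="ALU", seq="Seq")
print(fetch)
```

The assembled Fetch microinstruction sets the same signals that state 0 of the FSM asserts, plus AddrCtl for sequencing.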
Maximally vs Minimally
Encoded
• No encoding of control signals in the microinstruction format (horizontal microprogram):
  • 1 bit for each control signal in the datapath operation; e.g. control signals s, t, u, v, w, x, y, z will occupy 8 bits in the microinstruction
  • faster, but requires more memory (logic)
  • used for the VAX 780: an astonishing 400K of control memory!
• Lots of encoding of control signals in the microinstruction format (vertical microprogram):
  • e.g. s, t, u, v, w, x, y, z will be encoded in, say, 4 bits, with 0000 meaning u = 1 (others = 0), 1010 meaning u = w = 1 (others = 0), etc., i.e. each combination of signals that is actually used gets its own code
  • send the microinstructions through logic to get the control signals
  • uses less memory, but slower
• Select a good trade-off
• Microcode implementation: on-chip vs off-chip
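
The difference between the two encodings can be illustrated with the eight signals s to z from the example above. In the Python sketch below, `CODE_BOOK` holds only the two codes mentioned on the slide; the rest of the code book, and the names `horizontal_word` and `decode_vertical`, are assumptions of this illustration.

```python
SIGNALS = ["s", "t", "u", "v", "w", "x", "y", "z"]

def horizontal_word(asserted: set) -> str:
    """Horizontal format: one bit per control signal, no decoding needed."""
    return "".join("1" if name in asserted else "0" for name in SIGNALS)

# Vertical format: a short code stands for a whole combination of signals.
# Only the two codes given on the slide are listed here.
CODE_BOOK = {0b0000: {"u"}, 0b1010: {"u", "w"}}

def decode_vertical(code: int) -> str:
    """The extra decode step that turns a 4-bit code into the 8 signal bits."""
    return horizontal_word(CODE_BOOK[code])

print(horizontal_word({"u", "w"}))   # 8 bits stored per microinstruction
print(decode_vertical(0b1010))       # 4 bits stored, decoded on use
```

The lookup in `decode_vertical` stands in for the decoding logic that makes the vertical format slower, while the shorter code is what saves control memory.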


Exceptions
• Exception: unexpected event from within the processor (e.g. arithmetic overflow)
• Interrupt: “unexpected” event from outside of the processor (e.g. from an I/O device)

Type of event                  From where?   MIPS terminology
I/O device request             External      Interrupt
Invoke OS from user program    Internal      Exception
Arithmetic overflow            Internal      Exception
Using undefined instruction    Internal      Exception
Hardware malfunctions          Either        Exception or interrupt

• An exception or an interrupt causes an unexpected change in control flow: how does the control unit handle an exception/interrupt?
→ In case of an exception, the processor should:
  → save the address of the offending instruction in the exception program counter (EPC)
  → indicate the reason for the exception in the Cause register (status register)
  → transfer control to the operating system at some specified address (the OS can then provide some service: taking a predefined action in response to overflow, or stopping the program & reporting an error). If the OS continues program execution, it uses EPC to determine where to restart
→ Another way is vectored interrupts:
  → the address to which control is transferred is determined by the cause of the exception

Exception Handling
by Control Unit
→ Control unit:
  → two more control signals: EPCWrite & CauseWrite; also IntCause
  → change the mux feeding the PC to a 4-way mux so that the exception address can be written to the PC (the exception address is the OS entry point for exception handling, and is 8000 0180 hex for MIPS)
→ To handle two types of exceptions: undefined instruction & arithmetic overflow
  → add two states to the state diagram to do the above: one entered when no next state is defined for the op value at state 1 (then → state 10), the other entered when overflow is detected from the ALU in state 7 (then → state 11)
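
The effect of the two added exception states can be sketched as follows. This is an illustration under stated assumptions, not taken from the slides: the cause encodings (0 for undefined instruction, 1 for arithmetic overflow) and the use of PC - 4 as the offending instruction's address (because the PC was already incremented during fetch) are assumed here, and the function name `take_exception` is hypothetical.

```python
EXCEPTION_ADDRESS = 0x80000180     # OS entry point given on the slide

# Assumed IntCause encodings for the two exception types handled here
CAUSE_UNDEFINED_INSTRUCTION = 0
CAUSE_ARITHMETIC_OVERFLOW = 1

def take_exception(pc: int, cause: int) -> dict:
    """Register updates performed when an exception state is entered.

    pc is the PC value at that time; it was already incremented by 4 in
    the fetch step, so the offending instruction is assumed to be at pc - 4.
    """
    return {
        "EPC": (pc - 4) & 0xFFFFFFFF,   # EPCWrite: save offending address
        "Cause": cause,                 # CauseWrite: record the reason
        "PC": EXCEPTION_ADDRESS,        # PC mux selects the exception address
    }

# Overflow detected while executing the instruction fetched from 0x0040000C
print(take_exception(0x00400010, CAUSE_ARITHMETIC_OVERFLOW))
```

After the handler runs, the OS can use the saved EPC value to decide where to resume the program.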

Chapter Summary
• Part 1:
  • Elements of datapath: instruction subset, resources, clocking method
  • Datapath for different instruction classes
  • Building single-cycle datapath: multiplexors, functional units, control signals
  • Single-cycle datapath control unit logic: ALU control, main control
  • Single-cycle datapath & control: complete picture, critical path, problems
• Part 2:
  • Multi-cycle datapath: approach, additional registers & multiplexors, control signals
  • Breaking instructions into execution steps
  • Multi-cycle datapath & control: complete picture
  • Finite state machine (FSM) (hardwired) control & controller implementation
  • Microprogramming: control, microinstruction format, controller implementation, symbolic microprogram & its control signals, issues
  • Exception Handling