Pipeline: Introduction
Lecturer: Prof. Hong Jiang
Courtesy of Prof. Yifeng Zhu, U of Maine
Fall, 2006
CSCE430/830
Pipeline
Pipelining Outline
Introduction
Defining Pipelining
Pipelining Instructions
Hazards
Structural hazards
Data Hazards
Control Hazards
Performance
Controller implementation
What is Pipelining?
A way of speeding up execution of instructions
Key idea:
overlap execution of multiple instructions
If we do laundry sequentially...
[Figure: timeline from 6 PM to 2 AM; loads A, B, C, D in task order, each run to completion before the next starts]
Time Required: 8 hours for 4 loads
[Figure: timeline of loads A, B, C, D in task order, overlapped in 30-minute stages]
Time Required: 3.5 Hours for 4 Loads
Pipelining doesn't help the latency of a single task; it helps the throughput of the entire workload
The pipeline rate is limited by the slowest pipeline stage
Multiple tasks operate simultaneously
Potential speedup = number of pipe stages
Unbalanced lengths of pipe stages reduce speedup
Time to fill the pipeline and time to drain it reduce speedup
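These lessons can be checked numerically. The following is a minimal sketch, not a model of any real machine: it assumes every stage advances in lockstep at the pace of the slowest stage, and that the laundry example uses four 30-minute stages per load (an assumption that reproduces the 8-hour and 3.5-hour totals). The function names are illustrative.

```python
def unpipelined_time(stage_times, n_tasks):
    """Each task runs all of its stages before the next task starts."""
    return n_tasks * sum(stage_times)


def pipelined_time(stage_times, n_tasks):
    """Stages advance in lockstep, paced by the slowest stage.

    The first task needs len(stage_times) steps to fill the pipeline;
    each later task completes one step after the previous one.
    """
    step = max(stage_times)
    return (len(stage_times) + n_tasks - 1) * step


def speedup(stage_times, n_tasks):
    return unpipelined_time(stage_times, n_tasks) / pipelined_time(stage_times, n_tasks)


balanced = [30, 30, 30, 30]      # assumed laundry stages, in minutes
unbalanced = [30, 40, 20, 30]    # same total work, uneven stages

print(unpipelined_time(balanced, 4) / 60)   # hours for 4 loads, sequential
print(pipelined_time(balanced, 4) / 60)     # hours for 4 loads, pipelined
print(speedup(balanced, 1))                 # a single task gains nothing
print(speedup(balanced, 1000))              # approaches the number of stages
print(speedup(unbalanced, 1000))            # limited by the 40-minute stage
```

With only four loads, fill and drain time keep the speedup well below 4; with many loads it approaches 4 only when the stages are balanced.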
[Figure: 1ns of logic divided into five 200ps stages separated by pipeline registers]
Pipelined: 1 operation finishes every 200ps
Limitations:
Computations must be divisible into stage size
Pipeline registers add overhead
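Both limitations can be illustrated with the 1ns, five-stage example. This is a rough sketch under stated assumptions: the 20ps register overhead and the uneven stage split are hypothetical values chosen for illustration, and the unpipelined time is taken to be the sum of the stage delays with no register.

```python
def cycle_time_ps(stage_delays_ps, register_overhead_ps):
    """Clock period: the slowest stage plus the pipeline register overhead."""
    return max(stage_delays_ps) + register_overhead_ps


def throughput_speedup(stage_delays_ps, register_overhead_ps):
    """Ratio of unpipelined time per operation to pipelined clock period."""
    unpipelined_ps = sum(stage_delays_ps)
    return unpipelined_ps / cycle_time_ps(stage_delays_ps, register_overhead_ps)


even_split = [200, 200, 200, 200, 200]     # 1ns divided evenly, as in the figure
uneven_split = [300, 100, 200, 200, 200]   # hypothetical: logic that does not divide evenly

ideal = throughput_speedup(even_split, 0)         # no register overhead
with_regs = throughput_speedup(even_split, 20)    # hypothetical 20ps per register
uneven = throughput_speedup(uneven_split, 20)     # slowest stage sets the clock

print(ideal, with_regs, uneven)
```

The ideal case gives a speedup equal to the number of stages; register overhead and an uneven split each pull it lower.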
Pipelining a Processor
[Figure: instructions in program order, overlapped in time; each passes through Ifetch, Reg, ALU, DMem, Reg]
Pipeline example: lw
[Figures: one slide per stage, in the order IF, ID, EX, MEM, WB]
[Figure: execution timelines for lw $1, 100($0); lw $2, 200($0); lw $3, 300($0) in instruction order; each instruction passes through Instruction Fetch, REG RD, ALU, MEM, REG WR]
Not pipelined: a new instruction starts every 800ps
Pipelined: a new instruction starts every 200ps
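The totals for the three lw instructions can be worked out from the 800ps and 200ps figures. This is a minimal sketch, assuming the pipelined machine gives every stage one full 200ps clock cycle and starts a new instruction each cycle; the function names are illustrative.

```python
STAGES = ["Instruction Fetch", "REG RD", "ALU", "MEM", "REG WR"]
INSTRUCTION_PS = 800   # time per lw without pipelining
CLOCK_PS = 200         # pipelined clock period


def unpipelined_total_ps(n_instructions):
    """Each instruction finishes before the next one starts."""
    return n_instructions * INSTRUCTION_PS


def pipelined_total_ps(n_instructions):
    """The first instruction fills the pipeline; each later one adds one cycle."""
    return (len(STAGES) + n_instructions - 1) * CLOCK_PS


def pipelined_schedule(instructions):
    """Map each instruction to a list of (stage, start time in ps) pairs."""
    schedule = {}
    for i, instr in enumerate(instructions):
        first_start = i * CLOCK_PS
        schedule[instr] = [
            (stage, first_start + s * CLOCK_PS) for s, stage in enumerate(STAGES)
        ]
    return schedule


program = ["lw $1, 100($0)", "lw $2, 200($0)", "lw $3, 300($0)"]

print(unpipelined_total_ps(len(program)))   # total ps without pipelining
print(pipelined_total_ps(len(program)))     # total ps with pipelining
for instr, stages in pipelined_schedule(program).items():
    print(instr, stages)
```

Under these assumptions a single lw takes five cycles, or 1000ps, in the pipeline, which is longer than its 800ps unpipelined time: pipelining improves throughput, not the latency of one instruction.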
Speedup
Consider the unpipelined processor introduced previously. Assume that
it has a 1 ns clock cycle and that it uses 4 cycles for ALU operations and
branches and 5 cycles for memory operations. Assume that the relative
frequencies of these operations are 40%, 20%, and 40%, respectively.
Suppose that, due to clock skew and setup, pipelining the processor
adds 0.2 ns of overhead to the clock. Ignoring any latency impact, how
much speedup in the instruction execution rate will we gain from a
pipeline?
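Average instruction execution time (unpipelined)
= 1 ns * ((40% + 20%)*4 + 40%*5)
= 4.4 ns
Average instruction execution time (pipelined), assuming one instruction completes per clock cycle
= 1 ns + 0.2 ns = 1.2 ns
Speedup from pipeline
= Average instruction time unpipelined / Average instruction time pipelined
= 4.4 ns / 1.2 ns ≈ 3.7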
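The same calculation can be written as a short script, which makes it easy to try other instruction mixes or overheads. This is a sketch of the arithmetic above, assuming an ideal pipeline that completes one instruction per clock cycle; the variable names are illustrative.

```python
CLOCK_NS = 1.0      # unpipelined clock cycle
OVERHEAD_NS = 0.2   # added by clock skew and setup when pipelining

# Operation class -> (relative frequency, cycles on the unpipelined processor)
mix = {
    "alu": (0.40, 4),
    "branch": (0.20, 4),
    "memory": (0.40, 5),
}

# Weighted average number of cycles per instruction without pipelining
avg_cycles = sum(freq * cycles for freq, cycles in mix.values())

unpipelined_ns = CLOCK_NS * avg_cycles
pipelined_ns = CLOCK_NS + OVERHEAD_NS   # one instruction per (longer) cycle
speedup = unpipelined_ns / pipelined_ns

print(f"unpipelined: {unpipelined_ns:.1f} ns")
print(f"pipelined:   {pipelined_ns:.1f} ns")
print(f"speedup:     {speedup:.1f}")
```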
Pipeline Hazards
Limits to pipelining: hazards prevent the next instruction from executing during its designated clock cycle
Structural hazards: two different instructions use the same hardware in the same cycle
Data hazards: an instruction depends on the result of a prior instruction still in the pipeline
Control hazards: pipelining of branches and other instructions that change the PC
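As an illustration of the data hazard definition only, the sketch below scans a short instruction sequence for cases where an instruction reads a register written by a nearby earlier instruction. Everything here is hypothetical: the `Instr` record, the example program, and especially `window`, which stands in for how many following instructions issue while a result is still in the pipeline. The real value depends on the pipeline design.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    text: str
    dest: Optional[str]       # register written, if any
    srcs: Tuple[str, ...]     # registers read


def find_data_hazards(program: List[Instr], window: int = 2):
    """Return (producer index, consumer index, register) triples.

    A pair is reported when the consumer reads the producer's destination
    and follows it by at most `window` instructions (an assumed distance).
    """
    hazards = []
    for i, producer in enumerate(program):
        if producer.dest is None:
            continue
        for j in range(i + 1, min(i + 1 + window, len(program))):
            if producer.dest in program[j].srcs:
                hazards.append((i, j, producer.dest))
            if program[j].dest == producer.dest:
                break  # register overwritten; later reads depend on instruction j
    return hazards


# Illustrative MIPS-style sequence, not taken from the slides
program = [
    Instr("lw  $1, 100($0)", "$1", ("$0",)),
    Instr("add $2, $1, $1", "$2", ("$1", "$1")),
    Instr("sw  $2, 200($0)", None, ("$2", "$0")),
    Instr("add $3, $1, $2", "$3", ("$1", "$2")),
]

for producer, consumer, reg in find_data_hazards(program):
    print(f"{program[consumer].text!r} needs {reg} from {program[producer].text!r}")
```

Structural and control hazards are not modeled here; they depend on the hardware resources and on how branches are resolved.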