Professional Documents
Culture Documents
Outline:
the ARM 3-stage pipeline the ARM7TDMI core the ARM 5-stage pipeline the ARM9TDMI core the ARM10TDMI core
fetch
the instruction is fetched from memory
decode
the instruction is decoded and the datapath control signals prepared for the next cycle
execute
the operands are read from the register bank, shifted, combined in the ALU and the result written back
Cores - v4 - 1
Cores - v4 - 2
fetch
decode
execute
3 instruction
fetch
decode
execute time
fetch STR
decode
fetch ADD
decode
execute
fetch ADD
decode
execute
5 instruction
execute
Cores - v4 - 3
Cores - v4 - 4
PC behaviour
r15 increments twice before an instruction executes
due to pipeline operation
ARM components:
register bank
2 read ports, 1 write port plus additional read and write ports for r15
barrel shifter ALU address register and incrementer memory data registers instruction decoder and control
Cores - v4 - 6
control
incrementer
Outline:
the ARM 3-stage pipeline the ARM7TDMI core the ARM 5-stage pipeline the ARM9TDMI core the ARM10TDMI core
barrel shifter
ALU
Cores - v4 - 7
Cores - v4 - 8
The ARM7TDMI
ARM7TDMI organization
scan chain 2 extern0 extern1 opc, r/w, mreq, trans, mas[1:0] A[31:0] D[31:0]
Embedded ICE
scan chain 0
processor core
scan chain 1
other signals
Din[31:0] Dout[31:0]
bus splitter
mclk wait eclk bigend irq fiq isync reset enin enout enouti abe ale ape dbe tbe busen highz busdis ecapclk dbgrq breakpt dbgack exec extern1 extern0 dbgen rangeout0 rangeout1 dbgrqi commrx commtx opc cpi cpa cpb Vdd Vss
A[31:0] Din[31:0] Dout[31:0] D[31:0] bl[3:0] r/w mas[1:0] mreq seq lock trans mode[4:0] abort Tbit
interrupts
memory interface
ARM7TDMI
initialization
bus control
ARM7TDMI core
tapsm[3:0] ir[3:0] tdoen tck1 tck2 screg[3:0] drivebs ecapclkbs icapclkbs highz pclkbs rstclkbs sdinbs sdoutbs shclkbs shclk2bs TRST TCK TMS TDI TDO
TAP information
debug
ARM7TDMI characteristics:
0.35 m 3 3.3 V Transistors Core area Clock 74,209 2 2.1 mm 0 to 66 MHz MIPS Power MIPS/W 60 87 mW 690
JTAG controls
Cores - v4 - 11
Cores - v4 - 12
ARM7TDMI
Outline:
the ARM 3-stage pipeline the ARM7TDMI core the ARM 5-stage pipeline the ARM9TDMI core the ARM10TDMI core
Cores - v4 - 13
Cores - v4 - 14
Execute
shift and ALU
Memory
data memory access
Write-back
Cores - v4 - 16
Cores - v4 - 15
ARM9TDMI
The ARM9TDMI is
a classic Harvard architecture 5-stage pipeline
separate instruction and data memory ports
with full support for Thumb and EmbeddedICE debug aimed at significantly higher performance than the ARM7TDMI
enhanced pipeline operates at 100-200 MHz
2001 PEVEIT Unit - ARM System Design Cores - v4 - 18
Cores - v4 - 17
ARM9TDMI pipeline
ARM7TDMI: Fetch
instruction fetch
ARM9TDMI pipeline
Execute
reg read shift/ALU reg write
next pc
+4 I-cache fetch
pc + 4
pc + 8 r15
Decode
Thumb decompress ARM decode
register read
mul +4
postindex
shift ALU
reg shift
pre-index
execute
forwarding paths
ARM9TDMI:
instr uction fetch r. read decode shift/ALU data memor y access reg write
mux
B, BL MOV pc SUBS pc
Fetch
Decode
Execute
Memory
Write
load/store address
D-cache
rot/sgn ex
LDR pc
register write
write-back
Cores - v4 - 20
ARM9TDMI
ARM9TDMI
EmbeddedICE
as ARM7TDMI, plus:
hardware single-stepping breakpoints on exceptions
Cores - v4 - 21
Cores - v4 - 22
ARM10TDMI
ARM10TDMI pipeline
branch prediction instruction fetch r. read decode addr. calc. shift/ALU multiply data memory access multiplier par tials add data write reg write
The ARM10TDMI is
aimed at significantly higher performance than the ARM9TDMI achieved through use of:
higher clock rate 64-bit I- and D-memory buses branch prediction hit-under-miss D-memory interface
decode
Fetch
Issue
Decode
Execute
Memory
Write
6-stage pipeline
2001 PEVEIT Unit - ARM System Design Cores - v4 - 23 2001 PEVEIT Unit - ARM System Design Cores - v4 - 24