Professional Documents
Culture Documents
Assessing &
Understanding Performance
Major Goals
Measuring Performance
CPU time:
Throughput:
Total number of jobs completed per unit time
Computer administrators care about it
Job B
5s
5s
Overlap
Job A
3s
Job B
2s
3s
(overlap)
With Pipelining
Two Pipelines
Example:
Do the following changes to a computer system increase throughput,
decrease response time, or both?
Replacing the processor in a computer with a faster version
Both
Adding additional processors to a system
Throughput increases
Individual response time does not decrease
How about average response time when heavily loaded?
Decreasing response time almost always improves throughput
In real system, execution time and throughput often affect each other
We will focus primarily on execution time for a single job
1. Methodologies
performance X =
1
execution time X
Performance Comparison
10
n=
n=
Example:
Clock Cycles
11
Clock rate
Example
A 200 MHz clock has a cycle time of 1 / (200x106) = 5x10-9 = 5 ns
12
Wrong assumption
# of CPU clock cycles in a program = # of instructions in the program
Reasons
Execution times for different type of instructions may vary
Multiplication takes more time than addition
Floating-point operations take more time than integer ones
Accessing memory takes more time than accessing registers
Problems of using # of instructions as the metric
When comparing two machines with same ISA,
Same # of instructions do not imply same performance!
What should we use then? Answer: Cycles per instructions (CPI)
13
Meaning
Average # of clock cycles each instruction takes to execute in a program
Definition
CPI =
14
CPI = (CPIi C i )
i=1
Ci
i=1
15
Description
A
2
4
B
1
1
C
2
1
Problem to solve
Example - Answer
16
CPI1 =
CPI:
10
=2
5
CPI2 =
9
= 1.5
6
Remarks:
IRON LAW !
17
18
Time =
seconds
program
# of instructions
a program
# of clocks
# of instructions
seconds
clocks
CPI
Clock rate
19
Description
Two implementations of the same ISA, for a program
In computer A with clock cycle time 250ps, CPI is 2.0
In computer B with clock cycle time 500ps, CPI of 1.2
Question
Which computer is faster for this program, and by how much?
Answer (assume the program has I instructions)
20
21
22
MIPS =
instruction count
execution time 10 6
MIPS can vary inversely with performance
23
MFLOPS =
24
2. Benchmarks
25
26
Example:
Computer
Benchmark
1 sec
10 sec
20 sec
1000 sec
100 sec
20 sec
AM
500.5 sec
55 sec
20 sec
Weight
i =1
Weight
i =1
27
Timei
=1
Example:
Benchmark
Computer
Weight
W1
W2
10
20
0.91
0.999
1000
100
20
0.09
0.001
1001
110
40
AM
500.5
55
20
Weighted AM using W1
90.91
18.1
20
Weighted AM using W2
1.999
10.09
20
28
ratio
i =1
Example:
Benchmark
Speedup (ratios)
Computer
A/B
1 sec
10 sec
0.1
B/A
10
1000 sec
100 sec
10
0.1
AM
5.05
5.05
GM
1.0
1.0
29
10