Professional Documents
Culture Documents
Performance
Performance measures
Speedup laws
Scalability principles
Scaling up vs. scaling down
EENG-630 Chapter 3
Parallelism profiles
Asymptotic speedup factor
System efficiency, utilization and quality
Standard performance measures
EENG-630 Chapter 3
Degree of parallelism
Reflects the matching of software and
hardware parallelism
Discrete time function measure, for each
time period, the # of processors used
Parallelism profile is a plot of the DOP as a
function of time
Ideally have unlimited resources
EENG-630 Chapter 3
Algorithm structure
Program optimization
Resource utilization
Run-time conditions
Realistically limited by # of available
processors, memory, and other
nonprocessor resources
EENG-630 Chapter 3
EENG-630 Chapter 3
Average parallelism
Total amount of work performed is
proportional to the area under the profile
curve
t2
W DOP(t )dt
t1
W i ti
i 1
EENG-630 Chapter 3
Average parallelism
1 t2
A
DOP
(
t
)
dt
t 2 t1 t 1
m
m
A i ti / ti
i 1
i 1
EENG-630 Chapter 3
EENG-630 Chapter 3
Asymptotic speedup
m
Wi
T (1) ti (1)
i 1
i 1
m
Wi
T ( ) ti ( )
i 1
i 1 i
T (1)
S
T ( )
i 1
m
W / i
i
i 1
(response time)
EENG-630 Chapter 3
Performance measures
Consider n processors executing m
programs in various modes
Want to define the mean performance of
these multimode computers:
Arithmetic mean performance
Geometric mean performance
Harmonic mean performance
EENG-630 Chapter 3
10
Ra Ri / m
i 1
m
R ( f i Ri )
*
a
i 1
11
Rg R
i 1
m
1/ m
i
R Ri
*
g
i 1
fi
12
1 m
1 m 1
Ta Ti
m i 1
m i 1 Ri
EENG-630 Chapter 3
13
m
m
(1 / R )
i 1
R
*
h
1
m
( f
i 1
/ Ri )
-corresponds to total # of operations divided by
the total time (closest to the real performance)
EENG-630 Chapter 3
14
S T1 / T
*
i 1
f i / Ri
EENG-630 Chapter 3
15
EENG-630 Chapter 3
16
Amdahls Law
Assume Ri = i, w = (, 0, 0, , 1- )
System is either sequential, with probability
, or fully parallel with prob. 1-
n
Sn
1 (n 1)
Implies S 1/ as n
EENG-630 Chapter 3
17
Speedup Performance
EENG-630 Chapter 3
18
System Efficiency
O(n) is the total # of unit operations
T(n) is execution time in unit time steps
T(n) < O(n) and T(1) = O(1)
S ( N ) T (1) / T (n)
S (n) T (1)
E ( n)
n
nT (n)
EENG-630 Chapter 3
19
O ( n)
U ( n) R ( n) E ( n)
nT
(
n
)
EENG-630 Chapter 3
20
Quality of Parallelism
Directly proportional to the speedup and
efficiency and inversely related to the
redundancy
Upper-bounded by the speedup S(n)
3
S ( n) E ( n)
T (1)
Q ( n)
2
R ( n)
nT (n)O(n)
EENG-630 Chapter 3
21
Example of Performance
Given O(1) = T(1) = n3, O(n) = n3 + n2log n,
and T(n) = 4 n3/(n+3)
S(n)
E(n)
R(n)
U(n)
Q(n)
=
=
=
=
=
(n+3)/4
(n+3)/(4n)
(n + log n)/n
(n+3)(n + log n)/(4n2)
(n+3)2 / (16(n + log n))
EENG-630 Chapter 3
22
Dhrystone results
Measure of integer performance
Whestone results
Measure of floating-point performance
23
Drug design
High-speed civil transport
Ocean modeling
Ozone depletion research
Air pollution
Digital anatomy
EENG-630 Chapter 3
24
Fixed-time model
Demands constant program execution time
Fixed-memory model
Limited by the memory bound
EENG-630 Chapter 3
25
Algorithm Characteristics
26
Isoefficiency Concept
Relates workload to machine size n needed
to maintain a fixed efficiency
w( s )
E
w( s ) h( s, n)
workload
overhead
27
Isoefficiency Function
To maintain a constant E, w(s) should grow
in proportion to h(s,n)
E
w( s )
h( s, n)
1 E
C = E/(1-E) is constant for fixed E
f E ( n) C h( s, n)
EENG-630 Chapter 3
28
Gustafsons law
for scaled problems (problem size increases
with increased machine size)
Speedup model
for scaled problems bounded by memory
capacity
EENG-630 Chapter 3
29
Amdahls Law
As # of processors increase, the fixed load
is distributed to more processors
Minimal turnaround time is primary goal
Speedup factor is upper-bounded by a
sequential bottleneck
Two cases:
DOP < n
DOP n
EENG-630 Chapter 3
30
Wi
ti (i )
i
m
i
n
Wi
T ( n)
i 1 i
i
n
T (1)
Sn
T ( n)
Wi
t i ( n ) t i ( )
i
m
W
i i
Wi
i 1 i
i
n
EENG-630 Chapter 3
31
Gustafsons Law
With Amdahls Law, the workload cannot
scale to match the available computing
power as n increases
Gustafsons Law fixes the time, allowing
the problem size to increase with higher n
Not saving time, but increasing accuracy
EENG-630 Chapter 3
32
Fixed-time Speedup
As the machine size increases, have
increased workload and new profile
In general, Wi > Wi for 2 i m and
= W1
Assume T(1) = T(n)
EENG-630 Chapter 3
W1
33
Wi
Wi
i 1
i 1 i
m'
S
'
n
W
i 1
m
Q
(
n
)
n
'
W
i 1
'
W1 nWn
W1 Wn
EENG-630 Chapter 3
34
EENG-630 Chapter 3
35
Fixed-Memory Speedup
Let M be the memory requirement and W
the computational workload: W = g(M)
g*(nM)=G(n)g(M)=G(n)Wn
m*
S
*
n
W
i 1
m*
Wi *
i 1 i
i
n Q(n)
W1 G (n)Wn
W1 G (n)Wn / n
EENG-630 Chapter 3
36
37
Scalability Metrics
Machine size (n) : # of processors
Clock rate (f) : determines basic m/c cycle
Problem size (s) : amount of computational
workload. Directly proportional to T(s,1).
CPU time (T(s,n)) : actual CPU time for
execution
I/O demand (d) : demand in moving the
program, data, and results for a given run
EENG-630 Chapter 3
38
Scalability Metrics
Memory capacity (m) : max # of memory words
demanded
Communication overhead (h(s,n)) : amount of
time for interprocessor communication,
synchronization, etc.
Computer cost (c) : total cost of h/w and s/w
resources required
Programming overhead (p) : development
overhead associated with an application program
EENG-630 Chapter 3
39
T ( s,1)
S ( s, n)
T ( s, n) h( s, n)
S ( s , n)
E ( s, n)
n
EENG-630 Chapter 3
40
Scalable Systems
Ideally, if E(s,n)=1 for all algorithms and
any s and n, system is scalable
Practically, consider the scalability of a m/c
S ( s, n) TI ( s, n)
( s, n)
S I ( s, n) T ( s, n)
EENG-630 Chapter 3
41