Measuring and Reporting

Benchmark programs should be derived from how actual applications will execute.
However, performance is often the result of the combined characteristics of a
computer architecture and of system software and hardware components beyond the
microprocessor. Factors such as the operating system, compilers, libraries,
memory design, and I/O subsystem characteristics can also affect the results
and make comparisons difficult.

4.1 Measuring Performance

There are two ways to measure performance:
1. The speed measure, which measures how fast a computer completes a
single task. For example, SPECint95 is used to compare the ability of
computers to complete single tasks.
2. The throughput measure, which measures how many tasks a computer can
complete in a certain amount of time. SPECint_rate95 measures the rate
at which a machine carries out a number of tasks.
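
The difference between the two measures can be sketched in a few lines of Python. This is only an illustration, not how the SPEC suites are implemented: the `task` workload, the function names, and the 0.5 second window are all assumptions made for the example.

```python
import time


def task(n=200_000):
    """Hypothetical stand-in workload: sum of squares below n."""
    return sum(i * i for i in range(n))


def measure_speed(fn):
    """Speed measure: elapsed seconds for one run of a single task."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def measure_throughput(fn, window_seconds=0.5):
    """Throughput measure: tasks completed per second over a time window."""
    completed = 0
    start = time.perf_counter()
    # Keep starting new tasks until the window has elapsed.
    while time.perf_counter() - start < window_seconds:
        fn()
        completed += 1
    elapsed = time.perf_counter() - start
    return completed / elapsed


if __name__ == "__main__":
    print(f"speed: {measure_speed(task):.4f} s per task")
    print(f"throughput: {measure_throughput(task):.1f} tasks per second")
```

Because this sketch runs its tasks one after another, the throughput figure is roughly the reciprocal of the speed figure. The two measures diverge on systems that can work on several tasks at once, which is where a separate throughput measure becomes informative.
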
4.2 Interpreting Results

There are three important guidelines to remember when interpreting benchmark
results:
1. Be aware of what is being measured. When making critical purchasing
decisions based on results from standard benchmarks, it is very important to know
what is actually being measured. Otherwise, it is difficult to tell whether the
measurements obtained are even relevant to the applications that will run on the
system being purchased. One question to consider: does the benchmark measure
the overall performance of the system, or just components of the system such as the
CPU or memory?
2. Representativeness is key. How close is the benchmark to the actual
application being executed? The closer it is, the better it will be at predicting
performance. For example, a component-level benchmark would not be a good
predictor of performance for an application that uses the entire system.
Conversely, application benchmarks are the most accurate predictors of
performance for individual applications.
3. Avoid single-measure metrics. Application performance should not be
measured with just a single number. No single numerical measurement can
completely describe the performance of a complex device like the CPU or the entire
system. Also, try to avoid benchmarks that average several results into a single
measurement, because important information may be lost in average values. Try to
evaluate all the results from different benchmarks that are relevant to the
application. This may give a more accurate picture than evaluating the results
from one benchmark alone.
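
The third guideline can be illustrated with a small Python sketch. The scores below are invented for the example (hypothetical numbers, higher is better) and are not measured results. The system and benchmark names are likewise assumptions.

```python
from statistics import mean

# Hypothetical scores, higher is better. These are NOT measured results.
scores = {
    "system_a": {"integer": 10.0, "floating_point": 10.0, "io": 10.0},
    "system_b": {"integer": 19.0, "floating_point": 9.0, "io": 2.0},
}


def summarize(results):
    """Collapse each system's results into a single arithmetic mean."""
    return {name: mean(list(per_bench.values()))
            for name, per_bench in results.items()}


def best_per_benchmark(results):
    """Report which system scores highest on each individual benchmark."""
    benchmarks = list(next(iter(results.values())).keys())
    return {b: max(results, key=lambda system: results[system][b])
            for b in benchmarks}


if __name__ == "__main__":
    print(summarize(scores))
    print(best_per_benchmark(scores))
```

Both hypothetical systems have the same mean score, yet they are far from equivalent: one is much stronger on the integer benchmark and much weaker on I/O. A reader given only the averaged figure could not tell which system suits an I/O-heavy application.
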
4.3 Reporting Performance

There are some points to remember when reporting results obtained from running
benchmarks.
Use the newer version over the older. If an updated and revised version of a
benchmark suite is available, it is usually preferred over the outdated one.
Generally there are good reasons for revising the original. These include, but
are not limited to, changes in technology and improvements in compiler efficiency.
Use all programs in a suite. There may be legitimate reasons why only a
subset was used, but they should be explained. Otherwise, someone looking
at the results may become suspicious as to why the other programs were not
considered. Explain the selection process, why it was not arbitrary, and
why it was useful.
Report compilation mode. The compilation mode that was used is
important and should be reported in every case. The effect of a certain new
hardware feature may depend on whether it is applied to optimized or
unoptimized programs.
Use a variety of benchmarks when reporting performance. Generally it
is a good idea to use another set of programs as additional test cases. One set
of benchmarks may behave differently from another, and such
observations may be useful for the next round of benchmark selection.
List all factors affecting performance. Provide enough information about
performance measurements to allow readers to duplicate the results. This
includes:
1. program input
2. version of the program
3. version of the compiler
4. optimization level of the compiled code
5. version of the operating system
6. amount of main memory
7. number and types of disks
8. version of the CPU
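
One way to make sure none of the eight factors is left out is to record them in a fixed structure alongside the results. The following Python sketch is only a suggestion: the `BenchmarkEnvironment` class, its field names, and the `format_report` helper are hypothetical, and the angle-bracket placeholders stand for values that must be filled in by hand. Only the operating system and CPU fields are filled automatically, using the standard `platform` module.

```python
import platform
from dataclasses import asdict, dataclass


@dataclass
class BenchmarkEnvironment:
    """Hypothetical record of the factors a benchmark report should list."""
    program_input: str
    program_version: str
    compiler_version: str
    optimization_level: str
    os_version: str
    main_memory: str
    disks: str
    cpu_version: str


def format_report(env):
    """Render one 'label: value' line per recorded factor, in field order."""
    return "\n".join(
        f"{name.replace('_', ' ')}: {value}"
        for name, value in asdict(env).items()
    )


if __name__ == "__main__":
    env = BenchmarkEnvironment(
        program_input="<describe the input data set>",
        program_version="<program version>",
        compiler_version="<compiler name and version>",
        optimization_level="<optimization flags used>",
        # Filled in from the running system; the strings vary by platform.
        os_version=platform.platform(),
        main_memory="<amount of main memory>",
        disks="<number and types of disks>",
        cpu_version=platform.processor() or platform.machine() or "unknown",
    )
    print(format_report(env))
```

Making every field mandatory means the record cannot be created with a factor missing, which is the point of the checklist above.
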
