
ACA PROJECT REPORT

PERFORMANCE MEASURE OF PROCESSOR USING SIMPLESCALAR

Submitted To: Mrs. Aparna P, Electronics & Communication, NITK
Submitted By:
ASHWINI RAO (08EC14)
TEJUSREE K (08EC26)
B.S.PRIYANKA (08EC16)
PRIYANKA .S (08EC52)
Date of Submission: 14/11/2011

INTRODUCTION
Computer architects are constantly researching new techniques in microprocessor design. Instead of spending time and money on implementing new designs in hardware, they first test these techniques on simulators. A benchmark is a program used to characterize the performance of a computer system or, in this case, a microprocessor, in executing that program. The premise is that a benchmark is representative of real-world applications, so how well a system executes the benchmark indicates how it will perform with actual applications. Different applications tax computers in different ways.

For the given problem statement, the SimpleScalar simulator toolkit has been used. To characterize the performance of the application, multiple cache and branch predictor configurations have been used and the results analysed. A microprocessor simulator takes far longer to execute a benchmark than the physical processor being simulated would. The sim-outorder simulator is a detailed microarchitectural simulator that models timing. It models in-order and out-of-order microprocessors in detail, including branch prediction, caches, and external memory. It is highly parameterized and can emulate machines with varying numbers of execution units.

Problem Statement
Using the sim-outorder simulator, simulate the program anagram with the configurations set as given below and analyze.
- A pipeline that fetches, decodes, issues, executes and commits one instruction/cycle, no matter what the instruction type
- Only one of each type of functional unit
- 8KB, two-way set-associative L1 instruction and data caches with 32 byte blocks
- A 256KB, direct-mapped L2 unified cache with 32 byte blocks
- An 8-way, 128-entry data TLB
- A 4-way, 64-entry instruction TLB
- All caches and the TLB have an LRU block replacement policy. The page size is 4KB.
- 2-bit dynamic branch prediction
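To make the 2-bit dynamic branch prediction requirement concrete, the following is a minimal Python sketch of a table of 2-bit saturating counters. It is an illustration only, not the SimpleScalar implementation: the class name, the table size of 2048, the initial counter value, the 8-byte instruction alignment used for indexing, and the branch trace are all assumptions chosen for the example.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by branch address."""

    def __init__(self, table_size=2048):
        # table_size is an assumed value, not taken from the report.
        self.table_size = table_size
        # Counter states 0..3: 0-1 predict not taken, 2-3 predict taken.
        # Starting at 1 (weakly not taken) is an arbitrary choice.
        self.counters = [1] * table_size

    def _index(self, pc):
        # Drop the low address bits (assumes 8-byte instructions) and wrap.
        return (pc >> 3) % self.table_size

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(predictor, trace):
    """Fraction of branches in the trace whose direction was predicted."""
    hits = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            hits += 1
        predictor.update(pc, taken)
    return hits / len(trace)


# Hypothetical trace: one loop branch, taken 9 times then not taken,
# with the whole loop repeated 10 times.
trace = [(0x400100, t) for _ in range(10) for t in [True] * 9 + [False]]
print(accuracy(TwoBitPredictor(), trace))
```

The two-bit state means a single loop exit does not flip the prediction for the next loop entry, which is why this scheme does well on loop-heavy code.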

Background
The anagram benchmark is a set of C source files and associated files. Given a set of words (a dictionary) and a phrase as input, the program anagram.c finds all the possible anagrams that can be formed and writes them to an output file.
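As a rough illustration of the kind of work the benchmark does, the Python sketch below finds single-word anagrams by indexing a dictionary on sorted letters. This is a simplification and not the algorithm in anagram.c, which also combines several dictionary words to match a phrase. The function names and the word list are hypothetical.

```python
from collections import defaultdict


def build_index(dictionary):
    """Map each sorted-letter signature to the words that share it."""
    index = defaultdict(list)
    for word in dictionary:
        index["".join(sorted(word.lower()))].append(word)
    return index


def single_word_anagrams(phrase, index):
    """Return dictionary words that use exactly the letters of phrase."""
    letters = [c for c in phrase.lower() if c.isalpha()]
    return sorted(index.get("".join(sorted(letters)), []))


# Hypothetical dictionary and input phrase.
words = ["listen", "silent", "enlist", "google", "tinsel"]
index = build_index(words)
print(single_word_anagrams("inlets", index))
```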

The SimpleScalar toolset is a computer architecture research test bed consisting of compilers, an assembler, a linker, libraries, and simulators, hosted on any Unix-like machine. For this project, we used Ubuntu 10.04 on an i686 (P6) microarchitecture host (Intel Core 2 Duo) for the simulation. The endianness of the host processor and of the simulator must match. In this case, we use a little-endian benchmark.

METHODOLOGY
We simulated the anagram benchmark on the SimpleScalar 3.0 sim-outorder simulator. We have used the following configurations:

1) Cache
Instruction Cache:
L1: 8KB, 2-way set-associative, LRU block replacement policy
L2: 256KB, direct-mapped, unified cache with 32 byte blocks, LRU
Data Cache:
L1: 8KB, 2-way set-associative, LRU block replacement policy, 32 byte blocks
L2: 256KB, direct-mapped, unified cache with 32 byte blocks, LRU
command: -cache:il1 il1:128:32:2:l -cache:il2 dl2 -cache:dl1 dl1:128:32:2:l -cache:dl2 ul2:8192:32:1:l

2) TLB
Instruction TLB: 4-way, 64-entry, 4KB page size, LRU block replacement policy
Data TLB: 8-way, 128-entry, 4KB page size, LRU block replacement policy
command: -tlb:dtlb dtlb:16:4096:128:l -tlb:itlb itlb:16:4096:64:l

3) Branch prediction
2-bit dynamic branch predictor: bimodal predictor with 2-bit counter and BTB of 1KB
command: -bpred:perfect
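The cache arguments above follow the sim-outorder form name:nsets:bsize:assoc:repl, where the number of sets is the capacity divided by (block size x associativity). The short Python sketch below derives the set counts for the L1 and L2 caches from the sizes given in the problem statement. The helper name cache_config is hypothetical and is not part of SimpleScalar.

```python
def cache_config(name, size_bytes, block_bytes, assoc, repl="l"):
    """Build a name:nsets:bsize:assoc:repl string from a cache geometry."""
    way_bytes = block_bytes * assoc
    if size_bytes % way_bytes != 0:
        raise ValueError("size must be a multiple of block size x associativity")
    nsets = size_bytes // way_bytes
    return f"{name}:{nsets}:{block_bytes}:{assoc}:{repl}"


KB = 1024

# 8KB, 2-way, 32-byte blocks -> 8192 / (32 * 2) = 128 sets
print(cache_config("il1", 8 * KB, 32, 2))
print(cache_config("dl1", 8 * KB, 32, 2))

# 256KB, direct-mapped, 32-byte blocks -> 262144 / 32 = 8192 sets
print(cache_config("ul2", 256 * KB, 32, 1))
```

Deriving the set count this way is a quick check that a configuration string matches the intended capacity before starting a long simulation run.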

Limitations
A typical benchmark contains about 10 million iterations, which taxes the processor and in turn helps analyze its performance. However, SimpleScalar takes a long time to execute such a large program, so we used a program with fewer iterations. Since the simulation used only one branch prediction method, the bimod predictor, it was not possible to compare the performance of different branch prediction techniques.

RESULTS
The program anagram.c was successfully executed and the following results were obtained.

a) Branch Frequency
Simulation Speed = 609168 instructions/second
Number of branch instructions executed = 162907
Branch Frequency = Number of branch instructions executed / Simulation Speed = 0.2674 branch instructions/second

b) How often are branches executed in our program
= Number of branch instructions executed / Total number of instructions executed
= 162907 / 691869 = 0.2354 => 23.54%

c) Prediction Accuracy
Branch Direction Prediction Rate (all hits / updates) = 0.9408 => 94.08%

d) Hits & Misses
Total Lookups = 167041
Total Hits = 133461
Hit Rate = 133461 / 167041 = 0.7989 => 79.89%
Total Misses = 8390
Miss Rate = 0.2011 => 20.11%
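The ratios in (b) and (d) can be recomputed from the reported counts with a few lines of Python. This is only an arithmetic check on the numbers given above. The variable names are chosen for this example, and the reported figures appear to be truncated rather than rounded, so the last printed digit may differ from the report.

```python
# Counts as reported in the results above.
branches = 162907
total_instructions = 691869
lookups = 167041
hits = 133461

# (b) fraction of executed instructions that are branches
branch_fraction = branches / total_instructions

# (d) hit rate, and miss rate taken as its complement
hit_rate = hits / lookups
miss_rate = 1 - hit_rate

print(f"branch fraction: {branch_fraction:.4f}")
print(f"hit rate:        {hit_rate:.4f}")
print(f"miss rate:       {miss_rate:.4f}")
```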

CONCLUSION

A C program to derive anagrams of a given word sequence using a dictionary was successfully compiled in SimpleScalar and executed on the sim-outorder platform, and various performance parameters such as CPI, prediction accuracy, and hit rate were evaluated.
