You are on page 1of 22

Pollachi Institute of Engineering and Technology

Department of Electronics and Communication Engineering


EC6009-ADVANCED COMPUTER ARCHITECTURE
QUESTION BANK

UNIT-I
PART-A

1. Define Computer Architecture

Computer Architecture is defined as the functional operation of the individual h/w unit in a
computer system and the flow of information among the control of those units

2. Define Computer H/W

Computer H/W Is The Electronic Circuit And Electro Mechanical Equipment That
Constitutes The Computer

3. What Is Meant By Cache Memory ?

A Memory That Is Smaller And Faster Than Main Memory And That Is Interposed Between
The Cpu And Main Memory. The Cache Acts As A Buffer For Recently Used Memory Location

4. what is locality of reference?

Many instruction in localized area of the program are executed repeatedly during some time period
and the remainder of the program is accessed relatively infrequently .this is referred as locality of
reference.

5. what is IO mapped input output?

A memory reference instruction activated the READ M (or)WRITE M control line and does not
affect the IO device. Separate IO instruction are required to activate the READ IOand WRITE IO
lines ,which cause a word to be transferred between the address aio port and the CPU. The memory
and IO address space are kept separate.

6.Specify the three types of the DMA transfer


techniques?

Single transfer mode(cyclestealing mode)


Block Transfer Mode(Brust Mode)
Demand Transfer Mode
Cascade Mode

7. why is memory refreshing circuit needed ?

al cells on the corresponding yow to be read and refreshed during both read and write
operation .the contents of the d ram are maintained each row of cell must be accessed
periodically once every 2 – 16 ms. refresh circuit usually performs this function .

8 what are the functions of control unit ?


the memory arithmetic and logic ,and input and output units store and process information and
perform i/p and o/p operation, the operation of these unit must be co ordinate in some way this
is the task of control unit the cu is effectively the nerve center that sends the control signal to
other units and sence their states.

9.What is an interrupt?

An interrupt is an event that causes the execution of one program to be suspended and another
program to be executed.

10.What are the uses of interrupts?

Recovery from errors


Debugging
Communication between programs
Use of interrupts in operating system

11.Define vectored interrupts.

In order to reduce the overhead involved in the polling process, a device requesting an interrupt
may identify itself directly to the CPU. Then, the CPU can immediately start executing the
corresponding interrupt-service routine. The term vectored interrupts refers to all interrupt-
handling schemes base on this approach.

12. What is the need for reduced instruction chip?

1. Relatively few instruction types and addressing modes.


2. Fixed and easily decoded instruction formats.
3. Fast single-cycle instruction execution.
4. Hardwired rather than microprogrammed control.

13. Name any three of the standard I/O interface.

1. SCSI (small computer system interface),bus standards


2. Back plane bus standards
3. IEEE 796 bus (multibus signals)
4. NUBUS
5. IEEE 488 bus standard
14. Differentiate between RISC and CISC

RISC CISC

1. Reduced Instruction Set Computer 1. Complex Instruction set computer


2. Simple instructions take one cycle per 2. Complex instruction take multiple
Operation. Cycles per operation.

3. Few instructions and address modes are 3. Many instruction and address
Used. Modes.

4. Fixed format instructions are used. 4. Variable format instructions are

used.

5.Instructions are compiled and then 5. Instructions are interpreted by the


executed by hardware. Microprogram and then executed.
6. RISC machines are multiple register 6. CISC machines use single
register set. Set.
7. Complexity in the compiler 7. Complexity in the microprogram

8. RISC machines are higly piplined 8. CISC machines are not piplined.

15.Explain the pipeline types.

1. Instruction pipeline
2. Arithmetic pipeline

16. Explain the various classifications of parallel structures.

1. SISD (single instruction stream single data stream


2. SIMD(single instruction stream multiple data stream
3. MIMD(multiple instruction stream multiple data stream
4. MISD(multiple instruction stream single data stream

17. What is absolute addressing mode?

The address of the location of the operand is given explicitly as a part of the instruction.
Eg. Move a , 2000

18. Specify three types of data transfer techniques.

1. Arithmetic data transfer


2. Logical data transfer
3. Programmed control data transfer

19. What is the role of MAR and MDR?

The MAR (memory address register) is used to hold the address of the location to or from
which data are to be transferred and the MDR(memory data register) contains the data to be
written into or read out of the addressed location.
20. What are the various types of operations required for instructions?
1. Data transfers between the main memory and the CPU registers
2. Arithmetic and logic operation on data
3. Program sequencing and control
4. I/O transfers

21. What is the role of IR and PC?

Instruction Register (IR) contains the instruction being executed. Its output is available to the
control circuits, which generate the timing signals for controlling the processing circuits needed
to execute the instructions.
The Program Counter (PC) register keeps track of the execution of the program. It contains the
memory address of the instruction currently being executed . During the execution of the
current instruction, the contents of the PC are updated to correspond to the address of the next
instructions to be executed.
22.Define memory access time?

The time that elapses between the initiation of an operation and completion of that
operation ,for example ,the time between the READ and the MFC signals .This is
Referred to as memory access time.

23. Define memory cycle time.

The minimum time delay required between the initiations of two successive memory operations,
for example, the time between two successive READ operations.
24.Define Static Memories.

Memories that consist of circuits capable of retaining the state as long as power is applied are
known as static memories.

25.Distinguish Between Static RAM and Dynamic RAM?

Static RAM are fast, but they come at high cost because their cells require several transistors.
Less expensive RAM can be implemented if simpler cells are used. However such cells do not
retain their state indefinitely; Hence they are called Dynamic RAM.

26.Distiguish between asynchronies DRAM and synchronous RAM.

The specialized memory controller circuit provides the necessary control signals, RAS And CAS
,that govern the timing. The processor must take into account the delay in the response of the
memory. Such memories are referred to as asynchronous DRAMS.The DRAM whose operations is
directly synchronized with a clock signal. Such Memories are known as synchronous DRAM

27.what are the various units in the computer?

1.input unit
2.output unit
3.control unit
4.memory unit
5.arithmetic and logical unit

28.what is an I/O channel?


An i/o channel is actually a special purpose processor, also called peripheral processor.
The main processor initiates a transfer by passing the required information in the input output
channel. the channel then takes over and controls the actual transfer of data.

29.what is a bus?
A collection of wires that connects several devices is called a bus.

30.Define word length?


Each group of n bits is referred to as a word of information and n is called the word length.

31.explain the following the address instruction?

1. three-address instruction-it can be represented as add a,b,c


Operands a,b are called source operand and c is called destination operand.
2.two-address instruction-it can be represented as add a,b
3.one address instruction-it can be represented as add a
4.zero address instruction.

It is also possible to use instruction where the location s of all operand are defined implicitly.
This operand of the use of the method for storing the operand in which called push down stack.
Such instructions are sometimes referred to us zero address instruction.

32.what is the straight-line sequencing?

the cpu control circuitry automatically proceed to fetch and execute instruction, one at a
time in the order of the increasing addresses. This is called straight line sequencing.

33.what is the role of pc?

The cpu contains a register called the program counter, which holds the address of
instruction to be executed next.. to begin the execution of the program the address of its
First instruction must be placed into the pc.

34.what are steps for execution of a complete instruction?

1.fetch the instruction.


2.fetch the first operand (the contents of the memory location pointed by the address field of
the instruction.)
3.perform the calculation.
4.load the result.

35.what is a a bit slice?

A bit slice is “slice” through the data path of the typical processor. It contains all

Circuits necessary to provide alu function, register transfer and control function for only a few
bits of the data path.

36.what is DMA?

A special control unit may be provided to enable transfer a block of data directly
between an external device and memory without contiguous intervention by the cpu. This
approach is called DMA.
37.why program controlled I/O is unsuitable for high-speed data transfer?

1. in program controlled i/o considerable overhead is incurred.. because several program


instruction have to be executed for each data word transferred between the external
devices and MM.
2. many high speed peripheral; devices have a synchronous modes of operation. that
is data transfer are controlled by a clock of fixed frequency, independent of the cpu.

38.what is the function of i/o interface?

The function is to coordinate the transfer of data between the cpu and external devices.

39.what is NUBUS?

A NUBUS is a processor independent, synchronous bus standard intended for use in

32 bit micro processor system. It defines a backplane into which upto 16 devices may be
plugged each in the form of circuit board of standard dimensions.

40. what do you mean associative mapping technique?

The tag of an address received from the CPU is compared to the tag bits of each block of the
cache to see if the desired block is present. This is called associative mapping technique.

41. What is LRU replacement algorithm?

When a block is to be overwritten it is sensible to overwrite the one that has gone the largest
time without being referenced. This block is called Least Recently Used block and the
technique is called LRU replacement algorithm.

42. Explain virtual memory technique.

Techniques that automatically move program and data blocks into the physical memory
when they are required for execution are called virtual memory technique.

43. What are virtual and logical addresses?

The binary addresses that the processor issues for either instruction or data are called
virtual or logical addresses.

44. Define translation buffer.

Most commercial virtual memory systems incorporate a mechanism that can avoid the bulk of
the main memory access called for by the virtual to physical addresses translation buffer.
This may be done with a cache memory called a translation buffer, which retains the results of
the recent translation.

45. Name some of the IO devices.

1. Video terminals
2. Video displays
3. Alphanumeric displays
4. Graphics displays
5. Flat panel displays
6. Printers
7. Plotters

46. What is branch delay slot?

The location containing an instruction that may be fetched and then discarded because of
the branch is called branch delay slot.

47. What is optical memory?

Optical or light based techniques for data storage, such memories usually employ optical
disk which resemble magnetic disk in that they store binary information in concentric tracks on
an electromechanically rotated disks. The information is read as or written optically, however
with a laser replacing the read write arm of a magnetic disk drive. Optical memory offer high
storage capacities but their access rate is are generally less than those of magnetic disk.

48. What is microprogrammed control?

Microprogrammed control in which control signals are generated by a program similar to


machine language program. A sequence of one (or) more microoperation, such as addition,
multiplication is called a micro program. The address where these microinstructions are stored in
CM is generated by microprogrammed control.

49. What is microprogramming?

A sequence of Control Words corresponding to the control sequence of a machine


instruction constitutes the micro routine for that instruction, and the individual control words in
this micro routine are referred to as microinstructions.
Microprogramming is a method of control unit design in which the control signal selection
and sequencing information is stored in a ROM (or) RAM called a control memory CM.

50.What are static and dynamic memories?

Static memory are memories which require periodic no refreshing. Dynamic memories are
memories, which require periodic refreshing.

51. What are the steps required for a pipelinened processor to process the instruction?

1. F Fetch: read the instruction from the memory


2. D Decode: decode the instruction and fetch the source operand(s).
3. E Execute: perform the operation specified by the instruction.
4. W Write: store the result in the destination location.

52. What are the steps taken when an interrupt occurs?


1. Source of the interrupt
2. The memory address of the required ISP
3. The program counter & cpu information saved in subroutine
4. Transfer control back to the interrupted program

53. Define instruction pipeline.

The transfer of instructions through various stages of the cpu instruction cycle., including
fetch opcode,decode opcode,compute operand addresses. Fetch operands, execute instructions
and store results. This amounts to realizing most (or) all of the cpu in the form of multifunction
pipeline called an instruction pipelining.

54. Define latency.

The term memory latency is used to refer to the amount of time it takes to transfer a word of
data to or from the memory. The term latency is used to denote the time it takes to transfer the
first word of data. This time is usually substantially longer than the time needed to transfer each
subsequent word of a block.

55. Define bandwidth.

Bandwidth is a product of the rate at which the data are transferred (and accessed) and the
width of the data bus.

56. Define hit rate.

A successful access to data in a cache is called a hit. Number of hits stated as a fraction of all
attempted accesses is called the hit rate.

57. Define miss rate.

A miss rate is the number of misses stated as a fraction of attempted accesses. Extra time
needed to bring the desired information into the cache is called the miss penalty.

58. Distinguish between system space and user space.

assembling the operating system routine into a virtual address space , is called system
space that is separate from the virtual space in which user application program reside. The
latter space is called user space.

59.Define

1. Signal - The binary information is represented in digital computers by physical quantities


called signals.
2 Gates – The manipulation of binary information is done by logic circuits called gates.
Gates are blocks of hardware that produce signals of binary 1 or 0 where input logic
requirements are satisfied.
3. Flip flop – The storage elements employed in clocked sequential circuits are called flip
flops. A flip flop is a binary cell capable of storing 1 bit of information.

60. Define combinational circuit.

A combinational circuit is a connected arrangement of logic gates with the set of inputs and
outputs. A combinational circuit transforms binary information from the given input data to the
required output data. Combinational are employed in digital computers required for generating
binary control decisions and for providing digital components required for data processing.
61. Define sequential circuits.

A sequential circuit is an interconnectin of flip-flops and gates. The gates by themselves


constitute a combinational circuit, but when included with the flip flops, the overall circuit is
classified as a sequential circuit.

62.Define interface.
The word interface refers to the boundary between two circuits or devices.

63. Define pipelining.

Pipelining is a techinique of decomposing a sequential process into sub operations with


each subprocess being executed in a special dedicated segment that operates
concurrently with all other segments.

64.Define parallel processing.


Parallel processing is a term used to denote a large class of techniques that are used to
provide simultaneous data-processing tasks for the purpose of increasing the
computational speed of a computer system. Instead of processing each instruction
sequentially as in a conventional computer, a parallel processing system is able to
perform concurrent data processing to achieve faster execution time.

65.What are the components of memory management unit?


1. A facility for dynamic storage relocation that maps logical memory references into
physical memory addresses.
2 A provision for sharing common programs stored in memory by different users .

2. Protection of information against unauthorized access between users and preventing


users from changing operating system functions.

66. What is programmed I/O?


Data transfer to and from peripherals may be handled using this mode. Programmed
I/O operations are the result of I/O instructions written in the computer program.
UNIT-II

PART-A
1. Explain the concept of pipelining.
Pipelining is an implementation technique whereby multiple instructions are overlapped in
execution. It takes advantage of parallelism that exists among actions needed to execute an
instruction.

2. What is a Hazard? Mention the different hazards in pipeline.


Hazards are situations that prevent the next instruction in the instruction stream from
executing during its designated clock cycle. Hazards reduce the overall performance from the
ideal speedup gained by pipelining. The three classes of hazards are,
i) Structural hazard
ii) Data hazard
iii) Control hazard

3. List the various dependences.


Data dependence Name
dependence Control
dependence

4. What is Instruction Level Parallelism? (or) What is ILP?


Pipelining is used to overlap the execution of instructions and improve performance.
This potential overlap among instructions is called instruction level parallelism (ILP)
since instruction can be evaluated parallel.

5. Give an example of control dependence.


If p1 { s1;}
If p2 {s2;}
S1 is control dependence on p1, and s2 is control dependence on p2.

6. Write the concept behind using reservation station.


Reservation station fetches and buffers an operand as soon as available, eliminating the
need to get the operand from a register.

7. Explain the idea behind dynamic scheduling? Also write advantages of dynamic
scheduling.
In dynamic scheduling the hardware rearranges the instruction to reduce the stalls while
maintaining dataflow and exception behavior.
Advantages: it enables handling some cases when dependence are unknown at compile time
It allows code that was compiled with one pipeline in mind and run efficiently on a different
pipeline.

8. What are the possibilities of imprecise exception?


The pipeline may have already completed instructions that are later in program order than
instruction causing exception. The pipeline may have not yet completed some instructions
that ere earlier in program order than the instructions causing exception.

9. What are branch target buffers?


To reduce the branch penalty we need to know from what address to fetch by end of
Instruction fetch. A branch prediction cache that stores the predicted address for the next
instruction after a branch is called a branch-target buffer or branch target cache.

10. Mention the idea behind hardware-based speculation?


It combines three key ideas:
Dynamic branch prediction to choose which instruction to execute,
Speculation to allow the execution of instructions before control dependence are
resolved Dynamic scheduling to deal with the scheduling of different combinations of
basic blocks.

11.What are the fields in the ROB?


Instruction type, destination field, value field and ready field.

12.Mention the advantage of using tournament based predictors?


The advantage of tournament predictor is its ability to select the right predictor for
right branch.

13.What is loop unrolling?


A simple scheme for increasing the number of instructions relative to the branch
and overhead instructions is loop unrolling. Unrolling simply replicates the loop
body multiple times, adjusting the loop termination code.

14.What are the basic compiler techniques?


Loop unrolling, Code scheduling.

15.Give an example for data dependence.


Loop: L.D F0, 0(R1)
ADD.D F4, F0, F2
S.D F4, 0(R1)
DADDUI R1, R1, #-8
BNE R1, R2,
LOOP

PART B

1. Explain the data dependences and hazards in detail with examples.


(16) Data Dependences and Hazards
Name dependencies and Hazards
Control dependencies and
Hazards
2.Briefly explain the concept of dynamic scheduling. (8)

Dynamic scheduling
3.Explain dynamic scheduling using tomosulo’s approach with an example. (or)
Explain technique used for overcoming the data hazards with an example.
-Diagram

-Use of reservation stations


Components of tomosulo’s architecture:

4.Explain in detail about the hardware based speculation and explain how it
overcomes the control dependencies. (16)
Hardware-based speculation combines three key
ideas: Role of re-order buffer
Issue
Execute
Write result
Commit
5.Explain the basic compiler techniques for exposing ilp.
(8) Basic Pipeline Scheduling and Loop
Unrolling
6. Explain static and dynamic branch prediction schemes (16)

Static Branch Prediction


Correlating Branch
Predictors
Tournament Predictors: Adaptively Combining Local and Global Predictors

UNIT-III
PART-A

1. Explain the VLIW approach?

They uses multiple, independent functional units. Rather than attempting to issue multiple,
independent instructions to the units, a VLIW packages the multiple operations into one very
long instruction.

2. Mention the advantage of using multiple issue processors?

They are less expensive, they have cache based memory system, more parallelism.
3. What are loop carried dependence?

They focus on determining whether data accesses in later iterations are dependent on data
values produced in earlier iterations; such a dependence is called loop carried dependence.
E.g for(i=1000;i>0;i=i-1)

X[i] = x[i] +s;

4. Use the G.C.D test to determine whether dependence exists in the following loop.

for(i=1;i<=100;i=i+1)
X[2*i+3]=X[2*i]*5.0; Solution:
a=2, b=3, c=2, d=0 GCD(a,c)=2
and d-b=-3
Since 2 does not divide -3, no dependence is possible.
5. What is software pipelining?

Software pipelining is a technique for reorganizing loops that each iteration in the software
pipelined code is made from instruction chosen from different iterations of the original loop.

6. What is global code scheduling?

Global code scheduling aims to compact code fragment with internal control structure into
the shortest possible sequence that preserves the data and control dependence. Finding a
shortest possible sequence is finding the shortest sequence for the critical path.

7. What is Trace?
Trace selection tries to find a likely sequence of basic blocks whose operations will be put
together into smaller number of instructions, this sequence is called trace.

8. What are the steps involved in Trace scheduling?

Trace selection, Trace compaction, Book Keeping.


9. What is superblock?
Superblocks are formed by a process similar to that used for traces, but are form of extended
basic block, which are restricted to a single entry point but allow multiple exits.
10. What is poison bit?
Poison bits are a set of status bits that are attached to the result registers written by the
specified instruction when the instruction causes exceptions. The poison bits cause a fault
then a normal instruction attempts to use the register.

11. What are the disadvantages of supporting speculation in hardware?

Complexity, additional hardware required.

12. What is an instruction group?

It is a sequence of consecutive instructions with no register data dependences among them.


All the instructions in the group could be executed in parallel. An instruction group can be
arbitrarily long.
13. What are the limitations of ILP?

The hardware model, limitation on the window size and maximum issue count, the effects of
realistic branch and jump prediction, the effects of finite registers, the effect of imperfect
alias analysis.

14. What is copy propagation?

Within a basic block, algebraic simplification of expressions and an optimization called copy
propagation. Which eliminates operations that copy values, can be used to simplify
sequences like the following:

DADDUI R1, R2, #4


DADDUI R1, R1, #4 to
DADDUI R1, R2, #8
15. What is CFM?

A special register called the current frame pointer points to the set of registers to be used
by a given procedure.
PART B

1. Explain the VLIW and EPIC processors. (8)

 ILP in VLIW

 EPIC ARCHITECTURE

 Instruction-level parallelism in EPIC

 Static Multiple Issue : The VLIW Approach

 The Basic VLIW Approach

2. Explain hardware support for compilers for exposing ilp. (16)

 Conditional or Predicated Instructions

 Compiler Speculation with Hardware Support

 Hardware Support for Preserving Exception Behavior

 Hardware Support for Memory Reference Speculation

3.Hardware versus software speculation mechanisms.


(8) 4.Explain the i7 architecture. (10)
5.Advanced compiler support for exposing and exploiting ilp (16).

 Detecting and Enhancing Loop-Level Parallelism 


 Identifying dependencies 
 Copy propagation 
 Software pipelining 
 Global code scheduling 
 Trace scheduling 

6.Briefly explain global code scheduling(6)


7. Explain software pipelining (6)

UNIT-IV
PART-A

1. What are multiprocessors? Mention the categories of multiprocessors?


Multiprocessors are used to increase performance and improve availability. The different
categories are
SISD(Single Instruction and Single Data stream),
SIMD(Single Instruction and Multiple Data stream),
MISD(Multiple Instruction and Single data stream)
MIMD(Multiple Instruction and Multiple Data
stream)

2. What are threads?


These are multiple processors executing a single program and sharing the code and
most of their address space. When multiple processors share code and data in the way, they are
often called threads.

3. What is cache coherence problem?


Two different processors have two different values for the same location.

4. What are the ways to maintain cache coherence? OR what are the ways to enforce
cache coherence?
Directory based protocol, Snooping based protocol.

5. What are the ways to maintain cache coherence using snooping protocol?
Write invalidate protocol, write update or write broadcast protocol.
6. What is write invalidate and write update protocol?
 Write invalidate provide exclusive access to caches. These exclusive caches ensure that
no other readable or writable copies of an item exist when the write occurs. 
Write updates protocol updates all cached copies of a data item when that item is
 written. 

7. What are the disadvantages of using symmetric shared memor?


Compiler mechanisms are very limited, and create large latency for remote memory
access fetching multiple words in a single cache block will increase the cost.

8. Mention the information available in the directory?


The directory keeps the state of each block that is cached. It keeps track of which caches
have copies of the block.

9. What are the states of cache block in directory based approach?


Shared, Un-cached and Exclusive.

10. What are the uses of having a bit vector?


When a block is shared, the bit vector indicates whether the processor has the copy of
the block. When block is in exclusive state, bit vector keep track of the owner of the block.

11. When do we say that a cache block is exclusive?


When exactly one processor has the copy of the cache block, and it has written the
block, then the processor is called the owner of the block. During this phase the block will be
in exclusive state.

12. Explain the types of messages that can be send between the processors and directories?
Local Node: node where the requests originates
Home Node: Node where memory location and directory entry of the address resides.
Remote Note: the copy of the block in the third node called remote node.

13. What is consistency? And what are the models used for consistency?
Consistency says in what order processor must observe the data writes of another
processor.
Models used for Consistency:
Sequential Consistency
Model, Relaxed Consistency
Model

14. What is sequential consistency?


Sequential consistency requires that the result of any execution be the same, as if the
memory accesses executed by each processor were kept in order and the accesses among
different processors were interleaved.

15. What is relaxed consistency model?


Relaxed consistency model allows reads and writes to be executed out of
order. Three sets of ordering are:
W->R Ordering – Total store
ordering W->W Ordering – Partial
ordering
R->W and R->R Ordering – Weak ordering
16. What is coarse grained and fine grained multithreading?
Coarse Grained: it switches only on costly stalls (or) event. Thus it is much less likely to
slow down the execution of an individual thread.
Fine Grained: it switches threads between threads on each instruction, causing the
execution of multiple threads to be interleaved.
17.What is software and hardware multithreading?
 Software multithreading is piece of software that is aware of more than one
core/processor, and can use these to be able simultaneously complete multiple tasks.
 Hardware multithreading is a hardware that is aware of more than one
core/processors and can use these to be able simultaneously completes multiple
threads.

18.Write short note on SMT.


Simultaneous multithreading is a technique for improving the overall efficiency of
superscalar CPU’s with hardware multithreading. SMT permits multiple independent threads
of execution to better utilize the resources provided by modern processor architectures.

19.List the design challenges of SMT.


 Larger register file needed to hold multiple contexts. 
 Not affecting clock cycle time, especially its, 
o Instruction issue: more candidate instructions need to be considered.
 Ensuring that cache and TLP conflicts generated by SMT do not degrade
performance. 

20. Define CMP.


Chip level multiprocessing(CMP or multi-core): integrates two or more independent
cores into a single package composed of a single integrated circuit, called a die, or more dies
packaged, each executing threads independently.

21.Define chip multithreading:


 Chip multithreading is defined as the combination of chip multiprocessing and
the hardware multithreading.
 Chip multithreading is the capability of a processor to process multiple software
threads and simultaneous Hardware threads of execution.
22.Define multicore processors.
A multicore design in which a single physical processor contains the core logic of more
than one processor.

23.What is heterogeneous multicore processor?


The architecture consists of a chip-level multi-processors with multiple, processor
cores. These cores all execute the same instruction set, but include significantly different
resources and achieve different performance and energy efficiency on the same application.

24.List out the advantages of heterogeneous multicore processors.


 More efficient adaptation to application diversity.
 Applications place different demands on different architectures.
 A more efficient use of die area for a given thread-level parallelism.
 Chip level multiprocessors with a mix of cores.

25.What is IBM Cell processor?


Cell is a microprocessor architecture jointly developed by sony, sony computer
entertainment, Toshiba and IBM, an alliance known as ―STI‖. The architectural design and
first implementation were carried out at the STI design center in Austin, Texas over four-
year period.

26.List out the components of cell processors.

 External input and output structures


 The main processor called the Power Processing Element(PPE)
 Eight fully functional co-processors called the Synergistic Processing elements or SPE’s.
 A specialized high bandwidth circular data bus connecting the PPE.
27.What is memory flow controller?
The memory flow controller is the interface between the SPU and the rest of the cell chip.

28.State fine grained multi-threading.


 Switches between threads on each instructions, causing the execution of multiple
threads to be interleaved.
 Usually done in a round robin fashion, skipping any stalled
threads. CPU must be able to switch threads every clock.
29.List Advantages of coarse grain multithreading.

 It is hard to overcome throughput losses from shorter stalls, due to pipeline start-up
costs.
 Since CPU, issues instruction from 1 thread, when a stall occurs, the pipeline must
be emptied or frozen.
 New thread must fill pipeline before instruction can complete.

30.Define coarse grain multithreading.


It switches thread only on costly stalls, such as L2 cache misses.

PART B

1. Explain the symmetric shared memory architecture. Explain the snooping based
protocols with neat diagram. (16)
Symmetric Shared Memory Architectures:
Cache Coherence in Multiprocessors:
Basic Schemes for Enforcing Coherence
Snooping Protocols Write invalidate Write update protocol
State transition diagram
2.Explain the concept of distributed shared memory and also explain directory
based protocols with an example. (or) Explain the numa architecture with neat
diagram.
Directory-Based Cache-Coherence Protocols: The Basics

3.Explain the basics of multithreading and it types. (8) (or) Explain how
multihreading approach can be used to exploit thread level parallelism within
a processor? (8)

 Fine-grained multithreading
 Coarse-grained multithreading
 Simultaneous Multithreading
 Design Challenges in SMT processors

4. Explain in detail about the need for consistency models and it types.(8)

Sequential consistency

Relaxed Consistency Models:

5. Explain the performance of the symmetric shared memory with necessary graphs. (16)

Distribution of Execution Time in the Multi-programmed Parallel Make Workload


Data Miss Rate VS. Data Cache Size

6.Briefly explain multicore architecture with neat diagram and also write its applications.
(8) (or) chip multiprocessors (or) cmp
architecture. Single-core computer
CMP architecture:
CHIP multi threading:
Multi- core
architectures

7.Explain in detail about the various types of software and hardware based
multi- threading. (8)
Software and hardware
multithreading Multithreading in
Hardware Hardware Multithreading
Techniques
Cycle-by-cycle interleaving (Fine Grained
Multithreading) Block interleaving (Coarse Grain
Multithreading)

8.Explain the architectural features of ibm cell processors with neat diagram. (or) explain
cell processors. (or) explain cell broadband engine. (10)
IBM Cell Processor Applications
of cell processors Power
Processor Element (PPE)
Synergistic Processing Elements
Input/output interfaces
9.Explain simultanous multi-threading concept for converting thread level parallelism
into instruction level parallelism. (8)
Design Challenges in SMT
Potential Performance Advantages from SMT

UNIT-V
PART-A

1. What is cache miss and cache hit?


 Cache Miss: When the CPU finds a requested data item in the cache, it is called
cache miss.
 Cache Hit: When the CPU finds a requested data item is available in the cache, it is
called cache hit.

2. What is write through and write back cache?


 Write through cache: The information is written to both the block in the cache and to
the block in the lower level memory.
 Write Back Cache: The information is written only to the block in the cache. The
modified cache block is written to main memory only when it is replaced.

3. What is Miss Rate and Miss Penalty?


 Miss Rate is the fraction of cache access that results in a miss.
 Miss Penalty depends on the number of misses and clock
per miss.
4. Write the equation of Average memory access time.

Average memory access time = hit time+miss rate X Miss Penalty

5. What is stripping?

Spreading multiple data over multiple disks is called stripping, which automatically forces
accesses to several disks.

6. What is disk mirroring? And write the drawbacks of disk mirroring.

 Disks in the configuration are mirrored or copied to another disk. With this arrangement
data on the failed disks can be replaced by reading it from the other mirrored disks.
 Drawback: writing onto the disk is slower since the disks are not synchronized, seek
time will be different.
 It imposes 50% space penalty hence expensive.

7. Mention the factors that measure I/O performance ?


Diversity capacity, response time, throughput, interference with CPU execution.

8. What is transaction time?

The sum of entry time, response time and think time is called transaction time.
9. State little law?

Little law relates the average number of tasks in the system. It relates to Average arrival
rate of new tasks with the average time to perform a task.

10. What are the steps to design an I/O system?


 Naïve cost – performance design and evaluation 

 Availability of naïve design 

 Response time 

 Realistic cost performance, design and evaluation

 Realistic design for availability and its evaluation 

11. Write the classification of buses.

I/O buses – these buses are lengthy and have any types of devices connected to it. CPU
memory buses – They are short and generally of high speed.
12. What is bus master?

Bus master are devices that can initiate the read or write transaction.
Eg. Processor – processor are always has the bus mastership.

13. Mention the advantages of using bus master.

It offers higher bandwidth by using packets, as opposed to holding the bus for full
transaction.

14. What is split transaction?

The idea behind this is to split the bus into request and replies, so that the bus can be used in
the time between request and the reply.

15. What are the measures of latency in memory technology?

Access Time: Is the time between when a read is required and when the desired word
arrives. Cycle Time: Is the minimum time between requests to memory.

PART B

1. Explain in detail about the vaious optimization techniques for improving the cache
performance. (16)

Cache performance
There are 17 cache optimizations into four categories:
First Miss Penalty Reduction Technique: Multi-Level Caches
Second Miss Penalty Reduction Technique: Critical Word First and Early Restart
Third Miss Penalty Reduction Technique: Giving Priority to Read Misses over
Writes
Fourth Miss Penalty Reduction Technique: Merging Write Buffer
Fifth Miss Penalty Reduction Technique: Victim Caches
First Miss Rate Reduction Technique: Larger Block Size
Second Miss Rate Reduction Technique: Larger caches Miss
Rate Reduction Technique: Higher Associativity
Fourth Miss Rate Reduction Technique: Way Prediction and Pseudo-Associative
Caches Fifth Miss Rate Reduction Technique: Compiler Optimizations

2. What is virtual memory? write techniques for fast address translation


Techniques for Fast Address Translation
Selecting a Page Size
3. Explain various types of storage devices (16)

Magnetic
Disks Optical
Disks
Magnetic Tape
Automated Tape Libraries
Flash Memory
4. Explain the buses and i/o devices (16)

Bus Design Decisions


Bus Standards
Interfacing Storage Devices to the CPU

5.Briefly explain raid (or) redundant arrays of inexpensive disks (8)


No Redundancy (RAID 0)
Mirroring (RAID 1)
Bit-Interleaved Parity (RAID 3)

Block-Interleaved Parity and Distributed Block-Interleaved Parity (RAID 4 and RAID


5)

6.Explain the design of i/o systems and its performance. (16)


o The CPU
o The memory system:

o Internal and external caches, Main Memory


o The underlying interconnection (buses)

You might also like