Professional Documents
Culture Documents
UNIT-I
PART-A
Computer Architecture is defined as the functional operation of the individual h/w unit in a
computer system and the flow of information among the control of those units
Computer H/W Is The Electronic Circuit And Electro Mechanical Equipment That
Constitutes The Computer
A Memory That Is Smaller And Faster Than Main Memory And That Is Interposed Between
The Cpu And Main Memory. The Cache Acts As A Buffer For Recently Used Memory Location
Many instruction in localized area of the program are executed repeatedly during some time period
and the remainder of the program is accessed relatively infrequently .this is referred as locality of
reference.
A memory reference instruction activated the READ M (or)WRITE M control line and does not
affect the IO device. Separate IO instruction are required to activate the READ IOand WRITE IO
lines ,which cause a word to be transferred between the address aio port and the CPU. The memory
and IO address space are kept separate.
al cells on the corresponding yow to be read and refreshed during both read and write
operation .the contents of the d ram are maintained each row of cell must be accessed
periodically once every 2 – 16 ms. refresh circuit usually performs this function .
9.What is an interrupt?
An interrupt is an event that causes the execution of one program to be suspended and another
program to be executed.
In order to reduce the overhead involved in the polling process, a device requesting an interrupt
may identify itself directly to the CPU. Then, the CPU can immediately start executing the
corresponding interrupt-service routine. The term vectored interrupts refers to all interrupt-
handling schemes base on this approach.
RISC CISC
3. Few instructions and address modes are 3. Many instruction and address
Used. Modes.
used.
8. RISC machines are higly piplined 8. CISC machines are not piplined.
1. Instruction pipeline
2. Arithmetic pipeline
The address of the location of the operand is given explicitly as a part of the instruction.
Eg. Move a , 2000
The MAR (memory address register) is used to hold the address of the location to or from
which data are to be transferred and the MDR(memory data register) contains the data to be
written into or read out of the addressed location.
20. What are the various types of operations required for instructions?
1. Data transfers between the main memory and the CPU registers
2. Arithmetic and logic operation on data
3. Program sequencing and control
4. I/O transfers
Instruction Register (IR) contains the instruction being executed. Its output is available to the
control circuits, which generate the timing signals for controlling the processing circuits needed
to execute the instructions.
The Program Counter (PC) register keeps track of the execution of the program. It contains the
memory address of the instruction currently being executed . During the execution of the
current instruction, the contents of the PC are updated to correspond to the address of the next
instructions to be executed.
22.Define memory access time?
The time that elapses between the initiation of an operation and completion of that
operation ,for example ,the time between the READ and the MFC signals .This is
Referred to as memory access time.
The minimum time delay required between the initiations of two successive memory operations,
for example, the time between two successive READ operations.
24.Define Static Memories.
Memories that consist of circuits capable of retaining the state as long as power is applied are
known as static memories.
Static RAM are fast, but they come at high cost because their cells require several transistors.
Less expensive RAM can be implemented if simpler cells are used. However such cells do not
retain their state indefinitely; Hence they are called Dynamic RAM.
The specialized memory controller circuit provides the necessary control signals, RAS And CAS
,that govern the timing. The processor must take into account the delay in the response of the
memory. Such memories are referred to as asynchronous DRAMS.The DRAM whose operations is
directly synchronized with a clock signal. Such Memories are known as synchronous DRAM
1.input unit
2.output unit
3.control unit
4.memory unit
5.arithmetic and logical unit
29.what is a bus?
A collection of wires that connects several devices is called a bus.
It is also possible to use instruction where the location s of all operand are defined implicitly.
This operand of the use of the method for storing the operand in which called push down stack.
Such instructions are sometimes referred to us zero address instruction.
the cpu control circuitry automatically proceed to fetch and execute instruction, one at a
time in the order of the increasing addresses. This is called straight line sequencing.
The cpu contains a register called the program counter, which holds the address of
instruction to be executed next.. to begin the execution of the program the address of its
First instruction must be placed into the pc.
A bit slice is “slice” through the data path of the typical processor. It contains all
Circuits necessary to provide alu function, register transfer and control function for only a few
bits of the data path.
36.what is DMA?
A special control unit may be provided to enable transfer a block of data directly
between an external device and memory without contiguous intervention by the cpu. This
approach is called DMA.
37.why program controlled I/O is unsuitable for high-speed data transfer?
The function is to coordinate the transfer of data between the cpu and external devices.
39.what is NUBUS?
32 bit micro processor system. It defines a backplane into which upto 16 devices may be
plugged each in the form of circuit board of standard dimensions.
The tag of an address received from the CPU is compared to the tag bits of each block of the
cache to see if the desired block is present. This is called associative mapping technique.
When a block is to be overwritten it is sensible to overwrite the one that has gone the largest
time without being referenced. This block is called Least Recently Used block and the
technique is called LRU replacement algorithm.
Techniques that automatically move program and data blocks into the physical memory
when they are required for execution are called virtual memory technique.
The binary addresses that the processor issues for either instruction or data are called
virtual or logical addresses.
Most commercial virtual memory systems incorporate a mechanism that can avoid the bulk of
the main memory access called for by the virtual to physical addresses translation buffer.
This may be done with a cache memory called a translation buffer, which retains the results of
the recent translation.
1. Video terminals
2. Video displays
3. Alphanumeric displays
4. Graphics displays
5. Flat panel displays
6. Printers
7. Plotters
The location containing an instruction that may be fetched and then discarded because of
the branch is called branch delay slot.
Optical or light based techniques for data storage, such memories usually employ optical
disk which resemble magnetic disk in that they store binary information in concentric tracks on
an electromechanically rotated disks. The information is read as or written optically, however
with a laser replacing the read write arm of a magnetic disk drive. Optical memory offer high
storage capacities but their access rate is are generally less than those of magnetic disk.
Static memory are memories which require periodic no refreshing. Dynamic memories are
memories, which require periodic refreshing.
51. What are the steps required for a pipelinened processor to process the instruction?
The transfer of instructions through various stages of the cpu instruction cycle., including
fetch opcode,decode opcode,compute operand addresses. Fetch operands, execute instructions
and store results. This amounts to realizing most (or) all of the cpu in the form of multifunction
pipeline called an instruction pipelining.
The term memory latency is used to refer to the amount of time it takes to transfer a word of
data to or from the memory. The term latency is used to denote the time it takes to transfer the
first word of data. This time is usually substantially longer than the time needed to transfer each
subsequent word of a block.
Bandwidth is a product of the rate at which the data are transferred (and accessed) and the
width of the data bus.
A successful access to data in a cache is called a hit. Number of hits stated as a fraction of all
attempted accesses is called the hit rate.
A miss rate is the number of misses stated as a fraction of attempted accesses. Extra time
needed to bring the desired information into the cache is called the miss penalty.
assembling the operating system routine into a virtual address space , is called system
space that is separate from the virtual space in which user application program reside. The
latter space is called user space.
59.Define
A combinational circuit is a connected arrangement of logic gates with the set of inputs and
outputs. A combinational circuit transforms binary information from the given input data to the
required output data. Combinational are employed in digital computers required for generating
binary control decisions and for providing digital components required for data processing.
61. Define sequential circuits.
62.Define interface.
The word interface refers to the boundary between two circuits or devices.
PART-A
1. Explain the concept of pipelining.
Pipelining is an implementation technique whereby multiple instructions are overlapped in
execution. It takes advantage of parallelism that exists among actions needed to execute an
instruction.
7. Explain the idea behind dynamic scheduling? Also write advantages of dynamic
scheduling.
In dynamic scheduling the hardware rearranges the instruction to reduce the stalls while
maintaining dataflow and exception behavior.
Advantages: it enables handling some cases when dependence are unknown at compile time
It allows code that was compiled with one pipeline in mind and run efficiently on a different
pipeline.
PART B
Dynamic scheduling
3.Explain dynamic scheduling using tomosulo’s approach with an example. (or)
Explain technique used for overcoming the data hazards with an example.
-Diagram
4.Explain in detail about the hardware based speculation and explain how it
overcomes the control dependencies. (16)
Hardware-based speculation combines three key
ideas: Role of re-order buffer
Issue
Execute
Write result
Commit
5.Explain the basic compiler techniques for exposing ilp.
(8) Basic Pipeline Scheduling and Loop
Unrolling
6. Explain static and dynamic branch prediction schemes (16)
UNIT-III
PART-A
They uses multiple, independent functional units. Rather than attempting to issue multiple,
independent instructions to the units, a VLIW packages the multiple operations into one very
long instruction.
They are less expensive, they have cache based memory system, more parallelism.
3. What are loop carried dependence?
They focus on determining whether data accesses in later iterations are dependent on data
values produced in earlier iterations; such a dependence is called loop carried dependence.
E.g for(i=1000;i>0;i=i-1)
4. Use the G.C.D test to determine whether dependence exists in the following loop.
for(i=1;i<=100;i=i+1)
X[2*i+3]=X[2*i]*5.0; Solution:
a=2, b=3, c=2, d=0 GCD(a,c)=2
and d-b=-3
Since 2 does not divide -3, no dependence is possible.
5. What is software pipelining?
Software pipelining is a technique for reorganizing loops that each iteration in the software
pipelined code is made from instruction chosen from different iterations of the original loop.
Global code scheduling aims to compact code fragment with internal control structure into
the shortest possible sequence that preserves the data and control dependence. Finding a
shortest possible sequence is finding the shortest sequence for the critical path.
7. What is Trace?
Trace selection tries to find a likely sequence of basic blocks whose operations will be put
together into smaller number of instructions, this sequence is called trace.
The hardware model, limitation on the window size and maximum issue count, the effects of
realistic branch and jump prediction, the effects of finite registers, the effect of imperfect
alias analysis.
Within a basic block, algebraic simplification of expressions and an optimization called copy
propagation. Which eliminates operations that copy values, can be used to simplify
sequences like the following:
A special register called the current frame pointer points to the set of registers to be used
by a given procedure.
PART B
ILP in VLIW
EPIC ARCHITECTURE
UNIT-IV
PART-A
4. What are the ways to maintain cache coherence? OR what are the ways to enforce
cache coherence?
Directory based protocol, Snooping based protocol.
5. What are the ways to maintain cache coherence using snooping protocol?
Write invalidate protocol, write update or write broadcast protocol.
6. What is write invalidate and write update protocol?
Write invalidate provide exclusive access to caches. These exclusive caches ensure that
no other readable or writable copies of an item exist when the write occurs.
Write updates protocol updates all cached copies of a data item when that item is
written.
12. Explain the types of messages that can be send between the processors and directories?
Local Node: node where the requests originates
Home Node: Node where memory location and directory entry of the address resides.
Remote Note: the copy of the block in the third node called remote node.
13. What is consistency? And what are the models used for consistency?
Consistency says in what order processor must observe the data writes of another
processor.
Models used for Consistency:
Sequential Consistency
Model, Relaxed Consistency
Model
It is hard to overcome throughput losses from shorter stalls, due to pipeline start-up
costs.
Since CPU, issues instruction from 1 thread, when a stall occurs, the pipeline must
be emptied or frozen.
New thread must fill pipeline before instruction can complete.
PART B
1. Explain the symmetric shared memory architecture. Explain the snooping based
protocols with neat diagram. (16)
Symmetric Shared Memory Architectures:
Cache Coherence in Multiprocessors:
Basic Schemes for Enforcing Coherence
Snooping Protocols Write invalidate Write update protocol
State transition diagram
2.Explain the concept of distributed shared memory and also explain directory
based protocols with an example. (or) Explain the numa architecture with neat
diagram.
Directory-Based Cache-Coherence Protocols: The Basics
3.Explain the basics of multithreading and it types. (8) (or) Explain how
multihreading approach can be used to exploit thread level parallelism within
a processor? (8)
Fine-grained multithreading
Coarse-grained multithreading
Simultaneous Multithreading
Design Challenges in SMT processors
4. Explain in detail about the need for consistency models and it types.(8)
Sequential consistency
Relaxed Consistency Models:
5. Explain the performance of the symmetric shared memory with necessary graphs. (16)
6.Briefly explain multicore architecture with neat diagram and also write its applications.
(8) (or) chip multiprocessors (or) cmp
architecture. Single-core computer
CMP architecture:
CHIP multi threading:
Multi- core
architectures
7.Explain in detail about the various types of software and hardware based
multi- threading. (8)
Software and hardware
multithreading Multithreading in
Hardware Hardware Multithreading
Techniques
Cycle-by-cycle interleaving (Fine Grained
Multithreading) Block interleaving (Coarse Grain
Multithreading)
8.Explain the architectural features of ibm cell processors with neat diagram. (or) explain
cell processors. (or) explain cell broadband engine. (10)
IBM Cell Processor Applications
of cell processors Power
Processor Element (PPE)
Synergistic Processing Elements
Input/output interfaces
9.Explain simultanous multi-threading concept for converting thread level parallelism
into instruction level parallelism. (8)
Design Challenges in SMT
Potential Performance Advantages from SMT
UNIT-V
PART-A
5. What is stripping?
Spreading multiple data over multiple disks is called stripping, which automatically forces
accesses to several disks.
Disks in the configuration are mirrored or copied to another disk. With this arrangement
data on the failed disks can be replaced by reading it from the other mirrored disks.
Drawback: writing onto the disk is slower since the disks are not synchronized, seek
time will be different.
It imposes 50% space penalty hence expensive.
The sum of entry time, response time and think time is called transaction time.
9. State little law?
Little law relates the average number of tasks in the system. It relates to Average arrival
rate of new tasks with the average time to perform a task.
I/O buses – these buses are lengthy and have any types of devices connected to it. CPU
memory buses – They are short and generally of high speed.
12. What is bus master?
Bus master are devices that can initiate the read or write transaction.
Eg. Processor – processor are always has the bus mastership.
It offers higher bandwidth by using packets, as opposed to holding the bus for full
transaction.
The idea behind this is to split the bus into request and replies, so that the bus can be used in
the time between request and the reply.
Access Time: Is the time between when a read is required and when the desired word
arrives. Cycle Time: Is the minimum time between requests to memory.
PART B
1. Explain in detail about the vaious optimization techniques for improving the cache
performance. (16)
Cache performance
There are 17 cache optimizations into four categories:
First Miss Penalty Reduction Technique: Multi-Level Caches
Second Miss Penalty Reduction Technique: Critical Word First and Early Restart
Third Miss Penalty Reduction Technique: Giving Priority to Read Misses over
Writes
Fourth Miss Penalty Reduction Technique: Merging Write Buffer
Fifth Miss Penalty Reduction Technique: Victim Caches
First Miss Rate Reduction Technique: Larger Block Size
Second Miss Rate Reduction Technique: Larger caches Miss
Rate Reduction Technique: Higher Associativity
Fourth Miss Rate Reduction Technique: Way Prediction and Pseudo-Associative
Caches Fifth Miss Rate Reduction Technique: Compiler Optimizations
Magnetic
Disks Optical
Disks
Magnetic Tape
Automated Tape Libraries
Flash Memory
4. Explain the buses and i/o devices (16)