
Instruction Set Architectures

Source: Null and Lobur, The Essentials of Computer Organization and Architecture, Chapter 5

What distinguishes instruction sets from each other?
Operand storage:
Stack structure
Registers
Both

Number of explicit operands per instruction


Operand location (which combinations of operand locations are allowed per instruction)
Register-to-register
Register-to-memory
Memory-to-memory

Operations
Types of operations
Which operations are allowed to access memory

Type and size of operands


Addresses, numbers, or characters?

Design decisions
Instruction set must match the architecture
Factors
Amount of space a program requires
Complexity of the instruction set
Amount of decoding necessary
Complexity of the tasks performed by the instruction

Length of the instructions


Total number of instructions

Trade-offs
Short vs. long instructions
Short instructions take less space and are fetched quickly, but they limit the number of instructions as well as the size and number of operands.

Fixed length vs. variable length
Fixed-length instructions are easier to decode but waste space.

Memory organization affects instruction format


E.g. Is the memory byte-addressable?

Fixed length, but expandable in the number of operands (expanding opcodes)
Number of addressing modes
Big-endian or little-endian?
How many registers?
How should they be organized?
How should operands be stored in the CPU?

Internal storage in the CPU


Three possible choices in how to store data in the CPU
Stack architecture

Operands are kept on a stack, with the operands in use at the TOP


Facilitates evaluation of expressions
No random access, therefore code produced is inefficient
Stack causes a bottleneck

Accumulator architecture
One operand is implicitly in the accumulator
Reduces complexity
Memory traffic is high

General-purpose register (GPR) architecture


Longer fetch and decode times
Useful for machines with slow memories
Three types
Memory-memory
Register-memory
Load-store

Number of operands and instruction lengths
Fixed length
Wastes space
Fast
Better performance when pipelining is used

Variable length
More complex to decode
Saves storage space

Compromise: two or three instruction lengths. Instruction lengths must be compared to the word length of the machine. If unaligned, space is wasted.
Common instruction formats

Opcode only
Opcode + 1 address
Opcode + 2 addresses
Opcode + 3 addresses
Comparison of architectures: treatment of opcodes
Stack architecture

Opcodes take operands from the top of the stack
Intermediate results are put on the top of the stack
Provides a mechanism for parameter passing
Requires a PUSH and a POP instruction, each of which takes a single operand X
PUSH X retrieves the data value at memory location X and places it on the TOP of the stack.
POP X removes the element at the TOP of the stack and puts it in memory location X.

Efficient for evaluating long arithmetic expressions in reverse Polish notation (RPN), aka postfix notation
Only certain instructions (PUSH and POP) are allowed to access memory; all other instructions must take their operands from the stack.
Most instructions contain only opcodes.
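To make this concrete, here is a minimal Python sketch of a stack machine evaluating Z = (X + Y) * W via its postfix form X Y + W *. The mnemonics and the tiny memory model are illustrative, not taken from any particular ISA; note that only PUSH and POP touch memory.

# Minimal sketch of a stack machine evaluating Z = (X + Y) * W
# in postfix (RPN) form: X Y + W *. Mnemonics are illustrative.
memory = {"X": 2, "Y": 3, "W": 4, "Z": None}
stack = []

def PUSH(addr):
    stack.append(memory[addr])       # memory -> top of stack

def POP(addr):
    memory[addr] = stack.pop()       # top of stack -> memory

def ADD():
    b, a = stack.pop(), stack.pop()  # operands come from the stack...
    stack.append(a + b)              # ...and the result goes back on top

def MUL():
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)

# X Y + W *  ==  (X + Y) * W
PUSH("X"); PUSH("Y"); ADD()
PUSH("W"); MUL()
POP("Z")
print(memory["Z"])                   # (2 + 3) * 4 = 20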

Expanding opcodes
Number of operands is dependent on the instruction length.
Not all instructions have the same number of operands.
Expanding opcodes: a compromise between having a rich set of opcodes and having short opcodes
Short opcodes: leave room for many operands
Rich set: more of the bits are used up encoding unique instructions

Example: a 16-bit instruction format allows at least two possibilities


4-bit opcode followed by three 4-bit addresses (where the address is a
register and there are 16 registers)
4-bit opcode followed by a 12-bit address (where the address may be a
memory location).

Expanding opcode: Example

Specifications (instruction format has 16 bits):
15 instructions with three addresses
14 instructions with two addresses
31 instructions with one address
16 instructions with zero addresses

Following the previous example, other patterns are


8-bit opcode with two 4-bit addresses
12-bit opcode with one 4-bit address

An implementation of the specifications

15 3-address codes:
0000 R1 R2 R3 through 1110 R1 R2 R3 (where each Rn is 4 bits)

14 2-address codes:
1111 0000 R1 R2 through 1111 1101 R1 R2

31 1-address codes:
1111 1110 0000 R1 through 1111 1111 1110 R1

16 0-address codes:
1111 1111 1111 0000 through 1111 1111 1111 1111
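The escape-code pattern above can be verified mechanically. Below is a Python sketch (the check helper and its name are my own, not from the text): at each level, every unused opcode pattern expands into 2^field_bits longer opcodes.

# Sketch: verify that an expanding-opcode allocation fits the word size.
def check(word_bits, field_bits, counts):
    # counts[0] = #opcodes with the max number of address fields,
    # counts[1] = #opcodes with one fewer field, and so on.
    assert len(counts) <= word_bits // field_bits
    free = 1                      # unallocated opcode prefixes at the current level
    for need in counts:
        free *= 2 ** field_bits   # dropping one address field frees field_bits opcode bits
        if need > free:
            return False
        free -= need              # leftover patterns act as escape codes to the next level
    return True

print(check(16, 4, [15, 14, 31, 16]))   # True: the allocation above fits exactly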

What if the specifications are:
12-bit instruction format
There are 8 registers.
Instructions cannot access memory directly.
Will these support:
4 instructions with 3 registers,
255 instructions with 1 register, and
16 instructions with 0 registers?
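Using the same counting idea as the sketch above: 8 registers mean 3-bit register fields, so opcodes come in 3-, 6-, 9-, and 12-bit sizes. A self-contained check (the two-register level exists in the scheme but goes unused):

# 12-bit instructions, 8 registers -> 3-bit register fields.
free = 8 - 4             # 3-bit opcodes left after 4 three-register instructions
free = free * 8          # expand through the unused two-register level: 32 six-bit prefixes
free = free * 8 - 255    # 9-bit opcodes left after 255 one-register instructions: 1
print(free * 8 >= 16)    # False: only 8 twelve-bit patterns remain, but 16 are needed

So the answer is no: the zero-register level falls short by 8 encodings.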

Instruction Types
Data Movement
Arithmetic Operations
Boolean Logic Instructions
Bit Manipulation Instructions
Input/Output Instructions
Instructions for Transfer of Control
Special Purpose Instructions
Orthogonality

No redundancy in instructions
Instruction set must be consistent
Addressing modes of operands must be independent of the operands
Consequences:
Facilitates language compiler construction
Long instruction words
Longer programs
More memory use

Addressing Modes
Immediate addressing: value follows the opcode
Direct addressing: the operand's value is fetched from the memory address specified in the instruction
Register addressing: register is used to specify the operand

Indirect addressing: the bits in the address field specify a pointer to the operand
Register indirect addressing: a register contains the pointer

Indexed addressing: an index register is used to store an offset, which is added to the address of the operand to form the effective address
Based addressing: the same idea, but a base address register is used to store the offset

Stack addressing: operand is assumed to be on the stack

Variants of previous addressing modes
Indirect indexed addressing: Indirect and indexed
addressing are used at the same time
Base/offset addressing: adds an offset to a specific base register, then adds this sum to the specified operand (which is a pointer), yielding the address of the actual operand used in the instruction.
Auto-increment and auto-decrement: automatically
increments or decrements the register used, thus
reducing the code size
Self-relative addressing: computes the address of the operand as an offset from the current instruction
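The sketch below shows how a few of these modes resolve to an operand value. The toy memory, register names, and the operand helper are all my own illustration, not from the text.

# Toy model: how a few addressing modes resolve an operand.
memory = {0x10: 0x20, 0x20: 7, 0x30: 9}
registers = {"R1": 0x10, "X": 0x20}  # X plays the role of an index register

def operand(mode, field):
    if mode == "immediate":          # value is the field itself
        return field
    if mode == "direct":             # field is the operand's address
        return memory[field]
    if mode == "indirect":           # field points to a pointer
        return memory[memory[field]]
    if mode == "register":           # field names a register holding the operand
        return registers[field]
    if mode == "register_indirect":  # register holds the operand's address
        return memory[registers[field]]
    if mode == "indexed":            # index register + field gives the address
        return memory[registers["X"] + field]
    raise ValueError(mode)

print(operand("immediate", 0x10))          # 16
print(operand("direct", 0x10))             # 32
print(operand("indirect", 0x10))           # memory[0x20] = 7
print(operand("register", "R1"))           # 16
print(operand("register_indirect", "R1"))  # memory[0x10] = 32
print(operand("indexed", 0x10))            # memory[0x20 + 0x10] = 9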

Pipelining
Fetch-decode-execute: normally each clock pulse controls a step
What if the steps are broken down into smaller steps, and some steps
can be performed in parallel?
Ministeps:
1. Fetch instruction
2. Decode opcode
3. Calculate effective address of operands
4. Fetch operands
5. Execute instruction
6. Store result

Overlapping the steps results in a speedup of execution. The method is called pipelining, and it is used to exploit instruction-level parallelism (ILP).
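The staggered stage occupancy can be printed directly. This small Python sketch (my own, with an arbitrary 6 stages and 4 instructions) reproduces the diagram on the next slide.

# Print a pipeline diagram: instruction i+1 is in stage (c - i) at cycle c.
STAGES, INSTRUCTIONS = 6, 4
cycles = STAGES + INSTRUCTIONS - 1      # k + n - 1 = 9 cycles in total

print("Cycle:         " + " ".join(f"{c:>3}" for c in range(1, cycles + 1)))
for i in range(INSTRUCTIONS):
    row = []
    for c in range(1, cycles + 1):
        stage = c - i                   # instruction i+1 enters S1 at cycle i+1
        row.append(f" S{stage}" if 1 <= stage <= STAGES else "   ")
    print(f"Instruction {i + 1}: " + " ".join(row))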

How pipelining works

A 6-stage pipeline overlapping four instructions (instruction i enters stage S1 at cycle i):

Cycle:          1    2    3    4    5    6    7    8    9
Instruction 1:  S1   S2   S3   S4   S5   S6
Instruction 2:       S1   S2   S3   S4   S5   S6
Instruction 3:            S1   S2   S3   S4   S5   S6
Instruction 4:                 S1   S2   S3   S4   S5   S6

Speedup computation
Speedup is affected by the number of stages.
For a k-stage pipeline

Assume a clock cycle time of tp, i.e., it takes tp time per stage.
Assume n instructions (tasks) to process.
Thus, Task 1 takes k x tp time to complete.
The remaining n - 1 tasks emerge from the pipeline one per cycle, giving a total time for these tasks of (n - 1)tp.
Thus, to complete n tasks using a k-stage pipeline requires
(k x tp) + (n - 1)tp = (k + n - 1)tp
or
k + (n - 1) clock cycles.

Speedup computation
Without a pipeline, n instructions require n x tn time (i.e., n x k cycles), where tn = k x tp is the time per instruction.
To compute speedup, we divide the time required without a pipeline by the time required with a pipeline:

S = (n x tn) / ((k + n - 1) x tp) = (n x k x tp) / ((k + n - 1) x tp) = (n x k) / (k + n - 1)

Thus as n grows large, k + n - 1 approaches n, and S approaches k, a theoretical speedup equal to the number of stages in the pipeline.
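A quick numeric check of the formula, as a Python sketch (the stage count of 6 and the instruction counts are arbitrary examples):

# Pipeline speedup: S = (n * k) / (k + n - 1), approaching k as n grows.
def speedup(k, n):
    without_pipeline = n * k     # n instructions, k cycles each (tn = k * tp)
    with_pipeline = k + n - 1    # first instruction takes k cycles, the rest 1 each
    return without_pipeline / with_pipeline

for n in (10, 100, 1000, 100000):
    print(n, round(speedup(6, n), 3))   # tends toward k = 6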

Constraints of pipelining
Resource conflicts (structural hazards)
Data dependencies
Conditional branch statements
Proposed solutions
Branch prediction
Delayed branch (compiler resolution through a rearrangement of the
machine code)
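As a concrete illustration of a data dependency, the toy sketch below flags a read-after-write (RAW) hazard: an instruction that reads a register written by its immediate predecessor. The tuple representation of instructions is my own, not from the text.

# Toy RAW-hazard check over (destination, sources) per instruction.
program = [
    ("R1", ["R2", "R3"]),   # R1 = R2 + R3
    ("R4", ["R1", "R5"]),   # R4 = R1 + R5  <- reads R1 before it is written back
    ("R6", ["R2", "R7"]),   # independent of the previous instruction
]

for prev, curr in zip(program, program[1:]):
    dest, _ = prev
    _, sources = curr
    if dest in sources:
        print(f"RAW hazard: {curr} depends on result {dest} of {prev}")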

Sample of codes with the prefix property

IP addresses (in classful IPv4, the leading bits 0, 10, 110, 1110, and 1111 identify classes A through E)
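Expanding opcodes rely on the same property: no code word is a prefix of another. A minimal Python check (my own sketch; after sorting, any prefix lands immediately before one of its extensions):

# Check the prefix property over a set of bit-string codes.
def prefix_free(codes):
    codes = sorted(codes)   # a prefix sorts directly before its extensions
    return all(not b.startswith(a) for a, b in zip(codes, codes[1:]))

print(prefix_free({"0000", "1110", "11110000"}))  # True
print(prefix_free({"1111", "11110000"}))          # False: 1111 prefixes 11110000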
