UCS 8043 /UIT 8082 EMBEDDED SYSTEMS

UNIT -1

1. INTRODUCTION

What Is a Computer?

A computer is made up of hardware and software. The hardware of a
computer consists of four types of components:

✿ Processor. The processor is responsible for performing all of the
computational operations and the coordination of the usage of resources
of a computer. A computer system may consist of one or multiple
processors. A processor may perform general-purpose computations or
special-purpose computations, such as graphical rendering, printing, or
network processing.

✿ Input devices. A computer is designed to execute programs that
manipulate certain data. Input devices are needed to enter the program
to be executed and data to be processed into the computer. There are a
wide variety of input devices: keyboards, keypads, scanners, bar code
readers, sensors, and so on.

✿ Output devices. Whether the user uses the computer to perform a
computation or to find information on the Internet or in a database, the
end results must be displayed or printed on paper so that the user can see
them. There are many media and devices that can be used to present the
information: CRT displays, flat-panel displays, seven-segment displays,
printers, light-emitting diodes (LEDs), and so on.

✿ Memory devices. Programs to be executed and data to be processed
must be stored in memory devices so that the processor can readily
access them.

The Processor

A processor is also called the central processing unit (CPU). The processor
consists of at least the following three components:

Registers. A register is a storage location inside the CPU. It is used to
hold data and/or a memory address during the execution of an instruction.
Because registers are located inside the CPU, they can provide fast access to
operands for program execution. The number of registers varies greatly
from processor to processor.

Arithmetic logic unit (ALU). The ALU performs all the numerical
computations and logical evaluations for the processor. The ALU receives
data from the memory, performs the operations, and, if necessary, writes
the result back to the memory. Today's supercomputers can perform
trillions of operations per second. The ALU and registers together are
referred to as the datapath of the processor.

Control unit. The control unit contains the hardware instruction logic.
The control unit decodes and monitors the execution of instructions. The
control unit also acts as an arbiter as various portions of the computer
system compete for the resources of the CPU. The activities of the CPU
are synchronized by the system clock. The clock rates of modern
microprocessors have exceeded 3.0 GHz at the time of this writing, where
1 GHz = 1 billion cycles per second. The period of a 1-GHz clock signal is 1
ns (10^-9 second). The control unit also maintains a register called the
program counter (PC) that keeps track of the address of the next
instruction to be executed. During the execution of an instruction, the
occurrence of an overflow, an addition carry, a subtraction borrow, and so
forth are flagged by the system and stored in another register called a
status register. The resultant flags are then used by the programmer for
program flow control and decision making.
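
As a rough illustration of the clock-period arithmetic and the status flags described above, the following Python sketch models an 8-bit addition. The function names (clock_period_ns, add8) and the flag letters C, Z, N, and V are assumptions chosen for this example; real processors name and lay out their status bits differently.

```python
def clock_period_ns(frequency_hz):
    """Return the clock period in nanoseconds for a frequency in hertz."""
    return 1e9 / frequency_hz


def add8(a, b, carry_in=0):
    """Add two 8-bit values as a simple ALU might, returning (result, flags)."""
    raw = a + b + carry_in
    result = raw & 0xFF  # keep only the low 8 bits
    flags = {
        "C": raw > 0xFF,           # carry out of bit 7
        "Z": result == 0,          # result is zero
        "N": bool(result & 0x80),  # bit 7 set: negative in two's complement
        # signed overflow: both operands share a sign that the result does not
        "V": bool(~(a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags


print(clock_period_ns(1e9))  # period of a 1-GHz clock, in ns

result, flags = add8(0x7F, 0x01)
if flags["V"]:
    # The program tests a status flag to decide what to do next.
    print("signed overflow, result =", hex(result))
```

The final `if` plays the role of a conditional branch that tests a flag in the status register.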

The Microprocessor

The advancement of semiconductor technology allows the circuitry
of a complete processor to be placed in one integrated circuit (also called
a chip). A microprocessor is a processor packaged in a single integrated
circuit. A microcomputer is a computer that uses a microprocessor as its
CPU. A personal computer (PC) is a microcomputer. Early microcomputers
were very slow. However, many personal computers manufactured in
2003 run at a clock rate higher than 3.0 GHz and are faster than some
supercomputers of a few years ago. Depending on the number of bits that
a microprocessor can manipulate in one operation, a microprocessor is
referred to as 4-bit, 8-bit, 16-bit, 32-bit, or 64-bit. This number is the word
length (or datapath length) of the microprocessor. Currently, the most
widely used microprocessors are 8-bit.

Although the clock rate of the microprocessor has been increased
dramatically, the improvement in the access time (or simply the
speed) of the high-capacity memory chips (especially the most widely
used DRAM chips to be discussed in Section 1.2.4) has been moderate at
best. The microprocessor may complete one arithmetic operation in one
clock cycle; however, it may take many clock cycles to access data from
the memory chip. This disparity in speed makes the high clock rate of the
microprocessor alone useless for achieving high throughput. The solution
to this issue is adding a small high-speed memory to the CPU chip. This
on-chip memory is called cache memory. The CPU can access data from
the on-chip cache memory in one or two clock cycles because it is very
close to the ALU. The cache memory is effective in improving the average
memory access time because the CPU demonstrates locality in its access
behavior. Within a short period of time, the CPU tends to access a small
area in the memory repeatedly. Once the program segment or data has
been brought into the cache, it will be referenced many times. This results
in an average memory access time very close to that of the access time of
the cache memory.
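
The effect of locality on the average access time can be estimated with the usual weighted-average formula. The sketch below is only illustrative: the function name, the 1-cycle cache hit time, the 50-cycle main-memory penalty, and the hit rates are assumed numbers, not measurements from any particular processor.

```python
def average_access_time(hit_time, miss_penalty, hit_rate):
    """Average memory access time in clock cycles.

    Every access pays hit_time; the fraction that misses the cache
    also pays miss_penalty to reach main memory.
    """
    return hit_time + (1.0 - hit_rate) * miss_penalty


# Assumed figures: 1-cycle cache hit, 50 extra cycles to reach main memory.
for hit_rate in (0.50, 0.90, 0.98):
    cycles = average_access_time(hit_time=1, miss_penalty=50, hit_rate=hit_rate)
    print(f"hit rate {hit_rate:.2f}: about {cycles:.1f} cycles per access")
```

Under these assumptions, a 98% hit rate brings the average down to roughly 2 cycles, close to the access time of the cache itself.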

Microprocessors and input/output (I/O) devices have different
characteristics and speed. A microprocessor is not designed to deal with
I/O devices directly. Instead, peripheral chips (also called interface chips)
are needed to make up the difference between the microprocessor and
the I/O devices. For example, the Intel i8255 was designed to interface the
8-bit 8080 microprocessor from Intel, and the M6821 was designed to
interface the 8-bit 6800 from Motorola with I/O devices.

Microprocessors have been widely used in many applications since
they were invented. However, there are several limitations in the initial
microprocessor designs that led to the development of microcontrollers:
✿ External memory chips are needed to hold programs and data because
the early microprocessors did not have on-chip memory.
✿ Glue logic (such as address decoder and buffer chips) is required to
interface with the memory chips.
✿ Peripheral chips are needed to interface with I/O devices.

Because of these limitations, a product designed with microprocessors
cannot be made as compact as might be desirable. The development of
microcontrollers has not only eliminated most of these problems but also
enabled the design of many low-cost microprocessor-based products.

Microcontrollers

A microcontroller, or MCU, is a computer implemented on a single
very large scale integrated (VLSI) circuit. In addition to those components
contained in a microprocessor, an MCU also contains some of the
following peripheral components:
✿ Memory
✿ Timers, including event counting, input capture, output compare, real-
time interrupt, and watchdog timer
✿ Pulse-width modulation (PWM)
✿ Analog-to-digital converter (ADC)
✿ Digital-to-analog converter (DAC)
✿ Parallel I/O interface
✿ Asynchronous serial communication interface (UART)
✿ Synchronous serial communication interfaces (SPI, I2C, and CAN)
✿ Direct memory access (DMA) controller
✿ Memory component interface circuitry
✿ Software debug support hardware

The discussion of the functions and applications of these components is
the subject of this text. Most of these functions are discussed in detail in
later chapters.

Since their introduction, MCUs have been used in almost every application
that requires a certain amount of intelligence. They are used as controllers
for displays, printers, keyboards, modems, charge card phones, palm-top
computers, and home appliances, such as refrigerators, washing
machines, and microwave ovens. They are also used to control the
operation of engines and machines in factories. One of the most important
applications of MCUs is probably automotive control: a luxury car may
use more than 100 MCUs. Most homes today have one or more
MCU-controlled consumer electronics appliances. In these
applications, people care only about the functionality of the end product
rather than the MCUs being used to perform the control function. Products
of this nature are often called embedded systems.

Embedded Systems - The New Realities

"At the root of cascading changes of modern economic life...devaluing the constraints of material resources, the
microchip has devalued most resources in technology, business and geopolitics...overcoming the large
accumulations of physical capital and made possible the launching of global economic enterprises...microchips
find their value not in their substance but in their intellectual content: their design..."

2. LOGIC GATES
A logic gate performs a logical operation on one or more logic-level inputs and produces a single
logic-level output. The logic normally performed is Boolean logic and is most commonly found in
digital circuits. Logic gates are primarily implemented electronically using diodes or
transistors, but can also be constructed using electromagnetic relays, fluidics, optics, or even
mechanical elements. Because the output is also a logic-level value, the output of one logic gate
can connect to the input of one or more other logic gates.

In electronic logic, a logic level is represented by a certain voltage (which depends on the
type of electronic logic in use). Each logic gate requires power so that it can source and sink
currents to achieve the correct output voltage. In logic circuit diagrams the power is not
shown, but in a full electronic schematic, power connections are required. There are seven
basic positive-logic gates (AND, OR, NOT, NAND, NOR, XOR, and XNOR), and each gate has two laws or rules.

Truth Table
A truth table is a table that describes the behaviour of a logic gate. It lists the value of the
output for every possible combination of the inputs and can be used to reduce the number of
logic gates and the level of nesting in an electronic circuit.
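
A truth table can be generated mechanically by trying every input combination. The following Python sketch does this for two example gates; the helper names (truth_table, and_gate, xor_gate) are chosen for illustration only.

```python
from itertools import product


def truth_table(gate, n_inputs):
    """Return [(inputs, output), ...] for every combination of 0/1 inputs."""
    return [(bits, gate(*bits)) for bits in product((0, 1), repeat=n_inputs)]


def and_gate(a, b):
    return a & b


def xor_gate(a, b):
    return a ^ b


# Print the XOR truth table, one row per input combination.
for inputs, output in truth_table(xor_gate, 2):
    print(inputs, "->", output)
```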

Types
NAND and NOR logic gates are the two pillars of logic, in that all other types of Boolean
logic gates (i.e., AND, OR, NOT, XOR, XNOR) can be created from a suitable network of
just NAND or just NOR gate(s). They can be built from relays or transistors, or any other
technology that can create an inverter and a two-input AND or OR gate. Hence the NAND
and NOR gates are called the universal gates.
For two input variables, there are 16 possible Boolean functions. These 16
functions are enumerated below, together with their outputs for each combination of input
variables.
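
Both claims can be checked with a few lines of Python. The sketch below builds NOT, AND, OR, and XOR from a single nand function and then counts the possible two-input functions; the trailing underscores in the names only avoid clashes with Python keywords, and the particular NAND networks shown are one common construction among several.

```python
from itertools import product


def nand(a, b):
    return 1 - (a & b)


# Every gate below is built from nand() alone.
def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))


def xor_(a, b):
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


INPUTS = list(product((0, 1), repeat=2))  # (0,0), (0,1), (1,0), (1,1)

# A two-input function is fully described by its 4 output bits,
# so there are 2**4 possible output columns.
all_functions = list(product((0, 1), repeat=len(INPUTS)))
print(len(all_functions), "possible two-input functions")

for a, b in INPUTS:
    print(a, b, "AND:", and_(a, b), "OR:", or_(a, b), "XOR:", xor_(a, b))
```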

Symbol for AND Gate Symbol for OR Gate

1-bit Half Adder

1-bit Full Adder
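
The behaviour of the half adder and full adder can be sketched in Python as follows. This is a behavioural model under the usual textbook definitions (sum = XOR, carry = AND, a full adder made from two half adders and an OR gate); the ripple_add helper and its 8-bit default width are assumptions added to show how full adders chain together.

```python
def half_adder(a, b):
    """Return (sum, carry) for two 1-bit inputs."""
    return a ^ b, a & b


def full_adder(a, b, carry_in):
    """Return (sum, carry_out) using two half adders and an OR gate."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, c1 | c2


def ripple_add(x, y, width=8):
    """Add two unsigned integers bit by bit; return (result, carry_out)."""
    carry, total = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= bit << i
    return total, carry


print(full_adder(1, 1, 1))
print(ripple_add(200, 100))  # 300 does not fit in 8 bits, so a carry comes out
```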

Left & Right Logical Shift

Multiplexer

Decoder
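
As a behavioural sketch of the logical shift, multiplexer, and decoder blocks named above, the following Python functions model each one. The function names, the 8-bit default width, and the list-based representation of input and output lines are assumptions made for illustration.

```python
def shift_left(value, n, width=8):
    """Logical left shift: bits shifted past the top of the word are lost."""
    return (value << n) & ((1 << width) - 1)


def shift_right(value, n):
    """Logical right shift of an unsigned value: zeros enter from the left."""
    return value >> n


def mux(inputs, select):
    """Multiplexer: route the input line chosen by select to the output."""
    return inputs[select]


def decoder(select, n_select_bits):
    """Decoder: drive exactly one of 2**n output lines high."""
    return [1 if i == select else 0 for i in range(1 << n_select_bits)]


print(bin(shift_left(0b10010110, 1)), bin(shift_right(0b10010110, 1)))
print(mux([0, 1, 1, 0], select=2))
print(decoder(select=2, n_select_bits=2))
```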

ALU

3. TIMING DIAGRAM

D FLIP-FLOPS

Undeniably, the "D" Flip-Flop is the easiest one to work with. The name D comes from
"Data". This Flip-Flop is perfectly stable and very easy to control. In fact, the D Flip-Flop
is the best!

D FLIP-FLOP

Fig 5: D Flip-Flop (schematic and timing diagram)

Truth Table
D   Q   Q'
0   0   1
1   1   0

Description: Q takes the value of D on the next falling clock edge. Q' is the complement of Q.
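
The falling-edge behaviour can be sketched in Python as follows. This is a behavioural model only: the class name, the tick method, and the initial state Q = 0 are assumptions for illustration, and a real flip-flop also has setup and hold timing requirements that are not modelled here.

```python
class DFlipFlop:
    """Behavioural model of a falling-edge-triggered D flip-flop."""

    def __init__(self):
        self.q = 0          # assumed power-up state
        self._prev_clk = 0  # clock level seen on the previous call

    def tick(self, clk, d):
        """Present the clock and D levels; return (Q, complement of Q)."""
        if self._prev_clk == 1 and clk == 0:  # falling edge detected
            self.q = d
        self._prev_clk = clk
        return self.q, 1 - self.q


ff = DFlipFlop()
for clk, d in [(1, 1), (0, 1), (0, 0), (1, 0), (0, 0)]:
    print("clk", clk, "d", d, "->", ff.tick(clk, d))
```

Q changes only on the second and fifth steps, where the clock goes from 1 to 0; a change on D at any other time has no effect on the output.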

4. MEMORY
Programs and data are stored in memory in a computer system. A
computer may contain semiconductor, magnetic, and/or optical memories.
Only semiconductor memory is discussed in this text because magnetic
and optical memories are seldom used in 8-bit MCU applications.
Semiconductor memory can be further classified into two major types:
random-access memory (RAM) and read-only memory (ROM).

RANDOM-ACCESS MEMORY
Random-access memory is volatile in the sense that it cannot retain
data in the absence of power. RAM is also called read/write memory
because it allows the processor to read from and write into it. Both read
and write accesses to a RAM chip take roughly the same amount of time.
As long as the power is on, the microprocessor can write data into a
location in the RAM chip and read back the same contents later. Reading
memory is nondestructive. When the microprocessor writes data to
memory, the old data is written over and destroyed. There are two types
of RAM technologies: static RAM (SRAM) and dynamic RAM (DRAM). SRAM
uses from four to six transistors to store one bit of information. As long as
power is on, the information stored in the SRAM will not be degraded.
Dynamic RAM uses one transistor and one capacitor to store one bit of
information. The information is stored in the capacitor in the form of
electric charge. The charge stored in the capacitor will leak away over
time, so a periodic refresh operation is required to maintain the contents
of DRAM. RAM is mainly used to store dynamic programs and data. A
computer user often wants to run different programs on the same
computer, and these programs usually operate on different sets of data.
The programs and data must therefore be loaded into RAM from hard disk
or other secondary storage, and for this reason they are called dynamic.

READ-ONLY MEMORY
ROM is nonvolatile. If power is removed from ROM and then
reapplied, the original data will still be there. As its name implies, ROM
data can only be read. Strictly speaking, this is not exactly true: most ROM
technologies can be written, but only by using a special algorithm and voltage.
Without the special algorithm and voltage, any attempt to write to the ROM
will not be successful. There are many different kinds of ROM
technologies in use today:
Mask-programmed read-only memory (MROM) is a type of
ROM that is programmed when it is manufactured. The semiconductor
manufacturer places binary data in the memory according to the customer's
specification. To be cost effective, many thousands of MROM memory
chips, each containing a copy of the same data (or program), must be
sold. Many people simply refer to MROM as ROM.

Programmable read-only memory (PROM) is a type of read-only
memory that can be programmed in the field (often by the end user)
using a device called a PROM programmer or PROM burner. Once a PROM
has been programmed, its contents cannot be changed. PROMs are fuse-
based; that is, end users program the fuses to configure the contents of
memory.

Erasable programmable read-only memory (EPROM) is a type
of read-only memory that can be erased by subjecting it to strong
ultraviolet light. The circuit design of EPROM requires the user to erase
the contents of a location before a new value can be written into it. A
quartz window on top of the EPROM integrated circuit permits ultraviolet
light to be shone directly on the silicon chip inside. Once the chip is
programmed, the window can be covered with dark tape to prevent
gradual erasure of the data. If no window is provided, the EPROM chip
becomes one-time programmable (OTP) only. EPROM is often used in
prototype computers where the software may be revised many times until
it is perfected. EPROM does not allow erasure of the contents of an
individual location. The only way to make a change is to erase the entire
EPROM chip and reprogram it. The programming of an EPROM chip is done
electrically by using a device called an EPROM programmer. Today, most
programmers are universal in the sense that they can program many
different types of devices including EPROM, EEPROM, flash memory, and
programmable logic devices.

Electrically erasable programmable read-only memory
(EEPROM) is a type of nonvolatile memory that can be erased and
reprogrammed by electrical signals. Like EPROM, the circuit design of
EEPROM also requires the user to erase the contents of a memory location
before writing a new value into it. EEPROM allows each individual location
to be erased and reprogrammed. Unlike EPROM, EEPROM can be erased
and programmed using the same programmer. However, EEPROM pays
the price for being so flexible in its erasability. The cost of an EEPROM
chip is much higher than that of an EPROM chip of comparable density.

Flash memory was invented to incorporate the advantages and avoid the
disadvantages of both EPROM and EEPROM technologies. Flash memory
can be erased and reprogrammed in the system without using a dedicated
programmer. It achieves the density of EPROM, but it does not require a
window for erasure. Like EEPROM, flash memory can be programmed and
erased electrically. However, it does not allow the erasure of an individual
memory location—the user can only erase a section or the entire chip.
Today, more and more MCUs are incorporating on-chip flash memory for
storing programs and data. The flash-based PIC18 MCUs allow you to
erase one block of 64 bytes at a time.
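
The erase-before-write rule and block-level erasure can be modelled with a short Python sketch. The 64-byte block size follows the PIC18 example above; the class and method names are hypothetical, and the model assumes the common flash behaviour that an erased cell reads 0xFF and that programming can only change bits from 1 to 0.

```python
BLOCK_SIZE = 64  # bytes per erase block, as in the PIC18 example


class Flash:
    """Toy flash model: erase works on whole blocks, programming on bytes."""

    def __init__(self, n_blocks):
        self.cells = bytearray([0xFF] * (n_blocks * BLOCK_SIZE))

    def erase_block(self, block):
        """Return every byte of one block to the erased state (0xFF)."""
        start = block * BLOCK_SIZE
        self.cells[start:start + BLOCK_SIZE] = b"\xFF" * BLOCK_SIZE

    def program(self, addr, value):
        """Assumption: programming can only clear bits (1 -> 0)."""
        self.cells[addr] &= value


flash = Flash(n_blocks=2)
flash.program(0, 0x5A)
print(hex(flash.cells[0]))  # erased cell takes the new value
flash.program(0, 0xA5)
print(hex(flash.cells[0]))  # overwriting without erasing corrupts the byte
flash.erase_block(0)
flash.program(0, 0xA5)
print(hex(flash.cells[0]))  # after a block erase the write succeeds
```

In this model, the second write leaves the bitwise AND of the old and new values in the cell, which is why a location has to be erased before a new value is written.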

5. MICROPROCESSOR

A microprocessor is a programmable digital electronic component that incorporates the
functions of a central processing unit (CPU) on a single semiconducting integrated circuit
(IC). The microprocessor was born by reducing the word size of the CPU from 32 bits to 4
bits, so that the transistors of its logic circuits would fit onto a single part. One or more
microprocessors typically serve as the CPU in a computer system, embedded system, or
handheld device. Microprocessors made possible the advent of the microcomputer in the
mid-1970s. Before this period, electronic CPUs were typically made from bulky discrete
switching devices (and later small-scale integrated circuits) containing the equivalent of only
a few transistors. By integrating the processor onto one or a very few large-scale integrated
circuit packages (containing the equivalent of thousands or millions of discrete transistors),
the cost of processing capacity was greatly reduced. Since the mid-1970s, the microprocessor
has become the most prevalent implementation of the CPU, nearly completely replacing all
other forms.

Since the early 1970s, the increase in processing capacity of evolving microprocessors has
been known to generally follow Moore's Law. It suggests that the complexity of an integrated
circuit, with respect to minimum component cost, doubles about every two years (a period often
quoted as 18 months). In the early 1990s, microprocessors' heat generation (thermal design
power, TDP), due to current leakage, emerged as a leading developmental constraint. From
their humble beginnings as the drivers for
calculators, the continued increase in processing capacity has led to the dominance of
microprocessors over every other form of computer; every system from the largest
mainframes to the smallest handheld computers now uses a microprocessor at its core.

Notable 8-bit designs

The Intel 4004 was followed in 1972 by the 8008, the world's first 8-bit microprocessor.
These processors are the precursors to the very successful Intel 8080 (1974), Zilog Z80
(1976), and derivative Intel 8-bit processors. The competing Motorola 6800 was released
in August 1974. Its architecture was cloned and improved in the MOS Technology 6502 in
1975, which rivaled the Z80 in popularity during the 1980s.

Both the Z80 and 6502 concentrated on low overall cost, through a combination of small
packaging, simple computer bus requirements, and the inclusion of circuitry that would
normally have to be provided in a separate chip (for instance, the Z80 included a memory
controller). It was these features that allowed the home computer "revolution" to take off in
the early 1980s, eventually delivering such inexpensive machines as the Sinclair ZX-81,
which sold for US$99.

The Western Design Center, Inc. (WDC) introduced the CMOS 65C02 in 1982 and licensed
the design to several companies. It became the core of the Apple IIc and IIe personal
computers, implantable medical-grade pacemakers and defibrillators, and automotive, industrial,
and consumer devices. WDC pioneered the licensing of microprocessor technology, which
was later followed by ARM and other microprocessor Intellectual Property (IP) providers in
the 1990s.

Motorola trumped the entire 8-bit world by introducing the MC6809 in 1978, arguably one of
the most powerful, orthogonal, and clean 8-bit microprocessor designs ever fielded – and also
one of the most complex hard-wired logic designs that ever made it into production for any
microprocessor. Microcoding replaced hardwired logic at about this point in time for all
designs more powerful than the MC6809 – specifically because the design requirements were
getting too complex for hardwired logic.

Another early 8-bit microprocessor was the Signetics 2650, which enjoyed a brief flurry of
interest due to its innovative and powerful instruction set architecture.

A seminal microprocessor in the world of spaceflight was RCA's RCA 1802 (aka CDP1802,
RCA COSMAC), introduced in 1976, which was used in NASA's Voyager and Viking
spaceprobes of the 1970s, and onboard the Galileo probe to Jupiter (launched 1989, arrived
1995). RCA COSMAC was the first to implement CMOS technology. The CDP1802 was
used because it could be run at very low power, and because its production process (Silicon
on Sapphire) ensured much better protection against cosmic radiation and electrostatic
discharges than that of any other processor of the era. Thus, the 1802 is said to be the first
radiation-hardened microprocessor.

16-bit designs

The first multi-chip 16-bit microprocessor was the National Semiconductor IMP-16,
introduced in early 1973. An 8-bit version of the chipset was introduced in 1974 as the
IMP-8. During the same year, National introduced the first 16-bit single-chip microprocessor, the
National Semiconductor PACE, which was later followed by an NMOS version, the
INS8900.

Other early multi-chip 16-bit microprocessors include one used by Digital Equipment
Corporation (DEC) in the LSI-11 OEM board set and the packaged PDP 11/03
minicomputer, and the Fairchild Semiconductor MicroFlame 9440, both of which were
introduced in the 1975 to 1976 timeframe.

Another early single-chip 16-bit microprocessor was TI's TMS 9900, which was also compatible
with their TI-990 line of minicomputers. The 9900 was used in the TI 990/4 minicomputer,
the TI-99/4A home computer, and the TM990 line of OEM microcomputer boards. The chip
was packaged in a large ceramic 64-pin DIP package, while most 8-bit microprocessors such
as the Intel 8080 used the more common, smaller, and less expensive plastic 40-pin DIP. A
follow-on chip, the TMS 9980, was designed to compete with the Intel 8080, had the full TI
990 16-bit instruction set, used a plastic 40-pin package, moved data 8 bits at a time, but
could only address 16 KiB. A third chip, the TMS 9995, was a new design. The family later
expanded to include the 99105 and 99110.

The Western Design Center, Inc. (WDC) introduced the CMOS 65816 16-bit upgrade of the
WDC CMOS 65C02 in 1984. The 65816 16-bit microprocessor was the core of the Apple
IIgs and later the Super Nintendo Entertainment System, making it one of the most popular
16-bit designs of all time.

Intel followed a different path, having no minicomputers to emulate, and instead "upsized"
their 8080 design into the 16-bit Intel 8086, the first member of the x86 family which powers
most modern PC-type computers. Intel introduced the 8086 as a cost-effective way of porting
software from the 8080 lines, and succeeded in winning much business on that premise. The
8088, a version of the 8086 that used an external 8-bit data bus, was the microprocessor in the
first IBM PC, the model 5150. Following up their 8086 and 8088, Intel released the 80186,
80286 and, in 1985, the 32-bit 80386, cementing their PC market dominance with the
processor family's backwards compatibility.

The integrated microprocessor memory management unit (MMU) was developed by Childs
et al. of Intel, and awarded US patent number 4,442,484.

32-bit designs

16-bit designs had been on the market only briefly when full 32-bit implementations started to
appear.

The most significant of the 32-bit designs was the Motorola MC68000, introduced in 1979. The 68K, as it
was widely known, had 32-bit registers but used 16-bit internal data paths and a 16-bit
external data bus to reduce pin count, and supported only 24-bit addresses. Motorola
generally described it as a 16-bit processor, though it clearly has a 32-bit architecture. The
combination of high speed, large (16 mebibytes) memory space, and fairly low cost made it
the most popular CPU design of its class. The Apple Lisa and Macintosh designs made use of
the 68000, as did a host of other designs in the mid-1980s, including the Atari ST and
Commodore Amiga.

The world's first single-chip fully-32-bit microprocessor, with 32-bit data paths, 32-bit buses,
and 32-bit addresses, was the AT&T Bell Labs BELLMAC-32A, with first samples in 1980,
and general production in 1982.
After the divestiture of AT&T in 1984, it was renamed the WE 32000 (WE for Western
Electric), and had two follow-on generations, the WE 32100 and WE 32200. These
microprocessors were used in the AT&T 3B5 and 3B15 minicomputers; in the 3B2, the
world's first desktop supermicrocomputer; in the "Companion", the world's first 32-bit laptop
computer; and in "Alexander", the world's first book-sized supermicrocomputer, featuring
ROM-pack memory cartridges similar to today's gaming consoles. All these systems ran the
UNIX System V operating system.

Intel's first 32-bit microprocessor was the iAPX 432, which was introduced in 1981 but was
not a commercial success. It had an advanced capability-based object-oriented architecture,
but poor performance compared to other competing architectures such as the Motorola
68000.

Motorola's success with the 68000 led to the MC68010, which added virtual memory support.
The MC68020, introduced in 1985, added full 32-bit data and address buses. The 68020
became hugely popular in the Unix supermicrocomputer market, and many small companies
(e.g., Altos, Charles River Data Systems) produced desktop-size systems. With the follow-on
MC68030, which added the MMU into the chip, the 68K family became the processor for
everything that wasn't running DOS. The continued success led to the MC68040, which
included an FPU for better math performance. A 68050 failed to achieve its performance
goals and was not released, and the follow-up MC68060 was released into a market saturated
by much faster RISC designs. The 68K family faded from the desktop in the early 1990s.

Other large companies designed the 68020 and follow-ons into embedded equipment. At one
point, there were more 68020s in embedded equipment than there were Intel Pentiums in PCs.
The ColdFire processor cores are
derivatives of the venerable 68020.

During this time (early to mid 1980s), National Semiconductor introduced a very similar
16-bit pinout, 32-bit internal microprocessor called the NS 16032 (later renamed 32016), the full
32-bit version named the NS 32032, and a line of 32-bit industrial OEM microcomputers. By
the mid-1980s, Sequent introduced the first symmetric multiprocessor (SMP) server-class
computer using the NS 32032. This was one of the design's few wins, and it disappeared in
the late 1980s.

The MIPS R2000 (1984) and R3000 (1989) were highly successful 32-bit RISC
microprocessors. They were used in high-end workstations and servers by SGI, among
others.

Other designs included the interesting Zilog Z8000, which arrived too late to market to stand
a chance and disappeared quickly.

In the late 1980s, "microprocessor wars" started killing off some of the microprocessors.
Apparently, with only one major design win, Sequent, the NS 32032 just faded out of
existence, and Sequent switched to Intel microprocessors.

From 1985 to 2003, the 32-bit x86 architectures became increasingly dominant in desktop,
laptop, and server markets, and these microprocessors became faster and more capable. Intel
had licensed early versions of the architecture to other companies, but declined to license the
Pentium, so AMD and Cyrix built later versions of the architecture based on their own
designs. During this span, these processors increased in complexity (transistor count) and
capability (instructions/second) by at least a factor of 1000. Intel's Pentium line is probably
the most famous and recognizable 32-bit processor model, at least with the public at large.

6. MICROPROCESSOR BUSES

Data Bus: 8/16/32 bits wide; carries data and instructions.
Address Bus: 16/24/32 bits wide; carries addresses for data and instructions.
Control Bus: 3-16 bits wide; manages the flow of data.
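
The width of the address bus fixes how much memory the processor can address directly. As a small worked example, assuming byte-addressable memory (one byte per address) and a hypothetical helper named addressable_bytes:

```python
def addressable_bytes(address_bits):
    """Number of distinct addresses an address bus of this width can select."""
    return 2 ** address_bits


# Assumes one byte is stored at each address.
for bits in (16, 24, 32):
    size = addressable_bytes(bits)
    print(f"{bits}-bit address bus: {size} bytes ({size // 1024} KiB)")
```

A 16-bit address bus reaches 64 KiB, a 24-bit bus reaches 16 MiB, and a 32-bit bus reaches 4 GiB.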

Input/Output Instructions and Memory Mapping

On Motorola microcontrollers and microprocessors (such as the MC6802 and MC68HC11, which
we examine here), there are no separate I/O instructions; all I/O devices are memory mapped.
Access to such devices is done through the same instructions used for reading or writing RAM or ROM
memory. In Intel processors (such as the i8086 on the D6 boards used in other courses), there is a
separate I/O address space, accessed via separate I/O instructions (INB, OUTB, etc.). However,
the same address and data buses are used for the actual transactions.
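
Memory-mapped I/O can be pictured as an address decoder that routes an ordinary load or store either to RAM or to a device register. The Python sketch below is a toy model: the class name, the RAM size, and the port address 0x1004 are hypothetical values picked for illustration, not the memory map of the MC6802 or MC68HC11.

```python
RAM_SIZE = 0x1000   # hypothetical: RAM occupies addresses 0x0000-0x0FFF
PORT_ADDR = 0x1004  # hypothetical address of an 8-bit output port register


class MemoryMappedBus:
    """One address space shared by RAM and a device register."""

    def __init__(self):
        self.ram = bytearray(RAM_SIZE)
        self.port = 0  # stands in for the pins of an output port

    def write(self, addr, value):
        if addr < RAM_SIZE:
            self.ram[addr] = value & 0xFF
        elif addr == PORT_ADDR:
            self.port = value & 0xFF  # the same store operation drives the device
        else:
            raise ValueError(f"nothing mapped at {addr:#06x}")

    def read(self, addr):
        if addr < RAM_SIZE:
            return self.ram[addr]
        if addr == PORT_ADDR:
            return self.port
        raise ValueError(f"nothing mapped at {addr:#06x}")


bus = MemoryMappedBus()
bus.write(0x0010, 0xAB)     # ordinary store to RAM
bus.write(PORT_ADDR, 0x01)  # the same kind of store, but it reaches the device
print(hex(bus.read(0x0010)), bus.port)
```

The program uses the same write call in both cases; only the address decides whether memory or the device responds.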

DIRECT MEMORY ACCESS

Direct memory access (DMA) is a feature of modern computers that allows certain
hardware subsystems within the computer to access system memory for reading and/or
writing independently of the central processing unit. Many hardware systems use DMA
including disk drive controllers, graphics cards, network cards, and sound cards. Computers
that have DMA channels can transfer data to and from devices with much less CPU overhead
than computers without a DMA channel.

Without DMA, using programmed input/output (PIO) mode, the CPU typically has to be
occupied for the entire time it's performing a transfer. With DMA, the CPU would initiate the
transfer, do other operations while the transfer is in progress, and receive an interrupt from
the DMA controller once the operation has been done. This is especially useful in real-time
computing applications where not stalling behind concurrent operations is critical.

Principle

DMA is an essential feature of all modern computers, as it allows devices to transfer data
without subjecting the CPU to a heavy overhead. Otherwise, the CPU would have to copy
each piece of data from the source to the destination. This is typically slower than copying
normal blocks of memory since access to I/O devices over a peripheral bus is generally
slower than access to normal system RAM. During this time the CPU would be unavailable for any
other tasks involving CPU bus access, although it could continue doing any work which did
not require bus access.

A DMA transfer essentially copies a block of memory from one device to another. While the
CPU initiates the transfer, it does not execute it. For so-called "third party" DMA, as is
normally used with the ISA bus, the transfer is performed by a DMA controller which is
typically part of the motherboard chipset. More advanced bus designs such as PCI typically
use bus mastering DMA, where the device takes control of the bus and performs the transfer
itself.

A typical usage of DMA is copying a block of memory from system RAM to or from a buffer
on the device. Such an operation does not stall the processor, which as a result can be
scheduled to perform other tasks. DMA transfers are essential to high-performance embedded
systems. They are also essential in providing so-called zero-copy implementations of peripheral
device drivers as well as functionalities such as network packet routing, audio playback and
streaming video.
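
The division of labour in a DMA transfer (the CPU starts the transfer, the controller moves the data, and an interrupt signals completion) can be sketched as a small simulation. Everything here is a simplified assumption: the DmaController class, its one-byte-per-step timing, and the callback standing in for the interrupt do not describe any real controller.

```python
class DmaController:
    """Toy DMA controller that copies one byte per step()."""

    def __init__(self, memory):
        self.memory = memory
        self._job = None

    def start(self, src, dst, length, on_complete):
        """Called by the CPU to program the transfer; returns immediately."""
        if length <= 0:
            raise ValueError("length must be positive")
        self._job = (src, dst, length, on_complete)

    @property
    def busy(self):
        return self._job is not None

    def step(self):
        """Advance the transfer by one byte; signal completion at the end."""
        if self._job is None:
            return
        src, dst, remaining, on_complete = self._job
        self.memory[dst] = self.memory[src]
        if remaining == 1:
            self._job = None
            on_complete()  # stands in for the completion interrupt
        else:
            self._job = (src + 1, dst + 1, remaining - 1, on_complete)


memory = bytearray(range(8)) + bytearray(8)  # 8 source bytes, 8 empty bytes
interrupts = []
dma = DmaController(memory)
dma.start(src=0, dst=8, length=8, on_complete=lambda: interrupts.append("dma done"))

cpu_work = 0
while dma.busy:
    cpu_work += 1  # the CPU does unrelated work while the transfer runs
    dma.step()

print(bytes(memory[8:]), interrupts, cpu_work)
```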

Cache coherency problem

DMA can lead to cache coherency problems. Imagine a CPU equipped with a cache and an
external memory, which can be accessed directly by devices using DMA. When the CPU
accesses location X in the memory, the current value will be stored in the cache. Subsequent
operations on X will update the cached copy of X, but not the external memory version of X.
If the cache is not flushed to the memory before the next time a device tries to access X, the
device will receive a stale value of X.

Similarly, if the cached copy of X is not invalidated when a device writes a new value to the
memory, then the CPU will operate on a stale value of X.
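
The two stale-value cases above can be reproduced with a small toy model. This is only a sketch under simplifying assumptions: the class `ToySystem` and its method names are hypothetical, the cache is modelled per address rather than per cache line, and real processors use architecture-specific flush and invalidate operations.

```python
# Toy model of a write-back cache in front of memory that a DMA device
# can access directly. All names are illustrative, not a real API.

class ToySystem:
    def __init__(self):
        self.memory = {}    # external memory, visible to DMA devices
        self.cache = {}     # CPU-private cached copies
        self.dirty = set()  # addresses whose cached copy is newer than memory

    def cpu_read(self, addr):
        # A miss fills the cache from memory; a hit returns the cached copy.
        if addr not in self.cache:
            self.cache[addr] = self.memory.get(addr, 0)
        return self.cache[addr]

    def cpu_write(self, addr, value):
        # Write-back behaviour: only the cached copy is updated.
        self.cache[addr] = value
        self.dirty.add(addr)

    def flush(self, addr):
        # Copy a dirty cached value back to external memory.
        if addr in self.dirty:
            self.memory[addr] = self.cache[addr]
            self.dirty.discard(addr)

    def invalidate(self, addr):
        # Drop the cached copy so the next CPU read goes to memory.
        self.cache.pop(addr, None)
        self.dirty.discard(addr)

    def dma_read(self, addr):
        return self.memory.get(addr, 0)

    def dma_write(self, addr, value):
        self.memory[addr] = value


X = 0x1000
system = ToySystem()
system.memory[X] = 1

system.cpu_read(X)                     # X is now cached
system.cpu_write(X, 2)                 # only the cached copy changes
stale_for_device = system.dma_read(X)  # device still sees the old value
system.flush(X)
fresh_for_device = system.dma_read(X)  # device sees the update after a flush

system.dma_write(X, 3)                 # device writes new data to memory
stale_for_cpu = system.cpu_read(X)     # CPU still sees its cached copy
system.invalidate(X)
fresh_for_cpu = system.cpu_read(X)     # CPU sees the new data after invalidate
```

In this model, flushing before the device reads and invalidating after the device writes are the two steps that keep both sides consistent.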

DMA engines

In addition to hardware interaction, DMA can also be used to offload expensive memory
operations, such as large copies or scatter-gather operations, from the CPU to a dedicated
DMA engine. While normal memory copies are typically too small to be worthwhile to
offload on today's desktop computers, they are frequently offloaded on embedded devices
due to more limited resources.

22
Newer Intel Xeon processors also include a DMA engine technology called I/OAT, meant to
improve network performance on high-throughput network interfaces, in particular gigabit
Ethernet and faster.[2] However, various benchmarks with this approach by Intel's Linux
kernel developer Andrew Grover indicate no more than 10% improvement in CPU utilization
with receiving workloads, and no improvement when transmitting data.

Reconfigurable DMA circuits, for instance, based on GAG Generic Address Generators,
provide the enabling technology of Auto-sequencing memory, programmable by Flowware to
generate the data streams for running system architectures based on the anti machine
paradigm, which could be called a DMA engine.

DMA (Direct Memory Access) Controller

In some specialized situations, such as where a set of data must be transferred to a
communications IO device, a DMA controller may be present that can automatically detect
when the IO device is ready for more data, and transfer that data. This technique may be used
in conjunction with many of the other techniques, for instance an interrupt may be used when
the data transfer is complete.

Pros:

• This provides the best performance, since the I/O can happen in parallel with other
code execution

Cons:

• Only applicable to a limited range of problems

• Not all systems have DMA controllers. This is especially true of the more basic 8-bit
microcontrollers.

• Parallel nature may complicate a system

7. INTERRUPT

In computing, an interrupt is an asynchronous signal from hardware indicating the need for
attention or a synchronous event in software indicating the need for a change in execution. A
hardware interrupt causes the processor to save its state of execution via a context switch,
and begin execution of an interrupt handler. Software interrupts are usually implemented as
instructions in the instruction set, which cause a context switch to an interrupt handler similar
to a hardware interrupt. Interrupts are a commonly used technique for computer multitasking,
especially in real-time computing. Such a system is said to be interrupt-driven.

An act of interrupting is referred to as an interrupt request ("IRQ").

Overview

Hardware interrupts were introduced as a way to avoid wasting the processor's valuable time
in polling loops, waiting for external events.

Interrupts may be implemented in hardware as a distinct system with control lines, or they
may be integrated into the memory subsystem.

If implemented in hardware, an interrupt controller circuit such as the IBM PC's
Programmable Interrupt Controller (PIC) may be connected between the interrupting device
and the processor's interrupt pin to multiplex several sources of interrupt onto the one or
two CPU lines typically available.

If implemented as part of the memory controller, interrupts are mapped into the system's
memory address space.

Interrupts can be categorized into: maskable interrupt (IRQ), non-maskable interrupt (NMI),
interprocessor interrupt (IPI), software interrupt, and spurious interrupt.

• A maskable interrupt (IRQ) is a hardware interrupt that may be ignored by setting a
bit in an interrupt mask register's (IMR) bit-mask.

• A non-maskable interrupt (NMI) is a hardware interrupt that typically does
not have a bit-mask associated with it, so it cannot be ignored. NMIs are often
used for timers, especially watchdog timers.

• An interprocessor interrupt (IPI) is a special case of interrupt that is generated by one
processor to interrupt another processor in a multiprocessor system.

• A software interrupt is an interrupt generated within a processor by executing an
instruction. Software interrupts are often used to implement system calls because they
implement a subroutine call with a CPU ring level change.

• A spurious interrupt is a hardware interrupt that is unwanted. Spurious interrupts are
typically generated by system conditions such as electrical interference on an interrupt
line or through incorrectly designed hardware.

Processors typically have an internal interrupt mask which allows software to ignore all
external hardware interrupts while it is set. This mask may offer faster access than accessing
an interrupt mask register (IMR) in a PIC, or disabling interrupts in the device itself. In some
cases, such as the x86 architecture, disabling and enabling interrupts on the processor itself
acts as a memory barrier, in which case it may actually be slower.
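
The effect of an interrupt mask register and of the processor's internal mask can be sketched with plain bit operations. This is a minimal sketch with assumed conventions: eight interrupt lines, a 1 bit in the IMR meaning "masked", and a hypothetical function name `deliverable_interrupts`. Non-maskable interrupts are outside its scope.

```python
def deliverable_interrupts(pending, imr, cpu_interrupts_enabled=True):
    """Return a bitmask of pending maskable interrupts that may be delivered.

    pending: bit n is 1 when interrupt line n is requesting service.
    imr:     bit n is 1 when interrupt line n is masked (assumed convention).
    cpu_interrupts_enabled: models the processor's internal interrupt mask.
    """
    if not cpu_interrupts_enabled:
        return 0                   # the CPU ignores all maskable interrupts
    return pending & ~imr & 0xFF   # drop the masked lines, keep 8 bits


pending = 0b00001010   # lines 1 and 3 are requesting service
imr = 0b00000010       # line 1 is masked

ready = deliverable_interrupts(pending, imr)
blocked = deliverable_interrupts(pending, imr, cpu_interrupts_enabled=False)
```

Here only line 3 remains deliverable, and nothing is delivered while the processor-level mask is in effect.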

8. MICROPROCESSOR ARCHITECTURE

Control Unit
Generates signals within uP to carry out the instruction, which has been decoded. In reality
causes certain connections between blocks of the uP to be opened or closed, so that data goes
where it is required, and so that ALU operations occur.
Arithmetic Logic Unit
The ALU performs the actual numerical and logic operation such as ‘add’, ‘subtract’, ‘AND’,
‘OR’, etc. Uses data from memory and from Accumulator to perform arithmetic. Always
stores result of operation in Accumulator.

Registers
The 8085/8080A programming model includes six registers, one accumulator, and one flag
register, as shown in the figure. In addition, it has two 16-bit registers: the stack pointer and the
program counter. They are described briefly as follows.
The 8085/8080A has six general-purpose registers to store 8-bit data; these are identified as
B,C,D,E,H, and L as shown in the figure. They can be combined as register pairs - BC, DE,
and HL - to perform some 16-bit operations. The programmer can use these registers to store
or copy data into the registers by using data copy instructions.

Accumulator
The accumulator is an 8-bit register that is a part of arithmetic/logic unit (ALU). This register
is used to store 8-bit data and to perform arithmetic and logical operations. The result of an
operation is stored in the accumulator. The accumulator is also identified as register A.

Flags
The ALU includes five flip-flops, which are set or reset after an operation according to data
conditions of the result in the accumulator and other registers. They are called Zero(Z), Carry
(CY), Sign (S), Parity (P), and Auxiliary Carry (AC) flags; they are listed in the Table and
their bit positions in the flag register are shown in the Figure below. The most commonly
used flags are Zero, Carry, and Sign. The microprocessor uses these flags to test data
conditions.

For example, after an addition of two numbers, if the sum in the accumulator is larger than
eight bits, the flip-flop used to indicate a carry, called the Carry flag (CY), is set to one.
When an arithmetic operation results in zero, the flip-flop called the Zero (Z) flag is set to
one. The first Figure shows an 8-bit register, called the flag register, adjacent to the
accumulator. However, it is not used as a register; five bit positions out of eight are used to
store the outputs of the five flip-flops. The flags are stored in the 8-bit register so that the
programmer can examine these flags (data conditions) by accessing the register through an
instruction.

These flags have critical importance in the decision-making process of the microprocessor.
The conditions (set or reset) of the flags are tested through the software instructions. For
example, the instruction JC (Jump on Carry) is implemented to change the sequence of a
program when the CY flag is set. A thorough understanding of the flags is essential in writing
assembly language programs.
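
How the five flags follow from an 8-bit addition can be illustrated with a short simulation. This is a sketch, not a model of the actual flip-flop circuitry: the function name `add8_with_flags` is hypothetical, and it assumes the usual 8085 conventions that P is set for an even number of 1 bits and AC is set by a carry out of bit 3.

```python
def add8_with_flags(a, b):
    """Add two 8-bit values and derive the five flags from the result."""
    total = a + b
    result = total & 0xFF                                # value left in the accumulator
    flags = {
        "Z": int(result == 0),                           # result is zero
        "CY": int(total > 0xFF),                         # sum did not fit in 8 bits
        "S": int(bool(result & 0x80)),                   # bit 7 of the result
        "P": int(bin(result).count("1") % 2 == 0),       # even number of 1 bits
        "AC": int((a & 0x0F) + (b & 0x0F) > 0x0F),       # carry out of bit 3
    }
    return result, flags


result_1, flags_1 = add8_with_flags(0xF0, 0x20)   # sum is 0x110, larger than 8 bits
result_2, flags_2 = add8_with_flags(0x80, 0x80)   # sum is 0x100, accumulator holds 0
take_jump = bool(flags_1["CY"])                   # the condition a JC instruction tests
```

In the first case the accumulator holds 10H with CY set; in the second the accumulator holds 00H, so both Z and CY are set.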

Program Counter (PC)


This 16-bit register deals with sequencing the execution of instructions. This register is a
memory pointer. Memory locations have 16-bit addresses, and that is why this is a 16-bit
register.

The microprocessor uses this register to sequence the execution of the instructions. The
function of the program counter is to point to the memory address from which the next byte is
to be fetched. When a byte (machine code) is being fetched, the program counter is
incremented by one to point to the next memory location.

Stack Pointer (SP)


The stack pointer is also a 16-bit register used as a memory pointer. It points to a memory
location in R/W memory, called the stack. The beginning of the stack is defined by loading
a 16-bit address in the stack pointer. The stack concept is explained in the chapter "Stack and
Subroutines."

Instruction Register/Decoder
Temporary store for the current instruction of a program. Latest instruction sent here from
memory prior to execution. Decoder then takes instruction and ‘decodes’ or interprets the
instruction. Decoded instruction then passed to next stage.

Memory Address Register


Holds address, received from PC, of next program instruction. Feeds the address bus with
addresses of location of the program under execution.

Control Generator
Generates signals within uP to carry out the instruction which has been decoded. In reality
causes certain connections between blocks of the uP to be opened or closed, so that data goes
where it is required, and so that ALU operations occur.

Register Selector
This block controls the use of the register stack in the example. It is just a logic circuit which
switches between different registers in the set; it receives its instructions from the Control Unit.

General Purpose Registers


μP requires extra registers for versatility. Can be used to store additional data during a
program. More complex processors may have a variety of differently named registers.

Microprogramming
How does the μP know what an instruction means, especially when it is only a binary
number? The microprogram in a uP/uC is written by the chip designer and tells the uP/uC the
meaning of each instruction. The uP/uC can then carry out the operation.

2. 8085 System Bus

A typical system uses a number of busses, collections of wires which transmit binary numbers,
one bit per wire. A typical microprocessor communicates with memory and other devices
(input and output) using three busses: Address Bus, Data Bus and Control Bus.

Address Bus
One wire for each bit, therefore 16 bits = 16 wires. Binary number carried alerts memory to
‘open’ the designated box. Data (binary) can then be put in or taken out. The Address Bus
consists of 16 wires, therefore 16 bits. Its "width" is 16 bits. A 16 bit binary number allows
2^16 different numbers, or 65,536 different numbers, ie 0000000000000000 up to
1111111111111111. Because memory consists of boxes, each with a unique address, the size
of the address bus determines the size of memory, which can be used. To communicate with
memory the microprocessor sends an address on the address bus, eg 0000000000000011 (3 in
decimal), to the memory. The memory then selects box number 3 for reading or writing data.
Address bus is unidirectional, ie numbers only sent from microprocessor to memory, not
other way.
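
The relation between the number of address wires and the number of memory boxes is simple arithmetic, shown here as a short sketch. The function names are hypothetical and chosen only for illustration.

```python
def addressable_locations(address_lines):
    """Number of distinct addresses that fit on a bus with this many wires."""
    return 2 ** address_lines


def address_pattern(address, address_lines=16):
    """Bit pattern placed on the address bus, one character per wire."""
    return format(address, "0{}b".format(address_lines))


locations = addressable_locations(16)   # size of the 16-bit address space
highest = locations - 1                 # address with all 16 wires at 1
pattern_for_box_3 = address_pattern(3)  # pattern that selects box number 3
```

With 16 lines this gives 65,536 locations (64K), with addresses running from 0000H to FFFFH.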

Data Bus
Data Bus: carries ‘data’, in binary form, between μP and other external units, such as
memory. Typical size is 8 or 16 bits. Size determined by size of boxes in memory and μP;
size helps determine performance of μP. The Data Bus typically consists of 8 wires.
Therefore, 2^8 = 256 combinations of binary digits. Data bus used to transmit "data", ie information,
results of arithmetic, etc, between memory and the microprocessor. Bus is bi-directional. Size
of the data bus determines what arithmetic can be done. If only 8 bits wide then largest
number is 11111111 (255 in decimal). Therefore, larger numbers have to be broken down into
8-bit chunks, each holding a value from 0 to 255. This slows microprocessor. Data Bus also
carries instructions from memory to the microprocessor. Size of the bus therefore limits the
number of possible single-byte instruction codes to 256, each specified by a separate number.
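
Breaking a larger number into 8-bit chunks can be sketched as a 16-bit addition done one byte at a time, with the carry passed from the low byte to the high byte. This is an illustrative sketch with a hypothetical function name, not a description of any particular instruction sequence.

```python
def add16_bytewise(x, y):
    """Add two 16-bit values using only 8-bit additions plus a carry."""
    low_sum = (x & 0xFF) + (y & 0xFF)              # first 8-bit addition
    carry = low_sum >> 8                           # 1 if the low byte overflowed
    high_sum = ((x >> 8) & 0xFF) + ((y >> 8) & 0xFF) + carry
    result = ((high_sum & 0xFF) << 8) | (low_sum & 0xFF)
    carry_out = high_sum >> 8                      # overflow out of 16 bits
    return result, carry_out


sum_1, carry_1 = add16_bytewise(0x12FF, 0x0001)    # low byte overflows into high byte
sum_2, carry_2 = add16_bytewise(0xFFFF, 0x0001)    # overflows out of 16 bits
```

Two 8-bit additions are needed where a 16-bit data path would need one, which is the slowdown the paragraph above refers to.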

Control Bus
Control Bus are various lines which have specific functions for coordinating and controlling
uP operations. Eg: Read/NotWrite line, single binary digit. Control whether memory is being
‘written to’ (data stored in mem) or ‘read from’ (data taken out of mem) 1 = Read, 0 = Write.
May also include clock line(s) for timing/synchronising, ‘interrupts’, ‘reset’ etc. Typically μP
has 10 control lines. Cannot function correctly without these vital control signals. The
Control Bus carries control signals partly unidirectional, partly bi-directional. Control signals
are things like "read or write". This tells memory that we are either reading from a location,
specified on the address bus, or writing to a location specified. Various other signals to
control and coordinate the operation of the system.

Modern day microprocessors, like 80386, 80486 have much larger busses. Typically 16 or 32
bit busses, which allow larger number of instructions, more memory location, and faster
arithmetic. Microcontrollers organized along same lines, except: because microcontrollers
have memory etc inside the chip, the busses may all be internal. In the microprocessor the
three busses are external to the chip (except for the internal data bus). In case of external
busses, the chip connects to the busses via buffers, which are simply an electronic connection
between external bus and the internal data bus.

3. 8085 Pin description.


Properties
Single + 5V Supply
4 Vectored Interrupts (One is Non Maskable)
Serial In/Serial Out Port
Decimal, Binary, and Double Precision Arithmetic
Direct Addressing Capability to 64K bytes of memory
The Intel 8085A is a new generation, complete 8-bit parallel central processing unit
(CPU). The 8085A uses a multiplexed data bus. The address is split between the 8-bit
address bus and the 8-bit data bus. Figures are at the end of the document.

Pin Description
The following describes the function of each pin:
A8 - A15 (Output 3-state)
Address Bus; The most significant 8 bits of the memory address or the 8 bits of the I/O
address, 3-stated during Hold and Halt modes.

AD0 - AD7 (Input/Output 3-state)
Multiplexed Address/Data Bus; Lower 8 bits of the memory address (or I/O address)
appear on the bus during the first clock cycle of a machine state. It then becomes the
data bus during the second and third clock cycles. 3-stated during Hold and Halt
modes.

ALE (Output)
Address Latch Enable: It occurs during the first clock cycle of a machine state and
enables the address to get latched into the on chip latch of peripherals. The falling
edge of ALE is set to guarantee setup and hold times for the address information.
ALE can also be used to strobe the status information. ALE is never 3stated.

S0, S1 (Output)
Data Bus Status. Encoded status of the bus cycle:
S1  S0  Status
0   0   HALT
0   1   WRITE
1   0   READ
1   1   FETCH
S1 can be used as an advanced R/W status.
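
The status encoding above can be written as a small lookup table. This is a sketch for illustration only; the names `BUS_STATUS` and `decode_status` are hypothetical.

```python
# Bus-cycle status keyed by the (S1, S0) pin levels listed in the table above.
BUS_STATUS = {
    (0, 0): "HALT",
    (0, 1): "WRITE",
    (1, 0): "READ",
    (1, 1): "FETCH",
}


def decode_status(s1, s0):
    """Return the bus-cycle type for the given S1 and S0 levels."""
    return BUS_STATUS[(s1, s0)]


def is_read_type(s1):
    """S1 alone separates READ and FETCH (1) from WRITE and HALT (0)."""
    return s1 == 1


cycle = decode_status(1, 1)
```

Both cycles with S1 = 1 bring data into the CPU, which is why S1 can serve as an advanced R/W indication.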

RD (Output 3-state)
READ; indicates the selected memory or I/O device is to be read and that the Data
Bus is available for the data transfer.

WR (Output 3-state)
WRITE; indicates the data on the Data Bus is to be written into the selected memory
or I/O location. Data is set up at the trailing edge of WR. 3-stated during Hold and Halt
modes.

READY (Input)
If Ready is high during a read or write cycle, it indicates that the memory or
peripheral is ready to send or receive data. If Ready is low, the CPU will wait for Ready to go
high before completing the read or write cycle.

HOLD (Input)
HOLD; indicates that another Master is requesting the use of the Address and Data Buses.
The CPU, upon receiving the Hold request, will relinquish the use of the buses as soon as the
current machine cycle is completed. Internal processing can continue. The processor can
regain the buses only after the Hold is removed. When the Hold is acknowledged, the
Address, Data, RD, WR, and IO/M lines are 3-stated.

HLDA (Output)
HOLD ACKNOWLEDGE; indicates that the CPU has received the Hold request and that it
will relinquish the buses in the next clock cycle. HLDA goes low after the Hold request is
removed. The CPU takes the buses one half clock cycle after HLDA goes low.

INTR (Input)
INTERRUPT REQUEST; is used as a general purpose interrupt. It is sampled only during the
next to the last clock cycle of the instruction. If it is active, the Program Counter (PC) will be
inhibited from incrementing and an INTA will be issued. During this cycle a RESTART or
CALL instruction can be inserted to jump to the interrupt service routine. The INTR is
enabled and disabled by software. It is disabled by Reset and immediately after an interrupt is
accepted.

INTA (Output)
INTERRUPT ACKNOWLEDGE; is used instead of (and has the same timing as) RD during
the Instruction cycle after an INTR is accepted. It can be used to activate the 8259 Interrupt
chip or some other interrupt port.

RST 5.5, RST 6.5, RST 7.5 (Inputs)
RESTART INTERRUPTS; These three inputs have the same timing as INTR except they
cause an internal RESTART to be automatically inserted.
RST 7.5 - Highest Priority
RST 6.5
RST 5.5 - Lowest Priority
The priority of these interrupts is ordered as shown above. These interrupts have a higher
priority than the INTR.

TRAP (Input)
Trap interrupt is a nonmaskable restart interrupt. It is recognized at the same time as INTR. It
is unaffected by any mask or Interrupt Enable. It has the highest priority of any interrupt.

RESET IN (Input)
Reset sets the Program Counter to zero and resets the Interrupt Enable and HLDA flip-flops.
None of the other flags or registers (except the instruction register) are affected. The CPU is
held in the reset condition as long as Reset is applied.

RESET OUT (Output)
Indicates CPU is being reset. Can be used as a system RESET. The signal is synchronized to
the processor clock.

X1, X2 (Input)
Crystal or R/C network connections to set the internal clock generator. X1 can also be an
external clock input instead of a crystal. The input frequency is divided by 2 to give the
internal operating frequency.

CLK (Output)
Clock Output for use as a system clock when a crystal or R/ C network is used as an input to
the CPU. The period of CLK is twice the X1, X2 input period.
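
The divide-by-2 relation between the X1/X2 input and the internal clock can be checked with a few lines of arithmetic. This is a sketch: the 6 MHz crystal value is an assumed example, and the function names are hypothetical.

```python
from fractions import Fraction


def internal_frequency_hz(input_frequency_hz):
    """Internal operating frequency: the X1/X2 input frequency divided by 2."""
    return Fraction(input_frequency_hz, 2)


def clk_period_s(input_frequency_hz):
    """Period of the CLK output: twice the X1/X2 input period."""
    input_period = Fraction(1, input_frequency_hz)
    return 2 * input_period


crystal_hz = 6_000_000                       # assumed example crystal
core_hz = internal_frequency_hz(crystal_hz)  # internal operating frequency
clk_period = clk_period_s(crystal_hz)        # CLK period in seconds
```

Exact fractions are used so that the period and the frequency stay exact reciprocals of each other.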

IO/M (Output)
IO/M indicates whether the Read/Write is to memory or I/O. 3-stated during Hold and Halt
modes.

SID (Input)
Serial input data line. The data on this line is loaded into accumulator bit 7 whenever a RIM
instruction is executed.

SOD (Output)
Serial output data line. The output SOD is set or reset as specified by the SIM instruction.

Vcc
+5 volt supply.

Vss

Ground Reference.

8085 Functional Description


The 8085A is a complete 8-bit parallel central processor. It requires a single +5 volt supply.
Its basic clock speed is 3 MHz, thus improving on the present 8080's performance with higher
system speed. Also it is designed to fit into a minimum system of three ICs: the CPU, a
RAM/IO, and a ROM or PROM/IO chip.

The 8085A uses a multiplexed Data Bus. The address is split between the higher 8-bit Address
Bus and the lower 8-bit Address/Data Bus. During the first cycle the address is sent out. The
lower 8 bits are latched into the peripherals by the Address Latch Enable (ALE). During the
rest of the machine cycle the Data Bus is used for memory or I/O data.
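
The demultiplexing step can be pictured with a toy latch: it follows the Address/Data lines while ALE is high and holds the low address byte once ALE goes low. This is a behavioural sketch under simplifying assumptions; the class `AddressLatch`, its methods, and the example address 2050H are hypothetical.

```python
class AddressLatch:
    """Toy transparent latch for the low address byte on AD0 - AD7."""

    def __init__(self):
        self.low_byte = 0

    def clock(self, ale, ad_bus):
        # While ALE is high the latch follows the bus; when ALE is low
        # the last captured value is held.
        if ale:
            self.low_byte = ad_bus & 0xFF

    def full_address(self, high_byte):
        # Combine the held low byte with the high address byte.
        return ((high_byte & 0xFF) << 8) | self.low_byte


latch = AddressLatch()
high_byte = 0x20

latch.clock(ale=1, ad_bus=0x50)   # first clock cycle: low address byte on the bus
latch.clock(ale=0, ad_bus=0x3E)   # later cycles: the same lines now carry data
address_seen_by_memory = latch.full_address(high_byte)
```

The memory keeps seeing the complete 16-bit address even though the multiplexed lines have switched to carrying data.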

The 8085A provides RD, WR, and IO/Memory signals for bus control. An Interrupt
Acknowledge signal (INTA) is also provided. Hold, Ready, and all Interrupts are
synchronized. The 8085A also provides serial input data (SID) and serial output data (SOD)
lines for simple serial interface. In addition to these features, the 8085A has three maskable,
restart interrupts and one non-maskable trap interrupt.

Status Information
Status information is directly available from the 8085A. ALE serves as a status strobe. The
status is partially encoded, and provides the user with advanced timing of the type of bus
transfer being done. The IO/M cycle status signal is also provided directly. Decoded, S0 and S1
carry the following status information:

HALT, WRITE, READ, FETCH

S1 can be interpreted as R/W in all bus transfers. In the 8085A the 8 LSB of address are
multiplexed with the data instead of status. The ALE line is used as a strobe to enter the
lower half of the address into the memory or peripheral address latch. This also frees extra
pins for expanded interrupt capability.

Interrupt and Serial I/O

The 8085A has 5 interrupt inputs: INTR, RST 5.5, RST 6.5, RST 7.5, and TRAP. INTR is
identical in function to the 8080 INT. Each of the three RESTART inputs, 5.5, 6.5, and 7.5, has a
programmable mask. TRAP is also a RESTART interrupt except it is nonmaskable. The three
RESTART interrupts cause the internal execution of RST (saving the program counter in the
stack and branching to the RESTART address) if the interrupts are enabled and if the
interrupt mask is not set. The non-maskable TRAP causes the internal execution of a RST
independent of the state of the interrupt enable or masks.

The interrupts are arranged in a fixed priority that determines which interrupt is to be
recognized if more than one is pending, as follows: TRAP (highest priority), RST 7.5, RST 6.5,
RST 5.5, INTR (lowest priority). This priority scheme does not take into account the priority of
a routine that was started by a higher priority interrupt. RST 5.5 can interrupt a RST 7.5
routine if the interrupts were re-enabled before the end of the RST 7.5 routine. The TRAP
interrupt is useful for catastrophic errors such as power failure or bus error. The TRAP input
is recognized just as any other interrupt but has the highest priority. It is not affected by any
flag or mask. The TRAP input is both edge and level sensitive.
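
The fixed priority order, the interrupt enable, and the individual RST masks can be combined in a short selection routine. This is only a sketch of the rules stated above: the function `next_interrupt` and its arguments are hypothetical, and edge or level sensing of the inputs is not modelled.

```python
# Fixed priority order, highest first, as listed above.
PRIORITY = ["TRAP", "RST 7.5", "RST 6.5", "RST 5.5", "INTR"]


def next_interrupt(pending, interrupts_enabled, masked=()):
    """Pick the interrupt to recognize, or None if nothing can be taken.

    pending:            names of the inputs currently requesting service.
    interrupts_enabled: state of the interrupt enable.
    masked:             RST inputs whose individual mask is set.
    """
    for name in PRIORITY:
        if name not in pending:
            continue
        if name == "TRAP":
            return name        # not affected by the enable or by any mask
        if not interrupts_enabled:
            continue           # all other inputs need interrupts enabled
        if name in masked:
            continue           # an individually masked RST input is skipped
        return name
    return None


both_rst = next_interrupt({"RST 5.5", "RST 7.5"}, interrupts_enabled=True)
with_mask = next_interrupt({"RST 5.5", "RST 7.5"}, True, masked={"RST 7.5"})
trap_wins = next_interrupt({"TRAP", "INTR"}, interrupts_enabled=False)
nothing = next_interrupt({"INTR"}, interrupts_enabled=False)
```

Because the order is fixed, a pending RST 5.5 is only taken ahead of RST 7.5 when RST 7.5 is masked or not pending.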

Basic System Timing


The 8085A has a multiplexed Data Bus. ALE is used as a strobe to sample the lower 8 bits of
address on the Data Bus. Figure 2 shows an instruction fetch, memory read and I/O write
cycle (OUT). Note that during the I/O write and read cycle the I/O port address is copied
on both the upper and lower half of the address. As in the 8080, the READY line is used to
extend the read and write pulse lengths so that the 8085A can be used with slow memory.
Hold causes the CPU to relinquish the bus when it is through with it by floating the Address
and Data Buses.

System Interface
The 8085A family includes memory components that are directly compatible with the 8085A
CPU. For example, a system consisting of the three chips, 8085A, 8156, and 8355 will have
the following features:
· 2K Bytes ROM
· 256 Bytes RAM
· 1 Timer/Counter
· 4 8-bit I/O Ports
· 1 6-bit I/O Port
· 4 Interrupt Levels
· Serial In/Serial Out Ports
In addition to standard I/O, memory-mapped I/O offers an efficient I/O addressing
technique. With this technique, an area of memory address space is assigned for I/O addresses,
thereby using the memory address for I/O manipulation. The 8085A CPU can also interface
with standard memory that does not have the multiplexed address/data bus.

The 8085 Programming Model
In the previous tutorial we described the 8085 microprocessor registers in reference to the
internal data operations. The same information is repeated here briefly to provide the
continuity and the context to the instruction set and to enable the readers who prefer to focus
initially on the programming aspect of the microprocessor.
The 8085 programming model includes six registers, one accumulator, and one flag register,
as shown in Figure. In addition, it has two 16-bit registers: the stack pointer and the program
counter. They are described briefly as follows.

Registers
The 8085 has six general-purpose registers to store 8-bit data; these are identified as
B,C,D,E,H, and L as shown in the figure. They can be combined as register pairs - BC, DE,
and HL - to perform some 16-bit operations. The programmer can use these registers to store
or copy data into the registers by using data copy instructions.

Accumulator
The accumulator is an 8-bit register that is a part of arithmetic/logic unit (ALU). This register
is used to store 8-bit data and to perform arithmetic and logical operations. The result of an
operation is stored in the accumulator. The accumulator is also identified as register A.

Flags
The ALU includes five flip-flops, which are set or reset after an operation according to data
conditions of the result in the accumulator and other registers. They are called Zero(Z), Carry
(CY), Sign (S), Parity (P), and Auxiliary Carry (AC) flags; their bit positions in the flag
register are shown in the Figure below. The most commonly used flags are Zero, Carry, and
Sign. The microprocessor uses these flags to test data conditions.

Program Counter (PC)
This 16-bit register deals with sequencing the execution of instructions. This register is a
memory pointer. Memory locations have 16-bit addresses, and that is why this is a 16-bit
register.

The microprocessor uses this register to sequence the execution of the instructions. The
function of the program counter is to point to the memory address from which the next byte is
to be fetched. When a byte (machine code) is being fetched, the program counter is
incremented by one to point to the next memory location.

Stack Pointer (SP)


The stack pointer is also a 16-bit register used as a memory pointer. It points to a memory
location in R/W memory, called the stack. The beginning of the stack is defined by loading
a 16-bit address in the stack pointer.
This programming model will be used in subsequent tutorials to examine how these registers
are affected after the execution of an instruction.

The 8085 Addressing Modes


The instructions MOV B, A or MVI A, 82H are to copy data from a source into a destination.
In these instructions the source can be a register, an input port, or an 8-bit number (00H to
FFH). Similarly, a destination can be a register or an output port. The sources and destination
are operands. The various formats for specifying operands are called the ADDRESSING
MODES. For 8085, they are:
1. Immediate addressing.
2. Register addressing.
3. Direct addressing.
4. Indirect addressing.

Immediate addressing
Data is present in the instruction. Load the immediate data to the destination provided.
Example: MVI R,data

Register addressing
Data is provided through the registers.
Example: MOV Rd, Rs

Direct addressing
Used to accept data from outside devices to store in the accumulator or send the data stored
in the accumulator to the outside device. For example, accept the data from port 00H and
store it in the accumulator, or send the data from the accumulator to port 01H.
Example: IN 00H or OUT 01H

Indirect Addressing
This means that the Effective Address is calculated by the processor. The contents of the
address (and the one following) are used to form a second address. The second address is
where the data is stored. Note that this requires several memory accesses: two accesses to
retrieve the 16-bit address and a further access (or accesses) to retrieve the data which is
to be loaded into the register.
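
The two-step lookup described above can be traced with a dictionary standing in for memory. This is a generic sketch of the description, not a model of a specific 8085 instruction: the function name, the example addresses, and the low-byte-first storage order of the 16-bit address are all assumptions.

```python
def load_indirect(memory, pointer_address):
    """Follow a 16-bit address stored in memory and return the data it points to.

    Assumes the low byte of the stored address comes first, followed by
    the high byte in the next location.
    """
    low = memory[pointer_address]            # access 1: low byte of the address
    high = memory[pointer_address + 1]       # access 2: high byte of the address
    effective_address = (high << 8) | low    # the second address
    return memory[effective_address]         # access 3: the data itself


memory = {
    0x2000: 0x50,   # low byte of the stored address
    0x2001: 0x30,   # high byte of the stored address
    0x3050: 0x7A,   # the data that ends up in the register
}

value = load_indirect(memory, 0x2000)
```

Three memory reads are needed for one byte of data, which is the cost noted in the paragraph above.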

9. INTERRUPT BASICS

An interrupt that leaves the machine in a well-defined state is called a precise interrupt.
Such an interrupt has four properties:

• The Program Counter (PC) is saved in a known place.

• All instructions before the one pointed to by the PC have fully executed.

• No instruction beyond the one pointed to by the PC has been executed. (This does not
prohibit instructions beyond the one pointed to by the PC from starting; it only requires that
any changes they make to registers or memory be undone before the interrupt happens.)

• The execution state of the instruction pointed to by the PC is known.

An interrupt that does not meet these requirements is called an imprecise interrupt.

The phenomenon where the overall system performance is severely hindered by excessive
amounts of processing time spent handling interrupts is called an interrupt storm.

TYPES OF INTERRUPTS

Level-triggered

A level-triggered interrupt is a class of interrupts where the presence of an unserviced
interrupt is indicated by a high level (1), or low level (0), of the interrupt request line. A
device wishing to signal an interrupt drives the line to its active level, and then holds it at that
level until serviced. It ceases asserting the line when the CPU commands it to or otherwise
handles the condition that caused it to signal the interrupt.

Typically, the processor samples the interrupt input at predefined times during each bus cycle
such as state T2 for the Z80 microprocessor. If the interrupt isn't active when the processor
samples it, the CPU doesn't see it. One possible use for this type of interrupt is to minimize
spurious signals from a noisy interrupt line: a spurious pulse will often be so short that it is
not noticed.

Multiple devices may share a level-triggered interrupt line if they are designed to. The
interrupt line must have a pull-down or pull-up resistor so that when not actively driven it
settles to its inactive state. Devices actively assert the line to indicate an outstanding interrupt,
but let the line float (do not actively drive it) when not signalling an interrupt. The line is then
in its asserted state when any (one or more than one) of the sharing devices is signalling an
outstanding interrupt.

This class of interrupts is favored by some because of a convenient behavior when the line is
shared. Upon detecting assertion of the interrupt line, the CPU must search through the
devices sharing it until one requiring service is detected. After servicing this device, the CPU
may recheck the interrupt line status to determine whether any other devices also need
service. If the line is now deasserted, the CPU avoids checking the remaining devices on the
line. Since some devices interrupt more frequently than others, and other device interrupts are
particularly expensive, a careful ordering of device checks is employed to increase efficiency.

There are also serious problems with sharing level-triggered interrupts. As long as any device
on the line has an outstanding request for service the line remains asserted, so it is not
possible to detect a change in the status of any other device. Deferring servicing a low-
priority device is not an option, because this would prevent detection of service requests from
higher-priority devices. If there is a device on the line that the CPU does not know how to
service, then any interrupt from that device permanently blocks all interrupts from the other
devices.
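
The search through devices on a shared level-triggered line, including the early exit once the line is no longer asserted, might look like the sketch below. The `Device` class, the device names, and `service_shared_line` are hypothetical, and servicing a device is reduced to clearing its request.

```python
class Device:
    """Toy device that holds the shared line asserted while it needs service."""

    def __init__(self, name):
        self.name = name
        self.requesting = False


def line_asserted(devices):
    # The shared line is asserted while any device is requesting service.
    return any(device.requesting for device in devices)


def service_shared_line(devices):
    """Check devices in order, stopping as soon as the line is no longer asserted."""
    serviced = []
    checked = 0
    for device in devices:
        if not line_asserted(devices):
            break                          # nothing left to service, skip the rest
        checked += 1
        if device.requesting:
            device.requesting = False      # servicing removes the request
            serviced.append(device.name)
    return serviced, checked


timer, uart, disk = Device("timer"), Device("uart"), Device("disk")
uart.requesting = True

serviced, checked = service_shared_line([timer, uart, disk])
```

Only two of the three devices are checked here, because the line is no longer asserted once the second device has been serviced.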

The original PCI standard mandated shareable level-triggered interrupts. The rationale for
this was the efficiency gain discussed above. (Newer versions of PCI allow, and PCI Express
requires, the use of message-signalled interrupts.)

Edge-triggered

An edge-triggered interrupt is a class of interrupts that are signalled by a level transition on
the interrupt line, either a falling edge (1 to 0) or a rising edge (0 to 1). A device wishing to
signal an interrupt drives a pulse onto the line and then releases the line to its quiescent state.
If the pulse is too short to be detected by polled I/O then special hardware may be required to
detect the edge.

Multiple devices may share an edge-triggered interrupt line if they are designed to. The
interrupt line must have a pull-down or pull-up resistor so that when not actively driven it
settles to one particular state. Devices signal an interrupt by briefly driving the line to its non-
default state, and let the line float (do not actively drive it) when not signalling an interrupt.
This type of connection is also referred to as open collector. The line then carries all the
pulses generated by all the devices. However, interrupt pulses from different devices may
merge if they occur close in time. To avoid losing interrupts the CPU must trigger on the
trailing edge of the pulse (e.g., the rising edge if the line is pulled up and driven low). After
detecting an interrupt the CPU must check all the devices for service requirements.

Edge-triggered interrupts do not suffer the problems that level-triggered interrupts have with
sharing. Service of a low-priority device can be postponed arbitrarily, and interrupts will
continue to be received from the high-priority devices that are being serviced. If there is a
device that the CPU does not know how to service, it may cause a spurious interrupt, or even
periodic spurious interrupts, but it does not interfere with the interrupt signalling of the other
devices. However, it is fairly easy for an edge-triggered interrupt to be missed (for example,
if interrupts have to be masked for a period), and unless there is some type of hardware latch
that records the event, the missed interrupt cannot be recovered. Such problems caused many "lockups" in
early computer hardware because the processor didn't know it was expected to do something.
More modern hardware often has one or more interrupt status registers that latch the interrupt
requests; well written edge-driven interrupt software often checks such registers to ensure
events are not missed.
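
The register check described above can be sketched in C. This is only an illustration under stated assumptions: the status register is simulated with an ordinary variable, and the names g_irq_status, g_serviced, latch_edge and edge_isr are hypothetical, not taken from any particular chip.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical latched interrupt status register, simulated in RAM.
   On real hardware this would be a memory-mapped register. */
static volatile uint8_t g_irq_status;
static int g_serviced[8];   /* how many times each source was serviced */

/* Simulated hardware side: an edge on source n latches bit n. */
void latch_edge(int source)
{
    assert(source >= 0 && source < 8);
    g_irq_status |= (uint8_t)(1u << source);
}

/* ISR: keep servicing until no latched request remains, so that an edge
   that arrived while another source was being serviced is not lost. */
void edge_isr(void)
{
    uint8_t pending;

    while ((pending = g_irq_status) != 0) {
        for (int n = 0; n < 8; n++) {
            if (pending & (1u << n)) {
                g_irq_status &= (uint8_t)~(1u << n);  /* acknowledge: clear the latch */
                g_serviced[n]++;                      /* stands in for servicing the device */
            }
        }
    }
}
```

The ISR re-reads the register after each pass instead of trusting a single snapshot. Real status registers are often cleared by writing a 1 to the bit rather than by the read-modify-write used in this simulation.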

The elderly ISA bus uses edge-triggered interrupts, but does not mandate that devices be able
to share them. The parallel port also uses edge-triggered interrupts. Many older devices
assume that they have exclusive use of their interrupt line, making it electrically unsafe to
share them. However, ISA motherboards include pull-up resistors on the IRQ lines, so well-
behaved devices share ISA interrupts just fine.

Hybrid

Some systems use a hybrid of level-triggered and edge-triggered signalling. The hardware not
only looks for an edge, but it also verifies that the interrupt signal stays active for a certain
period of time.

A common use of a hybrid interrupt is for the NMI (non-maskable interrupt) input. Because
NMIs generally signal major – or even catastrophic – system events, a good implementation
of this signal tries to ensure that the interrupt is valid by verifying that it remains active for a
period of time. This two-step approach helps to prevent false interrupts from affecting the
system.
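
The two-step check can be sketched in C as a filter over sampled values of the interrupt line. This is a sketch only: the function name hybrid_interrupt_valid and the idea of working on an array of samples are assumptions made for illustration, since real hardware does this with a timer or a filter circuit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the sampled line shows an edge (inactive to active)
   followed by at least min_active consecutive active samples. */
bool hybrid_interrupt_valid(const bool *samples, size_t count, size_t min_active)
{
    assert(samples != NULL && min_active > 0);

    for (size_t i = 1; i < count; i++) {
        if (!samples[i - 1] && samples[i]) {       /* step 1: edge detected */
            size_t run = 0;
            while (i + run < count && samples[i + run])
                run++;                             /* step 2: measure the active time */
            if (run >= min_active)
                return true;                       /* stayed active long enough */
            i += run;                              /* too short: treat as a glitch */
        }
    }
    return false;
}
```

A one-sample pulse is rejected as a glitch, while a pulse that stays active for min_active samples is accepted as a valid interrupt.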

Message-signalled

A message-signalled interrupt does not use a physical interrupt line. Instead, a device signals
its request for service by sending a short message over some communications medium,
typically a computer bus. The message might be of a type reserved for interrupts, or it might
be of some pre-existing type such as a memory write.

Message-signalled interrupts behave very much like edge-triggered interrupts, in that the
interrupt is a momentary signal rather than a continuous condition. Interrupt-handling
software treats the two in much the same manner. Typically, multiple pending message-
signalled interrupts with the same message (the same virtual interrupt line) are allowed to
merge, just as closely-spaced edge-triggered interrupts can merge.

Message-signalled interrupt vectors can be shared, to the extent that the underlying
communication medium can be shared. No additional effort is required.

Because the identity of the interrupt is indicated by a pattern of data bits, not requiring a
separate physical conductor, many more distinct interrupts can be efficiently handled. This
reduces the need for sharing. Interrupt messages can also be passed over a serial bus, not
requiring any additional lines.

PCI Express, a serial computer bus, uses message-signalled interrupts exclusively.

Hardware

When a device asserts its interrupt request signal, it must be processed in an orderly fashion.
All CPUs, and many devices, have some mechanism for enabling/disabling interrupt
recognition and processing:

• At the device level, there is usually an interrupt control register with bits to enable or
disable the interrupts that device can generate.
• At the CPU level, a global mechanism functions to inhibit/enable (often called the
global interrupt enable) recognition of interrupts.
• Systems with multiple interrupt inputs provide the ability to mask (inhibit) interrupt
requests individually and/or on a priority basis. This capability may be built into the
CPU or provided by an external interrupt controller. Typically, there are one or more
interrupt mask registers, with individual bits allowing or inhibiting individual
interrupt sources.
• There is often also one non-maskable interrupt input to the CPU that is used to signal
important conditions such as pending power fail, reset button pressed, or watchdog
timer expiration.

Figure 1 shows an interrupt controller, two devices capable of producing interrupts, a
processor, and the interrupt-related paths among them. The interrupt controller
multiplexes multiple input requests into one output. It shows which inputs are active and
allows individual inputs to be masked. Alternatively, it prioritizes the inputs, shows the
highest active input, and provides a mask for inputs below a given level. The processor
status register has a global interrupt enable flag bit. In addition, a watchdog timer is
connected to the non-maskable interrupt input.

The interrupt software associated with a specific device is known as its interrupt service
routine (ISR), or handler.

Software

Some older CPUs routed all interrupts to a single ISR. Upon recognizing an interrupt, the
CPU saved some state information and started execution at a fixed location. The ISR at that
location had to poll the devices in priority order to determine which one required service.
However, the basic process of interrupt handling is the same as in the more complex case.

Most modern CPUs use the same general mechanism for processing exceptions, traps, and
interrupts: an interrupt vector table. Some CPU vector tables contain only the address of the
code to be executed. In most cases, a specific ISR is responsible for servicing each
interrupting device and acknowledging, clearing, and rearming its interrupt; in some cases,
servicing the device (for example, reading data from a serial port) automatically clears and
rearms the interrupt.
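
A vector table that contains only code addresses can be modelled in C as an array of function pointers. The sketch below is hypothetical: the table size, the vector numbers and the names install_isr and dispatch_interrupt are assumptions, and on a real CPU the hardware performs the lookup itself.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_VECTORS 4

typedef void (*isr_t)(void);

static isr_t g_vector_table[NUM_VECTORS];   /* NULL means no handler installed */
static int g_timer_ticks;
static int g_serial_chars;

void timer_isr(void)  { g_timer_ticks++;  }  /* stands in for servicing a timer */
void serial_isr(void) { g_serial_chars++; }  /* stands in for reading a serial port */

/* Store the address of the ISR in the table entry for this vector. */
void install_isr(int vector, isr_t handler)
{
    assert(vector >= 0 && vector < NUM_VECTORS);
    g_vector_table[vector] = handler;
}

/* What the CPU does on an interrupt: look up the vector and jump to the ISR.
   Returns 1 if a handler ran, 0 if the vector is empty or out of range. */
int dispatch_interrupt(int vector)
{
    if (vector < 0 || vector >= NUM_VECTORS || g_vector_table[vector] == NULL)
        return 0;
    g_vector_table[vector]();
    return 1;
}
```

Because each device has its own entry, the ISR does not need to poll the devices to find out which one requested service.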

Interrupts may occur at any time, but the CPU does not recognize and process them
immediately. First, the CPU will not recognize a new interrupt while interrupts are disabled.
Second, the CPU must, upon recognition, stop fetching new instructions and complete those
still in progress. Because the interrupt is totally unrelated to the running program it interrupts,
the CPU and ISR work together to save and restore the full state of the interrupted program
(stack, flags, registers, and so on). The running program is not affected by the interruption,
although it takes longer to execute. The hardware and software flow for a timer interrupt is
shown in Figure 2.

Many interrupt controllers provide a means of prioritizing interrupt sources, so that, in the
event of multiple interrupts occurring at (approximately) the same time, the more time-
critical ones are processed first. These same systems usually also provide for prioritized
interrupt handling, a means by which a higher-priority interrupt can interrupt the processing
of a lower-priority interrupt. This is called interrupt nesting. In general, the ISR should only
take care of the time-critical portion of the processing, then, depending on the complexity of
the system, it may set a flag for the main loop, or use an operating system call to awaken a
task to perform the non-time-critical portion.
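
The flag-for-the-main-loop approach can be sketched as follows. The names sample_isr and main_loop_step are hypothetical, the ISR takes its value as a parameter in place of a real hardware read, and the "processing" is a stand-in multiplication.

```c
#include <assert.h>
#include <stdbool.h>

static volatile bool g_data_ready;   /* set by the ISR, cleared by the main loop */
static volatile int  g_raw_sample;   /* latest value captured by the ISR */
static int g_processed_total;        /* result of the non-time-critical work */

/* Time-critical part only: capture the value and set the flag. */
void sample_isr(int value_from_hw)
{
    g_raw_sample = value_from_hw;
    g_data_ready = true;
}

/* One pass of the main loop. Returns true if deferred work was done. */
bool main_loop_step(void)
{
    if (!g_data_ready)
        return false;

    g_data_ready = false;
    g_processed_total += g_raw_sample * 2;   /* stands in for the slow processing */
    return true;
}
```

The ISR stays short, and the slow work runs in the main loop. Note that g_raw_sample is shared between the ISR and the task code, so this sketch is itself subject to the shared-data problem discussed later in these notes.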

Interrupt Basics

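[Figure: a serial port chip and a network interface chip are each connected to the interrupt
request pin of the CPU. The signal from the serial port tells the microprocessor that the serial
port chip needs service; the signal from the network interface tells the microprocessor that the
network chip needs service.]
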

• IRQ (interrupt request) – this pin lets the microprocessor know that some other chip in
the circuit wants help.
• Interrupt routine – a subroutine that does whatever needs to be done
when the interrupt signal occurs.
• An interrupt routine is sometimes called an interrupt handler or an interrupt service routine (ISR).

Task Code

MOVE R1, (iCentigrade)
MULTIPLY R1, 9
DIVIDE R1, 5
ADD R1, 32
MOVE (iFarnht), R1
JCOND ZERO, 109A1
JUMP 14403
MOVE R5, 23
PUSH R5
CALL Skiddo
POP R9
MOVE (Answer), R1
RETURN

Interrupt Routine

PUSH R1
PUSH R2
…
!! Read char from hw into R1
!! Store R1 value into memory
!! Reset serial port hw
!! Reset interrupt

Interrupt Basics – Saving and Restoring the Context

• You must write your interrupt service routines to push and pop all of the
registers they use, since you have no way of knowing which registers
will have important values in them when the interrupt occurs.
• Pushing all of the registers at the beginning of an interrupt routine is
known as saving the context.
• Popping them at the end is known as restoring the context.

Interrupt Basics – Disabling Interrupts


• Almost every system allows you to disable interrupts, usually in a
variety of ways.
• Most microprocessors have a nonmaskable interrupt.

10. The Shared-Data Problem


• One problem that arises as soon as you use interrupts is that your
interrupt routines need to communicate with the rest of your code.
• It is neither possible nor desirable for the microprocessor to do all its work in interrupt routines.
• For this reason, the interrupt routines and the task code must share one or
more variables that they can use to communicate with one
another.

• Figure 4.4
static int iTemperatures[2];

void interrupt vReadTemperatures(void)
{
    iTemperatures[0] = !! Read in value from hardware
    iTemperatures[1] = !! Read in value from hardware
}

void main(void)
{
    int iTemp0, iTemp1;
    while(TRUE)
    {
        iTemp0 = iTemperatures[0];
        iTemp1 = iTemperatures[1];
        if(iTemp0 != iTemp1)
            !! Set off howling alarm;
    }
}

• Figure 4.5
static int iTemperatures[2];

void interrupt vReadTemperatures(void)
{
    iTemperatures[0] = !! Read in value from hardware
    iTemperatures[1] = !! Read in value from hardware
}

void main(void)
{
    while(TRUE)
    {
        if(iTemperatures[0] != iTemperatures[1])
            !! Set off howling alarm;
    }
}

• Figure 4.6 Assembly Language Equivalent of Figure 4.5

    MOVE R1,(iTemperatures[0])
    MOVE R2,(iTemperatures[1])
    SUBTRACT R1,R2
    JCOND ZERO, TEMPERATURES_OK
    .
    .
    .
    ; Code goes here to set off the alarm
    .
    .
TEMPERATURES_OK:
    .
    .
    .

Characteristics of the Shared-Data Bug

• The problem with the code in Figure 4.4 and in Figure 4.5 is that the
iTemperatures array is shared between the interrupt routine and the
task code.
• If the interrupt just happens to occur while the main routine is using
iTemperatures, then the bug shows itself. Such bugs are difficult to
find, because they do not happen every time the code runs.
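
The bug can be made to appear on demand by simulating the interrupt at the worst possible moment. The sketch below is not the original figure: the interrupt routine is an ordinary function that takes the hardware reading as a parameter, and the helper fAlarmAfterOnePass is a hypothetical name introduced for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

static int iTemperatures[2];

/* Simulated interrupt routine: both temperatures are always set to the same value. */
static void vReadTemperatures(int iReading)
{
    iTemperatures[0] = iReading;
    iTemperatures[1] = iReading;
}

/* One pass of the task code of Figure 4.4. If fInterruptBetweenReads is true,
   the interrupt is forced to occur between the two reads of the array.
   Returns true if the alarm would be set off. */
bool fAlarmAfterOnePass(bool fInterruptBetweenReads)
{
    int iTemp0, iTemp1;

    vReadTemperatures(73);            /* both temperatures are 73 */
    iTemp0 = iTemperatures[0];        /* task code reads 73 */
    if (fInterruptBetweenReads)
        vReadTemperatures(74);        /* interrupt: both temperatures become 74 */
    iTemp1 = iTemperatures[1];        /* task code reads 74 */

    return iTemp0 != iTemp1;
}
```

The two temperatures were equal at every instant, yet the task code sees 73 and 74 and sets off the alarm. Without the forced interrupt the same code behaves correctly, which is why the bug is so hard to reproduce.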

Solving the Shared-Data Problem

• The first method of solving the shared-data problem is to disable
interrupts whenever your task code uses the shared data.

• Figure 4.7
static int iTemperatures[2];

void interrupt vReadTemperatures(void)
{
    iTemperatures[0] = !! Read in value from hardware
    iTemperatures[1] = !! Read in value from hardware
}

void main(void)
{
    int iTemp0, iTemp1;
    while(TRUE)
    {
        disable();   /* Disable interrupts while we use the array */
        iTemp0 = iTemperatures[0];
        iTemp1 = iTemperatures[1];
        enable();
        if(iTemp0 != iTemp1)
            !! Set off howling alarm;
    }
}

• Figure 4.8
    .
    .
    DI                           ; disable interrupts while we use the array
    MOVE R1, (iTemperatures[0])
    MOVE R2, (iTemperatures[1])
    EI                           ; enable interrupts again

    SUBTRACT R1,R2
    JCOND ZERO, TEMPERATURES_OK
    .
    .
    ; Code goes here to set off the alarm
    .
    .
TEMPERATURES_OK:
    .
    .

“Atomic” and “Critical Section”

• A part of a program is said to be atomic if it cannot be interrupted.
• A set of instructions that must be atomic for the system to work
properly is often called a critical section.

A Few More Examples

static int iSeconds, iMinutes, iHours;

void interrupt vUpdateTime(void)
{
    ++iSeconds;
    if(iSeconds >= 60)
    {
        iSeconds = 0;
        ++iMinutes;
        if(iMinutes >= 60)
        {
            iMinutes = 0;
            ++iHours;
            if(iHours >= 24)
                iHours = 0;
        }
    }
}

long lSecondsSinceMidnight(void)
{
    return ((((iHours*60) + iMinutes)*60) + iSeconds);
}

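• One way to fix this program is to disable interrupts while the shared variables are read. A first attempt:

long lSecondsSinceMidnight(void)
{
    disable();
    return ((((iHours*60) + iMinutes)*60) + iSeconds);
    enable();   /* WRONG: This never gets executed */
}

• A corrected version, which saves the result before re-enabling interrupts:

long lSecondsSinceMidnight(void)
{
    long lReturnVal;
    disable();
    lReturnVal = (((iHours*60) + iMinutes)*60) + iSeconds;
    enable();
    return lReturnVal;
}

• A version that re-enables interrupts only if they were enabled on entry, so that it can safely be called from code that has already disabled them:

long lSecondsSinceMidnight(void)
{
    long lReturnVal;
    BOOL fInterruptStateOld;
    fInterruptStateOld = disable();
    lReturnVal = (((iHours*60) + iMinutes)*60) + iSeconds;
    if(fInterruptStateOld)
        enable();
    return lReturnVal;
}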
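
The effect of saving and restoring the interrupt state can be checked with a small simulation. This is a sketch under assumptions: the interrupt enable flag is an ordinary variable g_fInterruptsEnabled, and disable() and enable() are simulated stand-ins for the compiler or library routines, with disable() assumed to return the previous state. The cast to long is added here so the arithmetic does not overflow on a machine with 16-bit int.

```c
#include <assert.h>
#include <stdbool.h>

static bool g_fInterruptsEnabled = true;   /* simulated CPU interrupt enable flag */
static int iSeconds, iMinutes, iHours;

/* Simulated disable(): turns interrupts off and returns the previous state. */
static bool disable(void)
{
    bool fOld = g_fInterruptsEnabled;
    g_fInterruptsEnabled = false;
    return fOld;
}

/* Simulated enable(): turns interrupts on. */
static void enable(void)
{
    g_fInterruptsEnabled = true;
}

long lSecondsSinceMidnight(void)
{
    long lReturnVal;
    bool fInterruptStateOld = disable();

    lReturnVal = (((long)iHours * 60) + iMinutes) * 60 + iSeconds;

    if (fInterruptStateOld)     /* restore, rather than unconditionally enable */
        enable();
    return lReturnVal;
}
```

If the caller had interrupts enabled, they are enabled again on return. If the caller was already inside its own critical section, interrupts stay disabled, so the function does not end that critical section by accident.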

Another Potential Solution

static long int lSecondsToday;

void interrupt vUpdateTime(void)
{
    .
    .
    .
    ++lSecondsToday;
    if(lSecondsToday == 60L*60L*24L)
        lSecondsToday = 0L;
    .
    .
    .
}

long lSecondsSinceMidnight(void)
{
    return(lSecondsToday);
}

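• If the microprocessor’s registers are too small to hold a long integer,
then the assembly language will be something like:
    MOVE R1, (lSecondsToday)       ; Get first byte or word
    MOVE R2, (lSecondsToday + 1)   ; Get second byte or word
    .
    .
    .
    RETURN
• The read is then not atomic: an interrupt between the two MOVE instructions can change lSecondsToday, so the function can return a value made of one old part and one new part.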
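
The effect of an interrupt between the two MOVE instructions can be simulated in C. In this sketch the 32-bit counter is kept as two 16-bit words, the low word is assumed to be read first, and the names store_counter and read_counter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A 32-bit counter stored as two 16-bit words, as on a 16-bit microprocessor. */
static uint16_t g_low_word, g_high_word;

static void store_counter(uint32_t value)
{
    g_low_word  = (uint16_t)(value & 0xFFFFu);
    g_high_word = (uint16_t)(value >> 16);
}

/* Reads the counter in two steps. If interrupt_between is nonzero, a simulated
   timer interrupt increments the counter between the two reads. */
uint32_t read_counter(uint32_t start, int interrupt_between)
{
    uint16_t low, high;

    store_counter(start);
    low = g_low_word;                 /* first MOVE */
    if (interrupt_between)
        store_counter(start + 1);     /* interrupt: counter is incremented */
    high = g_high_word;               /* second MOVE */

    return ((uint32_t)high << 16) | low;
}
```

Starting from 0x0000FFFF, the interrupted read returns 0x0001FFFF, which is neither the old value nor the new value 0x00010000. The error is large because the low word is old and the high word is new.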

The Volatile Keyword

static long int lSecondsToday;

void interrupt vUpdateTime(void)
{
    .
    ++lSecondsToday;
    if(lSecondsToday == 60L*60L*24L)
        lSecondsToday = 0L;
    .
    .
    .
}

long lSecondsSinceMidnight(void)
{
    long lReturn;
    lReturn = lSecondsToday;
    while(lReturn != lSecondsToday)
        lReturn = lSecondsToday;
    return (lReturn);
}

• Most compilers assume that a value stays in memory unless the
program changes it, and they use that assumption for optimization. In the code above, an optimizing compiler may therefore read lSecondsToday only once and remove the while loop.
• With the volatile keyword in the declaration (static volatile long int lSecondsToday;) the compiler knows that
the microprocessor must read the value of lSecondsToday from
memory every time it is referenced. The compiler is not allowed to
optimize reads or writes of a volatile variable out of existence.
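
A version of the example with the volatile qualifier in place might look like the sketch below. The nonstandard interrupt keyword is left out so that the code compiles with any C compiler; vUpdateTime is therefore an ordinary function here and would have to be installed as the timer ISR by other means.

```c
#include <assert.h>

/* volatile: every reference to lSecondsToday must be a real memory access. */
static volatile long lSecondsToday;

/* Would be the timer interrupt routine on a real system. */
void vUpdateTime(void)
{
    ++lSecondsToday;
    if (lSecondsToday == 60L * 60L * 24L)
        lSecondsToday = 0L;
}

long lSecondsSinceMidnight(void)
{
    long lReturn;

    /* Read until two successive reads agree, so that a read interrupted
       by vUpdateTime is detected and repeated. */
    lReturn = lSecondsToday;
    while (lReturn != lSecondsToday)
        lReturn = lSecondsToday;
    return lReturn;
}
```

Because of volatile, the compiler must perform both reads in the loop condition instead of assuming that the second read returns the same value as the first.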

11. INTERRUPT LATENCY

• Because interrupts are a tool for getting better response from our
systems, and because the speed with which an embedded system
can respond is always of interest, one obvious question is how quickly the system responds to each interrupt.
• The answer depends upon a number of factors:

• 1. The longest period of time during which that interrupt is (or all
interrupts are) disabled.
• 2. The period of time it takes to execute any interrupt routines for
interrupts that are of higher priority than the one in question.
• 3. How long it takes the microprocessor to stop what it is doing, do
the necessary bookkeeping, and start executing instructions within
the interrupt routine.
• 4. How long it takes the interrupt routine to save the context and
then do enough work that what it has accomplished counts as a
“response”.

Make Your Interrupt Routines Short

• The four factors mentioned above control interrupt latency and
response.
• You deal with factor 4 by writing efficient code.
• Factor 2 is one of the reasons that it is generally a good idea to
write short interrupt routines.
• Factor 3 is not under software control.

Disabling Interrupts

The practice of disabling interrupts contributes to interrupt latency (factor 1 above), so interrupts should be disabled for as short a time as possible.

INTERRUPT LATENCY

Interrupt latency is the time between the generation of an interrupt by a device and the
servicing of the device which generated the interrupt. For many operating systems, devices
are serviced as soon as the device's interrupt handler is executed. Interrupt latency may be
affected by interrupt controllers, interrupt masking, and the operating system's (OS) interrupt
handling methods.

There is usually a tradeoff between interrupt latency, throughput, and processor utilization.
Many of the techniques of CPU and OS design that improve interrupt latency will decrease
throughput and increase processor utilization. Techniques that increase throughput may
increase interrupt latency and increase processor utilization. Lastly, trying to reduce
processor utilization may increase interrupt latency and decrease throughput.

Minimum interrupt latency is largely determined by the interrupt controller circuit and its
configuration. They can also affect the jitter in the interrupt latency, which can drastically
affect the real-time schedulability of the system. The Intel APIC Architecture is well known
for producing a huge amount of interrupt latency jitter.

Maximum interrupt latency is largely determined by the methods an OS uses for interrupt
handling. For example, most processors allow programs to disable interrupts, putting off the
execution of interrupt handlers, in order to protect critical sections of code. During the
execution of such a critical section, all interrupt handlers that cannot execute safely within a
critical section are blocked (they save the minimum amount of information required to restart
the interrupt handler after all critical sections have exited). So the interrupt latency for a
blocked interrupt is extended to the end of the critical section, plus any interrupts with equal
and higher priority that arrived while the block was in place.

Many computer systems require low interrupt latencies, especially embedded systems that
need to control machinery in real-time. Sometimes these systems use a real-time operating
system (RTOS). An RTOS makes the promise that no more than an agreed-upon maximum
amount of time will pass between executions of subroutines. In order to do this, the RTOS
must also guarantee that interrupt latency will never exceed a predefined maximum.

There are many methods that hardware may use to increase the interrupt latency that can be
tolerated. These include buffers, and flow control. For example, most network cards
implement transmit and receive ring buffers, interrupt rate limiting, and hardware flow
control. Buffers allow data to be stored until it can be transferred, and flow control allows the
network card to pause communications without having to discard data if the buffer is full.
Modern hardware also implements interrupt rate limiting. This helps prevent interrupt storms
or live lock by having the hardware wait a programmable minimum amount of time between
each interrupt it generates. Interrupt rate limiting reduces the amount of time spent servicing
interrupts, allowing the processor to spend more time doing useful work. Exceeding this time
results in a soft (recoverable) or hard (non-recoverable) error.

• Interrupt latency is the amount of time it takes a system to respond to
an interrupt; all of the factors above are included.
Worst Case Interrupt Latency

[Figure: timeline of the worst case for the interprocessor interrupt. Labels: "Task code disables interrupts", "Interprocessor interrupt occurs", "Processor gets to interprocessor ISR", "ISR does critical work". Durations shown: 250 usec and 300 usec. Time to deadline: 625 usec.]

[Figure: the same timeline with a network interrupt added. Labels: "Task code disables interrupts", "Network interrupt occurs", "Interprocessor interrupt occurs", "Processor gets to network ISR", "Processor gets to interprocessor ISR", "ISR does critical work". Durations shown: 250 usec, 100 usec and 300 usec. Time to deadline: 625 usec.]

Alternatives to Disabling Interrupts


• Since disabling interrupts increases interrupt latency, you should
know a few alternative methods for dealing with shared data.
• Because in most cases simply disabling interrupts is more robust
than the techniques discussed below, you should use them only for
those dire situations in which you can’t afford the added latency.

Avoiding Disabling Interrupts

static int iTemperaturesA[2];
static int iTemperaturesB[2];
static BOOL fTaskCodeUsingTempsB = FALSE;

void interrupt vReadTemperatures(void)
{
    /* Write to whichever array the task code is NOT using */
    if(fTaskCodeUsingTempsB)
    {
        iTemperaturesA[0] = !! Read in value from hardware;
        iTemperaturesA[1] = !! Read in value from hardware;
    }
    else
    {
        iTemperaturesB[0] = !! Read in value from hardware;
        iTemperaturesB[1] = !! Read in value from hardware;
    }
}

void main(void)
{
    while(TRUE)
    {
        if(fTaskCodeUsingTempsB)
        {
            if(iTemperaturesB[0] != iTemperaturesB[1])
                !! Set off howling alarm;
        }
        else
        {
            if(iTemperaturesA[0] != iTemperaturesA[1])
                !! Set off howling alarm;
        }
        /* Switch arrays for the next pass */
        fTaskCodeUsingTempsB = !fTaskCodeUsingTempsB;
    }
}

A Circular Queue Without Disabling Interrupts

#define QUEUE_SIZE 100

int iTemperatureQueue[QUEUE_SIZE];
int iHead = 0;   /* Place to add the next item; changed only by the interrupt routine */
int iTail = 0;   /* Place to read the next item; changed only by the task code */

void interrupt vReadTemperatures()
{
    /* If the queue is not full... */
    if(!((iHead+2 == iTail) || (iHead == QUEUE_SIZE-2 && iTail == 0)))
    {
        iTemperatureQueue[iHead] = !! read one temperature;
        iTemperatureQueue[iHead+1] = !! read one temperature;
        iHead += 2;
        if(iHead == QUEUE_SIZE)
            iHead = 0;
    }
    else
        !! throw away next value
}

void main()
{
    int iTemperature1, iTemperature2;

    while(TRUE)
    {
        /* If there is any data... */
        if(iTail != iHead)
        {
            iTemperature1 = iTemperatureQueue[iTail];
            iTemperature2 = iTemperatureQueue[iTail+1];
            iTail += 2;
            if(iTail == QUEUE_SIZE)
                iTail = 0;
            !! Do something with iTemperature1 and iTemperature2;
        }
    }
}
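
A runnable variant of the queue is sketched below so that its full and empty conditions can be checked. It is not the original figure: the interrupt routine and the task code are replaced by two ordinary functions with the hypothetical names fQueuePut and fQueueGet, the temperatures are passed as parameters in place of hardware reads, and each index is updated with a single store (a design choice made here so that the other side never sees an index equal to QUEUE_SIZE).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_SIZE 100   /* must be even: items are stored in pairs */

static int iTemperatureQueue[QUEUE_SIZE];
static volatile int iHead = 0;   /* written only by the producer (interrupt side) */
static volatile int iTail = 0;   /* written only by the consumer (task side) */

/* Interrupt side: add a pair. Returns false (pair thrown away) if the queue is full. */
bool fQueuePut(int iTemp0, int iTemp1)
{
    int iNext;

    if ((iHead + 2 == iTail) || (iHead == QUEUE_SIZE - 2 && iTail == 0))
        return false;

    iTemperatureQueue[iHead] = iTemp0;
    iTemperatureQueue[iHead + 1] = iTemp1;
    iNext = iHead + 2;
    if (iNext == QUEUE_SIZE)
        iNext = 0;
    iHead = iNext;               /* publish the pair with one store */
    return true;
}

/* Task side: remove a pair. Returns false if the queue is empty. */
bool fQueueGet(int *piTemp0, int *piTemp1)
{
    int iNext;

    assert(piTemp0 != NULL && piTemp1 != NULL);
    if (iTail == iHead)
        return false;

    *piTemp0 = iTemperatureQueue[iTail];
    *piTemp1 = iTemperatureQueue[iTail + 1];
    iNext = iTail + 2;
    if (iNext == QUEUE_SIZE)
        iNext = 0;
    iTail = iNext;               /* release the slot with one store */
    return true;
}
```

One pair of slots is always left unused so that a full queue can be told apart from an empty one, so a queue of 100 entries holds at most 49 pairs. Whether a single store to an int is itself atomic depends on the microprocessor, which is one reason this technique is more fragile than disabling interrupts.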
