
PREFACE

This book is a useful guide for anyone who uses the Philips LPC2148 microcontroller in a new design. It can be
used both as a reference book and as a tutorial. It is assumed that the reader has some basic knowledge of
programming microcontrollers for embedded systems and is well versed in the C language. The important
technical information is covered in the first four chapters, which should be read in order if you are completely new
to the ARM7 CPU and the LPC2148.

Chapter 1 gives an introduction to the major features of the ARM7 CPU. It will give you enough
understanding to program any ARM7 device. In this first chapter we discuss how the RISC (Reduced
Instruction Set Computer) design philosophy was adapted by ARM to create a flexible embedded processor. We
then introduce an example embedded device and discuss the typical hardware and software technologies that
surround an ARM processor. The final section introduces the architecture implementations by subdividing them
into specific ARM processor core families.

Chapter 2 covers embedded systems with a focus on the processor itself. First, we provide an overview
of the processor core and describe how data moves between its different parts. We describe the
programmer's model, that is, the ARM processor from a software developer's point of view, together with the
data flow model of the ARM core. We also take a look at the core extensions that form an ARM processor;
core extensions speed up and organize main memory as well as extend the instruction set. We then cover the
register file of the ARM, the function of the LR, PC and SP registers, and the structure of the CPSR register,
including the function of the condition flag bits, interrupt mask bits and mode bits. The chapter also gives a brief
introduction to the different processor modes and banked registers, an explanation of the 3-stage pipeline of the
ARM7 with an example, and a brief introduction to exceptions, interrupts and the vector table, including the
actions taken on entering and leaving an exception.

Chapter 3 is a fundamental chapter that gives information about the ARM instruction set. Consequently, it is
placed here before we go into any depth on optimization and efficient algorithms. This chapter introduces the
most common and useful ARM instructions and builds on the ARM processor fundamentals covered in the
previous chapter. Since all our programming examples are written in C, there is no need to be an expert ARM7
assembly programmer. However, an understanding of the underlying machine code is very important in developing
efficient programs.

Chapter 4 gives an overview of the THUMB instruction set. It explains register usage and ARM-THUMB
interworking, and then examines the THUMB instructions themselves.

Chapter 5 gives the basic rules we need to follow while programming the ARM7 in C. We discuss data types,
symbols, declaring variables, defining labels, unary operators, binary operators, addition, subtraction, shift
operators, relational operators, Boolean operators and logical operators, which are widely used while programming.

Chapter 6 gives the general structure of an assembly language program and the meaning of the AREA and ENTRY
directives. It explains how subroutines can be used with different methods of passing parameters, with example
programs, and covers the ARM APCS specification, exception handling, ARM processor exceptions and modes, the
vector table, exception priorities and link register offsets. It then deals with interrupts: standard design practices
in assigning interrupts, interrupt latency, IRQ and FIQ exceptions, and enabling and disabling FIQ and IRQ
exceptions. Basic interrupt stack design and implementation are shown with examples of stack implementation in
User, Supervisor and IRQ modes. The chapter lists the different interrupt handling schemes and describes the
process of writing a non-nested interrupt handler using the inline assembler in C, inline assembly syntax and the
restrictions on inline assembly operations. It also covers the embedded assembler, embedded assembly syntax,
restrictions on embedded assembly operations, calling between C and assembly, and writing interrupt service
routines in C, with examples.

Chapter 7 gives the features of the LPC2148 microcontroller. It explains the block diagram of the LPC2148
and the function of its pins. We then study the features of the on-chip program memory and on-chip static RAM,
the memory map, and the functional features of the interrupt controller, pin connect block, DAC, ADC, USB
controller, UART, I2C, SPI and SSP controllers, general purpose timers, watchdog timer, RTC and pulse width
modulator (PWM). We also study the features of system control units such as the PLL, brown-out detector, reset
and wake-up timer, code security, external interrupt inputs, memory mapping control, power control and the APB.

Chapter 8 gives a fair idea of the peripherals of the LPC2148 controller. Here we study the purpose of the Pin
Connect Block, and then the following peripherals. GPIO: features, applications, pin description, register
description, examples. PLL: introduction, register description, PLL frequency calculation, procedure for
determining PLL settings. SPI: block diagram of the SPI solution. Timers: architecture of the timer module,
register description, and examples. PWM: introduction, register description, rules for single/double edge
controlled PWM outputs. RTC: introduction, architecture, register description, RTC interrupts, usage, prescaler,
and examples of prescaler usage. ADC: pin description, register description, operation. DAC: pin description,
register description, operation.

Chapter 9 explains the interfacing of some real-world devices such as an LCD, seven-segment display, stepper
motor, DC motor, relay and hex keypad with the LPC2148. We explain the basic working principle and
construction of these devices along with the interfacing programs. All the programs are written in C.

Chapter 1:
Introduction to ARM EMBEDDED SYSTEM

1.1 History of ARM processor design

After achieving some success with the BBC Micro computer, Acorn Computers Ltd considered how to
move on from the relatively simple MOS Technology 6502 processor to address business markets like the one
that would soon be dominated by the IBM PC, launched in 1981. The Acorn Business Computer (ABC) plan
required a number of second processors to be made to work with the BBC Micro platform, but processors such as
the Motorola 68000 and National Semiconductor 32016 were unsuitable, and the 6502 was not powerful enough
for a graphics based user interface.

Acorn would need a new architecture, having tested all of the available processors and found them
wanting. Acorn then seriously considered designing its own processor, and their engineers came across papers on
the Berkeley RISC project. They felt it showed that if a class of graduate students could create a competitive
32-bit processor, then Acorn would have no problem. A trip to the Western Design Center in Phoenix, where the
6502 was being updated by what was effectively a single-person company, showed Acorn engineers Steve Furber
and Sophie Wilson that they did not need massive resources and state-of-the-art R&D facilities.

Fig1.1 A Conexant ARM processor used mainly in routers

The official “Advanced RISC Machine (ARM)” project started in October 1983. VLSI Technology, Inc
was chosen as silicon partner, since it already supplied Acorn with ROMs and some custom chips. The design was
led by Wilson and Furber, with a key design goal of achieving low-latency input/output (interrupt) handling like
the 6502. The 6502's memory access architecture had allowed developers to produce fast machines without the
use of costly direct memory access hardware. VLSI produced the first ARM silicon on 26 April 1985; it worked
first time and came to be known as ARM1. The first "real" production systems, named ARM2, were
available in the following year.

Its first practical application was as a second processor to the BBC Micro, where it was used to develop
the simulation software to finish work on the support chips (VIDC, IOC, MEMC) and to speed up the operation of
the CAD software used in developing ARM2. Wilson subsequently coded BBC Basic in ARM assembly
language, and the in-depth knowledge obtained from designing the instruction set allowed the code to be very
dense, making ARM BBC Basic an extremely good test for any ARM emulator. The original aim of a principally
ARM-based computer was achieved in 1987 with the release of the Acorn Archimedes.

Such was the secrecy surrounding the ARM CPU project that when Olivetti were negotiating to take a
controlling share of Acorn in 1985, they were not told about the development team until after the negotiations had
been finalized. In 1992 Acorn once more won the Queen's Award for Technology for the ARM.

The ARM2 featured a 32-bit data bus, a 26-bit address space and sixteen 32-bit registers. Program code
had to lie within the first 64 Mbyte of the memory, as the program counter was limited to 26 bits because the top
6 and bottom 2 bits of the 32-bit register served as status flags. The ARM2 was possibly the simplest useful 32-bit
microprocessor in the world, with only 30,000 transistors (compare the transistor count with Motorola's six-year
older 68000 model which was aptly named, since it contained 68,000 transistors). Much of this simplicity comes
from not having microcode (which represents about one-quarter to one-third of the 68000) and, like most CPUs of
the day, not including any cache. This simplicity led to its low power usage, while performing better than the Intel
80286. A successor, ARM3, was produced with a 4 KB cache, which further improved performance.

Apple, DEC, Intel: ARM6, StrongARM, XScale

In the late 1980s Apple Computer and VLSI Technology started working with Acorn on newer versions
of the ARM core. The work was so important that Acorn spun off the design team in 1990 into a new company
called Advanced RISC Machines Ltd. Advanced RISC Machines became ARM Ltd when its parent company,
ARM Holdings plc, floated on the London Stock Exchange and NASDAQ in 1998.

The new Apple-ARM work would eventually turn into the ARM6, first released in early 1992. Apple
used the ARM6-based ARM 610 as the basis for their Apple Newton PDA. In 1994, Acorn used the ARM 610 as
the main CPU in their RISC PC computers. DEC licensed the ARM6 architecture and produced the StrongARM.
At 233 MHz this CPU drew only 1 Watt of power (more recent versions draw far less). This work was later passed
to Intel as a part of a lawsuit settlement, and Intel took the opportunity to supplement their aging i960 line with the
StrongARM. Intel later developed its own high-performance implementation known as XScale, which it has
since sold to Marvell.

The ARM architectures used in smartphones, personal digital assistants and other mobile devices range
from ARMv5, in obsolete/low-end devices, to the ARM M-series, in current high-end devices. XScale and
ARM926 processors are ARMv5TE, and are now more numerous in high-end devices than the StrongARM,
ARM9TDMI and ARM7TDMI based ARMv4 processors, but lower-end devices may use older cores with lower
licensing costs. ARMv6 processors represented a step up in performance from standard ARMv5 cores, and are
used in some cases, but Cortex processors (ARMv7) now provide faster and more power-efficient options than all
those previous generations. Cortex-A targets applications processors, as needed by smartphones that previously
used ARM9 or ARM11. Cortex-R targets real-time applications, and Cortex-M targets microcontrollers.

In 2009, some manufacturers introduced notebooks based on ARM architecture CPUs, in direct
competition with notebooks based on Intel Atom.

1.2 ARM cores

ARM provides a summary of the numerous vendors who implement ARM cores in their design. KEIL
also provides a somewhat newer summary of vendors of ARM based processors. ARM further provides a chart
displaying an overview of the ARM processor lineup with performance and functionality versus capabilities for
the more recent ARM7, ARM9, ARM11, Cortex-M, Cortex-R and Cortex-A device families.

ARM Family | ARM Architecture | ARM Core | Feature | Cache (I/D), MMU | Typical MIPS @ MHz
ARM1 | ARMv1 | ARM1 | First implementation | None | -
ARM2 | ARMv2 | ARM2 | ARMv2 added the MUL (multiply) instruction | None | 4 MIPS @ 8 MHz, 0.33 DMIPS/MHz
ARM2 | ARMv2a | ARM250 | Integrated MEMC (MMU), graphics and IO processor. ARMv2a added the SWP and SWPB (swap) instructions. | None, MEMC1a | 7 MIPS @ 12 MHz
ARM3 | ARMv2a | ARM3 | First integrated memory cache. | 4 KB unified | 12 MIPS @ 25 MHz, 0.50 DMIPS/MHz
ARM6 | ARMv3 | ARM60 | ARMv3 first to support 32-bit memory address space (previously 26-bit) | None | 10 MIPS @ 12 MHz
ARM6 | ARMv3 | ARM600 | As ARM60, cache and coprocessor bus (for FPA10 floating-point unit). | 4 KB unified | 28 MIPS @ 33 MHz
ARM6 | ARMv3 | ARM610 | As ARM60, cache, no coprocessor bus. | 4 KB unified | 17 MIPS @ 20 MHz, 0.65 DMIPS/MHz
ARM7 | ARMv3 | ARM700 | - | 8 KB unified | 40 MHz
ARM7 | ARMv3 | ARM710 | As ARM700, no coprocessor bus. | 8 KB unified | 40 MHz
ARM7 | ARMv3 | ARM710a | As ARM710 | 8 KB unified | 40 MHz, 0.68 DMIPS/MHz
ARM7TDMI | ARMv4T | ARM7TDMI(-S) | 3-stage pipeline, Thumb | None | 15 MIPS @ 16.8 MHz, 63 DMIPS @ 70 MHz
ARM7TDMI | ARMv4T | ARM710T | As ARM7TDMI, cache | 8 KB unified, MMU | 36 MIPS @ 40 MHz
ARM7TDMI | ARMv4T | ARM720T | As ARM7TDMI, cache | 8 KB unified, MMU with Fast Context Switch Extension | 60 MIPS @ 59.8 MHz
ARM7TDMI | ARMv4T | ARM740T | As ARM7TDMI, cache | MPU | -
ARM7EJ | ARMv5TEJ | ARM7EJ-S | 5-stage pipeline, Thumb, Jazelle DBX, Enhanced DSP instructions | None | -
ARM8 | ARMv4 | ARM810 | 5-stage pipeline, static branch prediction, double-bandwidth memory | 8 KB unified, MMU | 84 MIPS @ 72 MHz, 1.16 DMIPS/MHz
StrongARM | ARMv4 | SA-1 | 5-stage pipeline | 16 KB/8–16 KB, MMU | 203–206 MHz, 1.0 DMIPS/MHz
ARM9TDMI | ARMv4T | ARM9TDMI | 5-stage pipeline, Thumb | None | -
ARM9TDMI | ARMv4T | ARM920T | As ARM9TDMI, cache | 16 KB/16 KB, MMU with FCSE (Fast Context Switch Extension) | 200 MIPS @ 180 MHz
ARM9TDMI | ARMv4T | ARM922T | As ARM9TDMI, caches | 8 KB/8 KB, MMU | -
ARM9TDMI | ARMv4T | ARM940T | As ARM9TDMI, caches | 4 KB/4 KB, MPU | -
ARM9E | ARMv5TE | ARM946E-S | Thumb, Enhanced DSP instructions, caches | variable, tightly coupled memories (TCMs), MPU | -
ARM9E | ARMv5TE | ARM966E-S | Thumb, Enhanced DSP instructions | no cache, TCMs | -
ARM9E | ARMv5TE | ARM968E-S | As ARM966E-S | no cache, TCMs | -
ARM9E | ARMv5TEJ | ARM926EJ-S | Thumb, Jazelle DBX, Enhanced DSP instructions | variable, TCMs, MMU | 220 MIPS @ 200 MHz
ARM9E | ARMv5TE | ARM996HS | Clockless processor, as ARM966E-S | no caches, TCMs, MPU | -
ARM10E | ARMv5TE | ARM1020E | 6-stage pipeline, Thumb, Enhanced DSP instructions, (VFP) | 32 KB/32 KB, MMU | -
ARM10E | ARMv5TE | ARM1022E | As ARM1020E | 16 KB/16 KB, MMU | -
ARM10E | ARMv5TEJ | ARM1026EJ-S | Thumb, Jazelle DBX, Enhanced DSP instructions, (VFP) | variable, MMU or MPU | -
XScale | ARMv5TE | XScale | 7-stage pipeline, Thumb, Enhanced DSP instructions | 32 KB/32 KB, MMU | 133–400 MHz
XScale | ARMv5TE | Bulverde | Wireless MMX, Wireless SpeedStep added | 32 KB/32 KB, MMU | 312–624 MHz
XScale | ARMv5TE | Monahans | Wireless MMX2 added | 32 KB/32 KB (L1), optional L2 cache up to 512 KB, MMU | up to 1.25 GHz
ARM11 | ARMv6 | ARM1136J(F)-S | 8-stage pipeline, SIMD, Thumb, Jazelle DBX, (VFP), Enhanced DSP instructions | variable, MMU | 740 @ 532–665 MHz (i.MX31 SoC), 400–528 MHz
ARM11 | ARMv6T2 | ARM1156T2(F)-S | 8-stage pipeline, SIMD, Thumb-2, (VFP), Enhanced DSP instructions | variable, MPU | -
ARM11 | ARMv6ZK | ARM1176JZ(F)-S | As ARM1136EJ(F)-S | variable, MMU + TrustZone | 965 DMIPS @ 772 MHz, up to 2 600 DMIPS with four processors
ARM11 | ARMv6K | ARM11 MPCore | As ARM1136EJ(F)-S, 1–4 core SMP | variable, MMU | -

1.3 Special features of ARM processor design.


The principal feature of the ARM7 microcontroller is that it is a register-based load-and-store architecture
with a number of operating modes. While the ARM7 is a 32-bit microcontroller, it is also capable of running a
16-bit instruction set, known as “THUMB”. This helps it achieve greater code density and enhanced power
saving. While all of the register-to-register data processing instructions are single-cycle, other instructions, such
as data transfer instructions, are multi-cycle. To increase the performance of these instructions, the ARM7 has a
three-stage pipeline. Due to the inherent simplicity of the design and low gate count, the ARM7 is the industry
leader in low-power processing on a watts per MIPS basis. Finally, to assist the developer, the ARM core has a
built-in JTAG debug port and on-chip “embedded ICE” that allows programs to be downloaded and fully debugged
in-system.

ARM processors are typical of RISC (Reduced Instruction Set Computer) processors in that they
implement a load and store architecture. Only load and store instructions can access memory. Data processing
instructions operate on register contents only. The RISC philosophy is implemented with four major design rules:
Instructions—
RISC processors have a reduced number of instruction classes. These instructions provide simple
operations that can each execute in a single cycle. The compiler or programmer performs complicated operations
(for example, a divide operation) by combining several simple instructions. Each instruction is a fixed length to
allow the pipeline to fetch future instructions before decoding the current instruction. In contrast, in CISC
processors the instructions are often of variable size and take many cycles to execute.
Pipelines—
The processing of instructions is broken down into smaller units that can be executed in parallel by
pipelines. The ARM7 instruction pipeline has three stages: FETCH, DECODE and EXECUTE. The
hardware of each stage is designed independently so that up to three instructions can be processed simultaneously.
This increases the execution speed of sequential code. However, a branch instruction causes the pipeline to
be flushed, which reduces performance.

Fig 1.2 ARM pipeline structure
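
The benefit of the three-stage pipeline, and the cost of a flush, can be estimated with a small C sketch. This is only an idealized model under stated assumptions: every instruction spends one cycle in each stage, and each taken branch discards the two instructions already fetched behind it. The function names are illustrative and are not part of any ARM tool or library.

```c
#include <assert.h>

#define PIPELINE_STAGES 3u  /* FETCH, DECODE, EXECUTE */

/* Cycles needed without a pipeline: each instruction passes through
   all three stages before the next one is started. */
unsigned unpipelined_cycles(unsigned instructions)
{
    return instructions * PIPELINE_STAGES;
}

/* Cycles needed with an ideal three-stage pipeline. The first
   instruction takes three cycles to fill the pipeline; after that one
   instruction completes per cycle. Assumption: every taken branch
   flushes the pipeline and costs (stages - 1) refill cycles. */
unsigned pipelined_cycles(unsigned instructions, unsigned taken_branches)
{
    assert(taken_branches <= instructions);
    if (instructions == 0u) {
        return 0u;
    }
    return PIPELINE_STAGES + (instructions - 1u)
         + taken_branches * (PIPELINE_STAGES - 1u);
}
```

For ten single-cycle instructions this model gives 30 cycles without the pipeline and 12 cycles with it; two taken branches raise the pipelined figure to 16 cycles.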


Registers—
RISC machines have a large general-purpose register set. Any register can contain either data or an
address. Registers act as the fast local memory store for all data processing operations. In contrast, CISC
processors have dedicated registers for specific purposes.
Load-store architecture—
The processor operates on data held in registers. Separate load and store instructions transfer data
between the register bank and external memory. Memory accesses are costly, so separating memory accesses
from data processing provides an advantage because you can use data items held in the register bank multiple
times without needing multiple memory accesses. In contrast, with a CISC design the data processing operations
can act on memory directly.
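
The load-store rule can be made visible even in C. The following is a minimal sketch, not compiler output: the local variables stand in for registers, and the comments name the kind of ARM instruction a compiler would typically use for each step. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adds 'delta' to a word held in memory, written as the three separate
   steps a load-store machine has to perform. */
void add_to_memory(uint32_t *addr, uint32_t delta)
{
    assert(addr != NULL);
    uint32_t reg = *addr;   /* load: memory -> register (LDR)         */
    reg = reg + delta;      /* data processing on registers only (ADD) */
    *addr = reg;            /* store: register -> memory (STR)         */
}

/* Sums an array. The running total lives in a local variable (a
   register), so each element costs one load and the total is never
   written back to memory inside the loop. */
uint32_t sum_words(const uint32_t *data, size_t count)
{
    uint32_t sum = 0u;
    for (size_t i = 0; i < count; i++) {
        sum += data[i];
    }
    return sum;
}
```

The second function shows the advantage described above: a value kept in the register bank is reused on every iteration without further memory accesses.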

In order to keep the ARM7 both simple and cost-effective, the code and data regions are accessed via a
single bus. Thus, while the ARM7 is capable of single-cycle execution of all data processing instructions,
data transfer instructions may take several cycles since they require at least two accesses onto the bus (one for
the instruction and one for the data). In order to improve performance, a three-stage pipeline is used that allows
multiple instructions to be processed simultaneously. The following list gives some of the salient features of the
ARM7 processor:
• 32-bit RISC processor core (32-bit instructions)
• 37 32-bit integer registers (16 general-purpose registers visible at any one time)
• 3-stage pipeline (ARM7)
• Cache (depending on the implementation)
• Von Neumann-type bus structure (ARM7), Harvard (ARM9)
• 8/16/32-bit data types
• 7 modes of operation (User, FIQ, IRQ, Supervisor, Abort, System, Undefined)
• Simple structure, reasonably good speed/power consumption ratio
• Fully 32-bit instruction set in native operating mode
• 32-bit long instruction word
• All instructions are conditional
• Normal execution with the Always (AL) condition
• For a RISC processor, the instruction set is quite diverse, with different addressing modes
• 36 instruction formats
• In conditional operations one of the 14 available conditions is selected
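
Conditional execution means that every instruction is checked against the N, Z, C and V flags before it takes effect. The sketch below models that check in C for the fourteen flag-dependent conditions plus Always (AL). The type and function names are illustrative only; the enumerator order follows the condition field encoding, with EQ as 0 and AL as 14.

```c
#include <assert.h>
#include <stdbool.h>

/* The four condition flags held in the status register. */
typedef struct {
    bool n;  /* negative */
    bool z;  /* zero     */
    bool c;  /* carry    */
    bool v;  /* overflow */
} arm_flags;

typedef enum {
    COND_EQ, COND_NE, COND_CS, COND_CC, COND_MI, COND_PL, COND_VS,
    COND_VC, COND_HI, COND_LS, COND_GE, COND_LT, COND_GT, COND_LE,
    COND_AL
} arm_cond;

/* Returns true when an instruction carrying 'cond' would execute. */
bool cond_passes(arm_cond cond, arm_flags f)
{
    switch (cond) {
    case COND_EQ: return f.z;                   /* equal                  */
    case COND_NE: return !f.z;                  /* not equal              */
    case COND_CS: return f.c;                   /* carry set              */
    case COND_CC: return !f.c;                  /* carry clear            */
    case COND_MI: return f.n;                   /* negative               */
    case COND_PL: return !f.n;                  /* positive or zero       */
    case COND_VS: return f.v;                   /* overflow               */
    case COND_VC: return !f.v;                  /* no overflow            */
    case COND_HI: return f.c && !f.z;           /* unsigned higher        */
    case COND_LS: return !f.c || f.z;           /* unsigned lower or same */
    case COND_GE: return f.n == f.v;            /* signed >=              */
    case COND_LT: return f.n != f.v;            /* signed <               */
    case COND_GT: return !f.z && (f.n == f.v);  /* signed >               */
    case COND_LE: return f.z || (f.n != f.v);   /* signed <=              */
    case COND_AL: return true;                  /* always                 */
    }
    assert(!"invalid condition code");
    return false;
}
```

For example, after a comparison of two equal values (Z and C set, N and V clear) the EQ, LS, GE and LE conditions pass, while NE, HI and GT fail.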

1.4 Structure of embedded device based on ARM


Embedded systems can control many different devices, from small sensors found on a production line to
the real-time control systems used on spacecraft. All these devices use a combination of software and
hardware components. Each component is chosen for efficiency and, if applicable, is designed for future
extension and expansion.

Figure 1.3 An ARM-based embedded device, a microcontroller.
(Block diagram: ARM7TDMI core with cache; memory controller with Flash/ROM, program RAM, data EEPROM and
workspace SRAM; bus arbiter, AMBA bridge and external memory interface; peripherals including the interrupt
controller, USART, SPI, parallel I/O, timer/counter, real-time clock, watchdog timer, DSP core, ADC, DAC and CODEC.)

Figure 1.3 shows a typical embedded device based on an ARM core. Each box represents a feature or
function. The lines connecting the boxes are the buses carrying data. We can separate the device into four main
hardware components:
• The ARM processor controls the embedded device. Different versions of the ARM processor are
available to suit the desired operating characteristics. An ARM processor comprises a core (the execution engine
that processes instructions and manipulates data) plus the surrounding components that interface it with a bus.
These components can include memory management and caches.
• Controllers coordinate important functional blocks of the system. Two commonly found controllers are
interrupt and memory controllers.
• The peripherals provide all the input-output capability external to the chip and are responsible for the
uniqueness of the embedded device.
• A bus is used to communicate between different parts of the device.

1.5 ARM BUS Terminology.


Embedded systems use different bus technologies. The most common PC bus technology, the Peripheral
Component Interconnect (PCI) bus, connects devices such as video cards and hard disk controllers to the
x86 processor bus. This type of technology is external or off-chip (i.e., the bus is designed to connect
mechanically and electrically to devices external to the chip) and is built into the motherboard of a PC.
In contrast, embedded devices use an on-chip bus that is internal to the chip and that allows different
peripheral devices to be interconnected with an ARM core.
There are two different classes of devices attached to the bus. The ARM processor core is a bus master—a
logical device capable of initiating a data transfer with another device across the same bus. Peripherals tend to be
bus slaves—logical devices capable only of responding to a transfer request from a bus master device.
A bus has two architecture levels. The first is a physical level that covers the electrical characteristics and
bus width (16, 32, or 64 bits). The second level deals with protocol—the logical rules that govern the
communication between the processor and a peripheral.
ARM is primarily a design company. It seldom implements the electrical characteristics of the bus, but it routinely
specifies the bus protocol.

1.6 Overview of the AMBA Bus:


The “Advanced Microcontroller Bus Architecture (AMBA)” specification defines an on chip
communications standard for designing high-performance embedded microcontrollers.

Fig 1.4 A typical AMBA-based microcontroller
(Block diagram: a high-performance ARM processor, high-bandwidth on-chip RAM, a high-bandwidth external
memory interface and a DMA bus master on the AHB or ASB; a bridge to the APB, which carries the UART, timer,
keypad and PIO.)


An AMBA-based microcontroller typically consists of a high-performance system backbone bus
(AMBA AHB or AMBA ASB), able to sustain the external memory bandwidth, on which the CPU, on-chip
memory and other Direct Memory Access (DMA) devices reside. This bus provides a high-bandwidth interface
between the elements that are involved in the majority of transfers. Also located on the high-performance bus is
a bridge to the lower-bandwidth APB, where most of the peripheral devices in the system are located (see Fig
1.4).
Three distinct buses are defined within the AMBA specification:
• The Advanced High-performance Bus (AHB)
• The Advanced System Bus (ASB)
• The Advanced Peripheral Bus (APB).

1.6.1 Objectives of the AMBA specification.


The AMBA specification has been derived to satisfy four key requirements:
• To facilitate the right-first-time development of embedded microcontroller products with one or more
CPUs or signal processors.
• To be technology-independent and ensure that highly reusable peripheral and system macro cells can
be migrated across a diverse range of IC processes and be appropriate for full-custom, standard cell and
gate array technologies.
• To encourage modular system design to improve processor independence, providing a development
road-map for advanced cached CPU cores and the development of peripheral libraries.
• To minimize the silicon infrastructure required to support efficient on-chip and off-chip
communication for both operation and manufacturing test.

1.6.2 The Advanced High-performance Bus (AHB):

The AMBA AHB is for high-performance, high clock frequency system modules. The AHB acts as the
high-performance system backbone bus. AHB supports the efficient connection of processors, on-chip memories
and off-chip external memory interfaces with low-power peripheral macro cell functions. AHB is also specified to
ensure ease of use in an efficient design flow using synthesis and automated test techniques.

1.6.3 The Advanced System Bus (ASB):

The AMBA ASB is for high-performance system modules. AMBA ASB is an alternative system bus
suitable for use where the high-performance features of AHB are not required. ASB also supports the efficient
connection of processors, on-chip memories and off-chip external memory interfaces with low-power peripheral
macro cell functions.

1.6.4 The Advanced Peripheral Bus (APB):

The AMBA APB is for low-power peripherals. AMBA APB is optimized for minimal power
consumption and reduced interface complexity to support peripheral functions. APB can be used in conjunction
with either version of the system bus.

1.7 AMBA AHB PROTOCOL


AHB is a new generation of AMBA bus which is intended to address the requirements of high
performance synthesizable designs. It is a high-performance system bus that supports multiple bus masters and
provides high-bandwidth operation. AMBA AHB implements the features required for high-performance, high
clock frequency systems including:
• Burst transfers
• Split transactions
• Single-cycle bus master handover
• Single-clock edge operation
• Non-tristate implementation
• Wider data bus configurations (64/128 bits).

Bridging between this higher level of bus and the current ASB/APB can be done efficiently to ensure that
any existing designs can be easily integrated. An AMBA AHB design may contain one or more bus masters;
typically a system would contain at least the processor and test interface. However, it would also be common for
a Direct Memory Access (DMA) controller or Digital Signal Processor (DSP) to be included as bus masters. The
external memory interface, APB bridge and any internal memory are the most common AHB slaves. Any other
peripheral in the system could also be included as an AHB slave. However, low-bandwidth peripherals typically
reside on the APB.

Fig 1.5 A typical AHB-based microcontroller
(Block diagram: a high-performance ARM processor, high-bandwidth on-chip RAM, a high-bandwidth external
memory interface and a DMA bus master on the AHB; a bridge to the APB, which carries the UART, timer, keypad
and PIO.)

A typical AMBA AHB system design contains the following components:


AHB master: A bus master is able to initiate read and write operations by providing an address and control
information. Only one bus master is allowed to actively use the bus at any one time.
AHB slave: A bus slave responds to a read or write operation within a given address-space range. The bus slave
signals back to the active master the success, failure or waiting of the data transfer.
AHB arbiter: The bus arbiter ensures that only one bus master at a time is allowed to initiate data transfers. Even
though the arbitration protocol is fixed, any arbitration algorithm, such as highest priority or fair access, can be
implemented depending on the application requirements. An AHB would include only one arbiter, although this
would be trivial in single bus master systems.
AHB decoder: The AHB decoder is used to decode the address of each transfer and provide a select signal for the
slave that is involved in the transfer. A single centralized decoder is required in all AHB implementations.
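
The role of the centralized decoder can be sketched in C as a lookup from transfer address to slave select. This is a behavioural model only, not a description of any real device: the memory map, the slave names and the function names are hypothetical values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One address range owned by one slave. */
typedef struct {
    const char *name;
    uint32_t    base;
    uint32_t    size;
} slave_region;

/* Hypothetical memory map: three slaves on the system bus. */
static const slave_region memory_map[] = {
    { "on-chip RAM",               0x40000000u, 0x00008000u },
    { "external memory interface", 0x80000000u, 0x01000000u },
    { "APB bridge",                0xE0000000u, 0x00200000u },
};

#define NUM_SLAVES (sizeof memory_map / sizeof memory_map[0])

/* Returns the index of the slave selected for 'addr', or -1 when no
   slave owns the address. Unsigned subtraction wraps around, so one
   comparison covers both the lower and the upper bound. */
int decode_address(uint32_t addr)
{
    for (size_t i = 0; i < NUM_SLAVES; i++) {
        if (addr - memory_map[i].base < memory_map[i].size) {
            return (int)i;
        }
    }
    return -1;
}

/* Name of a selected slave, for diagnostics. */
const char *slave_name(int select)
{
    assert(select >= 0 && (size_t)select < NUM_SLAVES);
    return memory_map[select].name;
}
```

In hardware the same decision is made combinationally from the high-order address bits, and exactly one select line is asserted per transfer.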

1.8 AMBA ASB PROTOCOL


ASB is the first generation of AMBA system bus. ASB sits above the current APB and implements the
features required for high-performance systems including:
• Burst transfers
• Pipelined transfer operation
• Multiple bus master.

Fig 1.6 A typical ASB-based microcontroller
(Block diagram: a high-performance ARM processor, high-bandwidth on-chip RAM, a high-bandwidth external
memory interface and a DMA bus master on the ASB; a bridge to the APB, which carries the UART, timer, keypad
and PIO.)


A typical AMBA ASB system may contain one or more bus masters, as shown in Fig 1.6; for example, at
least the processor and test interface. However, it would also be common for a Direct Memory Access (DMA)
controller or Digital Signal Processor (DSP) to be included as bus masters. The external memory interface, APB
bridge and any internal memory are the most common ASB slaves. Any other peripheral in the system could also
be included as an ASB slave. However, low-bandwidth peripherals typically reside on the APB. An AMBA ASB
system design typically contains the following components:
ASB master: A bus master is able to initiate read and write operations by providing an address and control
information. Only one bus master is allowed to actively use the bus at any one time.
ASB slave: A bus slave responds to a read or write operation within a given address-space range. The bus slave
signals back to the active master the success, failure or waiting of the data transfer.
ASB decoder: The bus decoder performs the decoding of the transfer addresses and selects slaves appropriately.
The bus decoder also ensures that the bus remains operational when no bus transfers are required. A single
centralized decoder is required in all ASB implementations.
ASB arbiter: The bus arbiter ensures that only one bus master at a time is allowed to initiate data transfers. Even
though the arbitration protocol is fixed, any arbitration algorithm, such as highest priority or fair access, can be
implemented depending on the application requirements.
An ASB would include only one arbiter, although this would be trivial in single bus master systems.
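
The two arbitration policies mentioned above, highest priority and fair access, can be compared with a short C sketch. It is a simplified model under stated assumptions: up to eight masters, one request bit per master, and master 0 holding the highest fixed priority. The function names are illustrative and do not come from the AMBA specification.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_MASTERS 8u

/* Highest-priority arbitration: the lowest-numbered requesting master
   always wins. Bit m of 'requests' is set when master m wants the bus.
   Returns the granted master, or -1 when nobody is requesting. */
int arbitrate_fixed_priority(uint8_t requests)
{
    for (unsigned m = 0; m < MAX_MASTERS; m++) {
        if (requests & (1u << m)) {
            return (int)m;
        }
    }
    return -1;
}

/* Fair-access (round-robin) arbitration: the search starts just after
   the master that was granted last, so no requester is starved.
   Pass -1 as 'last_grant' when no grant has been made yet. */
int arbitrate_round_robin(uint8_t requests, int last_grant)
{
    assert(last_grant >= -1 && last_grant < (int)MAX_MASTERS);
    unsigned start = (unsigned)(last_grant + 1);
    for (unsigned i = 0; i < MAX_MASTERS; i++) {
        unsigned m = (start + i) % MAX_MASTERS;
        if (requests & (1u << m)) {
            return (int)m;
        }
    }
    return -1;
}
```

With masters 0 and 2 both requesting continuously, the fixed-priority arbiter grants master 0 every time, whereas the round-robin arbiter alternates between the two.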

1.9 AMBA APB PROTOCOL


The APB is part of the AMBA hierarchy of buses and is optimized for minimal power consumption and
reduced interface complexity. The AMBA APB appears as a local secondary bus that is encapsulated as a single
AHB or ASB slave device. APB provides a low-power extension to the system bus which builds on AHB or ASB
signals directly. The APB Bridge appears as a slave module which handles the bus handshake and control signal
retiming on behalf of the local peripheral bus. By defining the APB interface from the starting point of the system
bus, the benefits of the system diagnostics and test methodology can be exploited.

Fig 1.7 A typical ASB-based microcontroller
(Block diagram: a high-performance ARM processor, high-bandwidth on-chip RAM, a high-bandwidth external
memory interface and a DMA bus master on the AHB or ASB; a bridge to the APB, which carries the UART, timer,
keypad and PIO.)


The AMBA APB should be used to interface to any peripherals which are low bandwidth and do not
require the high performance of a pipelined bus interface. The latest revision of the APB is specified so that all
signal transitions are only related to the rising edge of the clock. This improvement ensures the APB peripherals
can be integrated easily into any design flow, with the following advantages:
• High-frequency operation easier to achieve
• Performance is independent of the mark-space ratio of the clock
• Static timing analysis is simplified by the use of a single clock edge
• No special considerations are required for automatic test insertion
• Many Application Specific Integrated Circuit (ASIC) libraries have a better selection of rising edge
registers
• Easy integration with cycle-based simulators.
These changes to the APB also make it simpler to interface it to the new AHB. An AMBA APB
implementation typically contains a single APB bridge which is required to convert AHB or ASB transfers into a
suitable format for the slave devices on the APB. The bridge provides latching of all address, data and control
signals, as well as providing a second level of decoding to generate slave select signals for the APB peripherals.
All other modules on the APB are APB slaves. The APB slaves have the following interface specification:
• Address and control valid throughout the access (un-pipelined)
• Zero-power interface during non-peripheral bus activity (peripheral bus is static when not in use)
• Timing can be provided by decode with strobe timing (un-clocked interface)
• Write data valid for the whole access (allowing glitch-free transparent latch implementations).

1.10 Peripherals
Peripheral devices are essential for embedded systems to interact with the external world. A peripheral device
performs input and output functions for the processor by connecting to other devices or sensors that are
external to the chip. Each peripheral device usually performs a single function and may reside on-chip.
Peripherals range from a simple serial communication device to a more complex 802.11 wireless device.
All ARM peripherals are memory mapped—the programming interface is a set of memory-addressed
registers. The address of these registers is an offset from a specific peripheral base address.
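As a minimal sketch of memory-mapped access in C, the helpers below read and write a register at a byte offset from a peripheral base address. The offsets REG_SET and REG_DIR and all function names are hypothetical, chosen only for illustration; real offsets come from the device user manual. The base is passed as a pointer so that the same code can be tried against an ordinary array on a host PC.

```c
#include <stdint.h>

/* Hypothetical register offsets (in bytes) from the peripheral base address. */
#define REG_SET 0x04u  /* set register: writing 1 drives the pin high */
#define REG_DIR 0x08u  /* direction register: 1 = output */

/* Write a 32-bit value to the register at base + offset. */
static void reg_write(volatile uint32_t *base, uint32_t offset, uint32_t value)
{
    base[offset / 4u] = value;   /* offsets are in bytes, registers are words */
}

/* Read the 32-bit register at base + offset. */
static uint32_t reg_read(volatile uint32_t *base, uint32_t offset)
{
    return base[offset / 4u];
}

/* Configure one pin as an output, then drive it high. */
static void pin_set_high(volatile uint32_t *base, unsigned pin)
{
    reg_write(base, REG_DIR, reg_read(base, REG_DIR) | (1u << pin));
    reg_write(base, REG_SET, 1u << pin);
}
```

On real hardware the base would be a fixed address from the user manual cast to a volatile pointer. The volatile qualifier keeps the compiler from removing or merging the register accesses.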
Controllers are specialized peripherals that provide higher levels of functionality within an embedded
system. Two important types of controllers are memory controllers and interrupt controllers.

1.10.1 Memory Controllers


Memory controllers connect different types of memory to the processor bus. On power-up a memory
controller is normally configured in hardware so that certain memory devices are active. These memory
devices allow the boot code to be executed. Some memory devices must be set up by software; for
example, when using DRAM, the memory timings and refresh rate must be configured before the DRAM can be
used for data access.
The ARM7 core provides various control signals that can be used for a memory interface, but it needs a separate
memory controller to perform the actual memory access control functions, for example address decoding, wait
state generation and DRAM refresh cycles.

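[Figure: the ARM7 core and the memory controller are linked by the mclk, seq, mreq, r/w, mas[1:0] and wait
signals; the memory controller drives nRAS, nCAS, nWE and nOE to the DRAM device; A [31:0] and D [31:0] are the
address and data buses.]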

Fig1.8 A typical Memory controller in ARM

Memory interface signals of the ARM7 core:

• A [31:0]: 32-bit address bus
• D [31:0]: 32-bit bidirectional data bus
• Dout [31:0]: separate data-out bus
• Din [31:0]: separate data-in bus
• r/w: Read/write control signal (low for a read, high for a write)
• mas [1:0]: Memory access size: 00 = byte; 01 = half-word; 10 = word; 11 = reserved
• mreq#: Memory request; indicates that the next cycle involves a memory access
• seq: Sequential address access; indicates that the address used in the next cycle will be either the same
as or one operand (i.e., word) greater than the current address

Generation of the seq Signal

Fig1.9 Sequential Signal generation

The seq signal is automatically asserted whenever a memory address is obtained from the incrementer. Fig 1.9
above shows how the seq signal is generated in the ARM core.
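The condition that seq reports can be modelled in a few lines of C. This is only a behavioural sketch of the rule stated above (same address, or one word higher), not a description of the hardware; the function name is hypothetical and word-sized accesses are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when next_addr is the same as, or exactly one word (4 bytes)
   above, current_addr: the condition described for asserting seq. */
static bool is_sequential(uint32_t current_addr, uint32_t next_addr)
{
    return next_addr == current_addr || next_addr == current_addr + 4u;
}
```

A memory controller can use this information to start a burst or page-mode access instead of a full new access.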
1.10.2 Interrupt Controllers
Whenever a peripheral or device requires the attention of the processor, it raises an interrupt signal to the
processor. An interrupt controller provides a programmable interrupt service that allows software to determine
which peripheral or device can interrupt the processor at any specific time by setting the appropriate bits in the
interrupt controller registers.
Two types of interrupt controller are available for the ARM processor:
1. The standard interrupt controller and
2. The vectored interrupt controller (VIC).
The standard interrupt controller sends an interrupt signal to the processor core when an external device
requests servicing. It can be programmed to ignore or mask an individual device or a set of
devices. The interrupt handler determines which device requires servicing by reading a device bitmap
register in the interrupt controller.
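The masking and bitmap lookup described above can be sketched as a small software model. The structure and function names are hypothetical and do not correspond to the registers of any particular controller.

```c
#include <stdint.h>

/* Software model of a standard interrupt controller (names are hypothetical). */
typedef struct {
    uint32_t pending;  /* device bitmap: bit n set = device n requests service */
    uint32_t enable;   /* mask register: bit n set = device n may interrupt */
} sic_t;

/* Returns nonzero when the controller would signal the processor. */
static int sic_irq_asserted(const sic_t *sic)
{
    return (sic->pending & sic->enable) != 0u;
}

/* Handler side: scan the bitmap and return the lowest-numbered unmasked
   requesting device, or -1 if there is none. */
static int sic_find_source(const sic_t *sic)
{
    uint32_t active = sic->pending & sic->enable;
    for (int n = 0; n < 32; n++) {
        if (active & (1u << n)) {
            return n;
        }
    }
    return -1;
}
```

The scan in sic_find_source is the software cost that a vectored controller removes.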
The VIC is more powerful than the standard interrupt controller because it prioritizes interrupts and
simplifies the determination of which device caused the interrupt. After associating a priority and a handler
address with each interrupt, the VIC only asserts an interrupt signal to the core if the priority of a new interrupt is
higher than the currently executing interrupt handler. Depending on its type, the VIC will either call the standard
interrupt exception handler, which can load the address of the handler for the device from the VIC, or cause the
core to jump to the handler for the device directly.
The VIC provides a software interface to the interrupt system. In a system with a simple interrupt controller,
software must determine the source that is requesting service and where its service routine is loaded. A VIC does
both of these in hardware. It supplies the starting address, or vector address, of the service routine corresponding
to the highest priority requesting interrupt source.
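The selection that a VIC performs in hardware can be illustrated with the following C model. It is a sketch under stated assumptions: the slot count, the field names and the convention that a lower number means a higher priority are all chosen for illustration and are not taken from a specific VIC.

```c
#include <stdint.h>

#define VIC_SLOTS 16  /* hypothetical number of vectored slots */

typedef struct {
    uint32_t pending;               /* bit n set = source n is requesting */
    uint32_t vect_addr[VIC_SLOTS];  /* handler address for each source */
    int      priority[VIC_SLOTS];   /* lower number = higher priority */
    int      current_priority;      /* priority of the handler now running;
                                       use a large value when idle */
} vic_t;

/* Returns the handler address of the highest-priority requesting source,
   or 0 if no requesting source outranks the running handler. */
static uint32_t vic_vector(const vic_t *vic)
{
    int best = -1;
    for (int n = 0; n < VIC_SLOTS; n++) {
        if ((vic->pending & (1u << n)) &&
            vic->priority[n] < vic->current_priority &&
            (best < 0 || vic->priority[n] < vic->priority[best])) {
            best = n;
        }
    }
    return best < 0 ? 0u : vic->vect_addr[best];
}
```

The interrupt handler then only has to read one address and jump to it, instead of scanning a bitmap.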
In an ARM system, two levels of interrupt are available:
• Fast Interrupt reQuest (FIQ), for fast, low-latency interrupt handling.
• Interrupt ReQuest (IRQ), for more general interrupts.
Generally, you only use a single FIQ source at a time in a system to provide a
true low-latency interrupt. This has the following benefits:
• You can execute the interrupt service routine directly without determining the source of the interrupt.
• It reduces interrupt latency. You can use the banked registers available for FIQ
interrupts more efficiently, because you do not require a context save.
The interrupt inputs must be level sensitive, active HIGH, and held asserted until the
interrupt service routine clears the interrupt. Edge-triggered interrupts are not compatible.
The interrupt inputs do not have to be synchronous to HCLK.
Note
The VIC does not handle interrupt sources with transient behavior, for example an interrupt that is asserted and
then deasserted before software can clear the interrupt source. In this case, the CPU acknowledges the interrupt and
obtains the vectored address for the interrupt from the VIC, assuming that no other interrupt has occurred to
overwrite the vectored address. However, when a transient interrupt occurs, the priority logic of the VIC is not
set, and lower priority interrupts can interrupt the transient interrupt service routine, assuming interrupt nesting
is permitted.

1.11 Embedded System Software


An embedded system needs software to drive it. Figure 1.10 shows the four basic software components
required to control an embedded device. Each software layer in the stack uses a higher level of abstraction to
separate the code from the hardware device.
The first code to be executed on the hardware is the initialization code, or boot code, which is specific to a
particular hardware device. It configures the minimum set of hardware parts before the operating system takes
control.
The operating system provides an infrastructure to control applications and manage hardware system
resources. Many embedded systems do not require a full operating system; a simple task scheduler
that is either event driven or poll driven is sufficient.
The device drivers are the third component in the system. They provide a software interface to the
peripherals on the hardware device.

[Figure: layered blocks: hardware device; initialization code and device drivers; operating system; application.]

Fig1.10 Different layers for abstraction of software to work on Hardware

Lastly, an application performs one of the tasks required of a device. For example, a mobile phone
might run an SMS application, even though there may be many other applications running on the same device,
controlled by the operating system.
The software components can run from ROM or RAM. ROM code that is fixed on the device (for
example, the boot code) is called firmware.

1.11.1. Initialization (boot) code


Every system has boot code. The boot loader provides the foundation on which the other system
software runs. Modern boot loaders have expanded their functionality and can be used as a powerful tool
for hardware bring-up and diagnostics.
Booting poses a circular problem: to load a program into memory, a program that does the loading must
already be in memory. The boot-up process, often a
complex multistep sequence involving numerous substeps, solves this problem. Any boot-up process, including
booting up Windows, Linux, or an embedded RTOS (real-time operating system), begins with the application of
power to the system and the subsequent removal of system reset. During POR (power-on-reset) assertion, you may
have to reconfigure hardware peripherals if operational values differ from those of default settings. Embedded
microcontrollers, for example, often offer various hardware-reset-configuration schemes.
The boot loader is the first piece of code which is run when a processor comes out of reset. It is
responsible for the low-level initialisation of the hardware, after which it will load and transfer control to the
operating system. Boot loaders can be very simple, doing very little initialisation before passing control to the
operating system, or highly complex: providing functionality such as LCD display or booting from USB devices.
A number of administrative tasks are handled by the initialization code before an operating system starts
running. These tasks can be grouped into three phases: initial hardware configuration, diagnostics (fault
identification), and booting.
Initial hardware configuration sets up the target platform so that it can boot an image.
Although the target platform itself comes up in a standard configuration, this configuration normally requires
modification to satisfy the requirements of the booted image. For example, the memory system normally requires
the memory map to be reorganized.
Diagnostic (fault identification) code is often embedded in the initialization code. This code
exercises the hardware target to check whether it is in working order. It also tracks down standard
system-related issues. This type of testing is important for manufacturing since it occurs after the software product
is complete. The primary purpose of this code is fault identification and isolation.
Booting involves loading an image and handing control over to that image. This process itself can be
complicated if the system must boot different operating systems or different versions of the same operating
system.
Booting an image is the final phase, but first you must load the image. Loading an image involves
anything from copying an entire program including code and data into RAM, to just copying a data area
containing volatile variables into RAM. Once booted, the system hands over control by modifying the program
counter to point into the start of the image.
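The load step can be sketched in C as follows. The header layout, field names and function name are hypothetical, invented for illustration; real boot loaders use formats defined by their own tool chains. The RAM is modelled as a byte array so that the sketch can run on a host PC.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical image header: where the payload goes and where execution starts. */
typedef struct {
    uint32_t load_offset;   /* destination offset within RAM */
    uint32_t size;          /* payload size in bytes */
    uint32_t entry_offset;  /* entry point, relative to load_offset */
} image_header_t;

/* Copies the payload into RAM and returns the address to place in the
   program counter, or NULL if the image does not fit. */
static const uint8_t *load_image(const image_header_t *hdr,
                                 const uint8_t *payload,
                                 uint8_t *ram, uint32_t ram_size)
{
    if (hdr->load_offset > ram_size ||
        hdr->size > ram_size - hdr->load_offset ||
        hdr->entry_offset >= hdr->size) {
        return NULL;
    }
    memcpy(ram + hdr->load_offset, payload, hdr->size);
    return ram + hdr->load_offset + hdr->entry_offset;
}
```

On a real target the final step would typically cast the returned address to a function pointer and call it, which is what loads the program counter with the start of the image.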
Sometimes, to reduce the image size, an image is compressed. The image is then decompressed either
when it is loaded or when control is handed over to it. Initializing or organizing memory is an important part of the
initialization code because many operating systems expect a known memory layout before they can start.
1.11.2 Operating System
Operating systems for embedded systems are different from their counterparts for desktops or servers in
many ways. First of all, the embedded system usually does not have the resources to run editors, compilers, linkers
and so on. Therefore, program development uses a different computer, often called the host system. The
embedded system which is supposed to run the program is then called the target system. Unlike in a desktop
computer, there is no permanent copy of the operating system in the target.
The target only has a small program that enables the user to download compiled code to the target and
execute it there. This program is usually called a monitor. A monitor also has other functions such as inspecting
memory or testing hardware. CubeMon, for example, is the monitor of the RoboCube; the user communicates with
it over a serial interface using a terminal program.
What is the function of an embedded operating system? An
embedded system lacks most of the features that are commonly associated with an operating system. There is no
screen with a graphical user interface; there is no mouse, keyboard or other input device. There are not even disk
drives or floppies, which would imply a “Disk Operating System”. And, of course, there are no 3D accelerators,
multimedia adapters, DVD drives or the other features expected of a modern desktop computer. Instead,
embedded systems often have a large number of less familiar input and output devices. In robotics,
they are called “sensors” and “actuators”. A sensor is a device that tells the system something about the state of the
world; an actuator is a device that changes this state. Examples of sensors are touch sensors, light sensors,
temperature sensors, acceleration sensors, rotary pulse encoders, microphones and even video cameras. Examples of
actuators are motors, lights, robot arms, paint shops and factory assembly lines. One function of the embedded
operating system is to provide the programmer with easy access to these devices.
Applications for embedded systems are often in systems where the timed execution of programs is
important. In the motor control system of a modern car engine, a lot of sensory data has to be processed to obtain
optimal fuel efficiency, respecting environmental legislation, providing driving comfort and respecting the
specifications of the motor to ensure continued operation. On the other hand, the actuators have to be controlled
with very high time precision. An operating system that ensures these timing constraints is called a Real-time
Operating System (RTOS). Hard real-time systems ensure that no deadline is missed (and usually this fact can be
proven mathematically). Soft real-time systems can occasionally miss a deadline. Most commercially available
systems are soft real-time. CubeOS can control tasks that are either soft or hard real-time or have no timing
constraints at all.
Another important domain for embedded systems is communication. In a modern car, for example, the
different devices are no longer connected with dedicated wires but with a bus system. When the driver turns the
lever for turning on the headlights, a special data packet is sent from an embedded system in the dashboard to
other embedded systems in the trunk and in the front of the car. There, the power of the headlights and position
lights is turned on. If the driver pushes the accelerator, the same system sends a data packet to the engine controller
to increase power and so on. All these transactions involve a large number of embedded systems that have to agree
upon a common communication protocol while maintaining their own real-time deadlines and not hindering the
real-time deadlines of other systems. For example, the anti-lock brake system’s communication has to succeed even
if the driver switches on the air conditioning at the same time.
Once the initialization process has prepared the hardware, the operating system takes control. ARM
processors support more than 50 operating systems. They fall into two main categories: real-time
operating systems (RTOSs) and platform operating systems. Platform operating systems require a memory
management unit to manage large, non-real-time applications and tend to have secondary storage. The Linux
operating system is a typical example of a platform operating system.
These two categories of operating system are not mutually exclusive: there are operating systems that
use an ARM core with a memory management unit and have real-time characteristics. ARM has developed a set
of processor cores that specifically target each category.

1.11.3 Applications
The operating system schedules applications, that is, code dedicated to handling a particular task.
An application implements a processing task, while the operating system controls the environment. An embedded
system can have one active application or several applications running simultaneously.
ARM processors are found in various market segments, including networking, automation, mobile and
consumer devices, mass storage, and imaging. Within each segment ARM processors can be found in
multiple applications.
For example, the ARM processor is found in networking applications like home gateways, DSL modems
for high-speed Internet communication, and 802.11 wireless communications. The mobile device segment is the
largest application area for ARM processors because of mobile phones. ARM processors are also found in mass
storage devices such as hard drives and imaging products such as inkjet printers—applications that are cost
sensitive and high volume.
In contrast, ARM processors are not found in applications that require leading-edge high performance.
Typical application areas for ARM-based microcontrollers include:
• Industrial control
• Medical systems
• Access control
• Point-of-sale
• Communication gateway
• Embedded soft modem
• General purpose applications
Example.
Automotive Infotainment
Increasingly, the electronics functionality embedded in a vehicle is becoming a key decision criterion for
buyers. ARM and its Partners have been present in the area of infotainment for a long period of time, with a
number of high volume platforms (the Ford Sync being one example) being powered by ARM® technology. The
success of the ARM architecture in wireless applications has led to the availability of many ingredients necessary
to build a successful telematics or infotainment product. OEMs are looking for compelling, power-efficient
hardware platforms for high-end Navigation and Multimedia systems. From the software perspective, the
investment that ARM is making in ensuring the availability of optimized web browsers and Adobe Flash 10
support for the ARM architecture enables car suppliers to offer the same web experience inside a vehicle that
users are familiar with on a PC or a Smartphone.

Fig 1.11 Typical application of ARM

Description: Ford and Microsoft have collaborated on the "Sync" in-car communications and entertainment
system which enables you to stay connected to your handheld devices, even when you’re on the road. Sync is a
voice-activated, hands-free, in-car communications and entertainment system which fully integrates your mobile
phone and digital media player.
One touch of the 'telephone button' on the steering wheel, as shown in fig 1.11, is all it takes to make a
call. Names and numbers in a mobile phone’s address book will be transferred wirelessly and automatically to
the vehicle. Users will be able to browse the contents of their mobile phones or digital music players, including
genre, album, artist and song title, via voice commands. Sync will also host nearly any digital media player, including
the Apple iPod®, Microsoft Zune, PlaysForSure players and most USB storage devices.
The first models to offer the "Sync" system (as an option on 2008 models) will be the following Ford,
Lincoln and Mercury models: Edge, Explorer, Five Hundred, Focus, Freestyle, Fusion, Milan, MKX, MKZ,
Montego, Mountaineer, and Sport-Trac.
Features:
Voice-activated, hands-free calling
Uninterrupted connections
Audible text messages
Advanced calling features
Voice-activated music
Instant voice recognition
Ring tone support
Automatic phonebook transfer
Multilingual intelligence

Chapter 2:
ARM PROCESSOR FUNDAMENTALS
2.1 INTRODUCTION
Now that we have an overview of the ARM7 system, it is time to look at the programmer’s model
and the operating modes. This chapter covers the data flow model from the program developer’s point of view. We
will study the data flow model and the register file structure. The ARM7 is a load-and-store architecture, so in order
to perform any data processing instruction the data first has to be moved from the memory store into a central set
of registers, the data processing instruction has to be executed and then the data is stored back into memory. We
will also look at the structure of the CPSR (current program status register). The CPSR contains a number of flags
which report and control the operation of the ARM7 CPU. We will then look at the register banks and the different
operating modes of the ARM7, and cover exceptions in the ARM7.
One of the most interesting features of the ARM7 is instruction execution, where instructions are executed
in a three-stage pipeline. A three-stage pipeline is the simplest form of pipeline and does not suffer from the kind of
hazards, such as read-before-write, seen in pipelines with more stages. Finally we will discuss the vector table,
which determines where execution continues when an exception or interrupt occurs.

2.2 Data flow model of ARM:


A programmer can think of an ARM core as functional units connected by data buses, as shown in
Figure 2.1, where, the arrows represent the flow of data, the lines represent the buses, and the boxes represent
either an operation unit or a storage area. The figure shows not only the flow of data but also the abstract
components that make up an ARM core.
Data enters the processor core through the Data bus. The data may be an instruction to execute or a data
item. Figure 2.1 shows a Von Neumann implementation of the ARM— data items and instructions share the same
bus. In contrast, Harvard implementations of the ARM use two different buses.
The instruction decoder translates instructions before they are executed. Each instruction executed
belongs to a particular instruction set.
Figure 2.1 ARM core dataflow model.

The ARM processor uses a load-store architecture. That is, it has two instruction types for transferring
data in and out of the processor: load instructions copy data from memory to registers, and store instructions
copy data from registers to memory. There are no data processing instructions that directly manipulate data in
memory. Thus, data processing is carried out only in registers. Data items are placed in the register file, a
storage bank made up of 32-bit registers. Since the ARM core is a 32-bit processor, most instructions treat the
registers as holding signed or unsigned 32-bit values. The sign extend hardware converts signed 8-bit and 16-bit
numbers to 32-bit values as they are read from memory and placed in a register.
ARM instructions typically have two source registers, Rn and Rm, and a single result or destination register,
Rd. Source operands are read from the register file using the internal buses A and B, respectively.
The ALU (arithmetic logic unit) or MAC (multiply-accumulate unit) takes the register values Rn and Rm
from the A and B buses and computes a result. Data processing instructions write the result in Rd directly to the
register file. Load and store instructions use the ALU to generate an address to be held in the address register and
broadcast on the Address bus.
One important feature of the ARM is that register Rm alternatively can be preprocessed in the barrel
shifter before it enters the ALU. Together the barrel shifter and ALU can calculate a wide range of expressions
and addresses. After passing through the functional units, the result in Rd is written back to the register file using
the ALU bus. For load and store instructions the incrementer updates the address register before the core reads or
writes the next register value from or to the next sequential memory location. The processor continues executing
instructions until an exception or interrupt changes the normal execution flow. Now that you have an overview of
the processor core we'll take a more detailed look at some of the key components of the processor: the registers,
the current program status register (cpsr), and the pipeline.

2.2 The Register Files of ARM


The ARM7 is a load-and-store architecture, so in order to perform any data processing instruction the
data first has to be moved from the memory store into a central set of registers, the data processing instruction has
to be executed and then the data is stored back into memory.
For example, to add two values held in memory, the first value is loaded from memory location M1 into register R1
using a load instruction (LDR), and the second value is loaded from memory location M2 into register R2 in the same
way. The addition instruction ADD R4, R1, R2 is then executed and the result is placed in register R4. Finally the sum
(R1 + R2) is written to memory location M3 using a store instruction (STR). Note that MOV only copies values between
registers and cannot access memory. Fig 2.2 illustrates this process.

Fig 2.2 Load and Store Architecture
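The same load, operate and store pattern is what a C compiler produces for a simple statement on memory operands. The sketch below spells out the steps; the instruction sequence in the comments is illustrative of what a compiler might emit for an ARM7 target, and the register choices and function name are assumptions.

```c
#include <stdint.h>

/* Adds two values held in memory and stores the sum back to memory. */
static void add_in_memory(const int32_t *m1, const int32_t *m2, int32_t *m3)
{
    int32_t r1 = *m1;      /* LDR R1, [address of M1]   load first operand   */
    int32_t r2 = *m2;      /* LDR R2, [address of M2]   load second operand  */
    int32_t r4 = r1 + r2;  /* ADD R4, R1, R2            register-only add    */
    *m3 = r4;              /* STR R4, [address of M3]   store the result     */
}
```

No step operates on memory directly: the addition itself sees only register values.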


15 user registers (R0 to R14) plus the PC (R15):

R0 to R12     General-purpose user registers
R13           Used as the stack pointer (SP)
R14           Link register (LR)
R15           Used as the program counter (PC)

CPSR          Current program status register


Fig 2.3 User Mode Register Model

General-purpose registers hold either data or an address. They are identified with the letter R prefixed to
the register number. For example, register 4 is given the label R4. Figure 2.3 shows the active registers available
in user mode, a protected mode normally used when executing applications.
There are up to 18 active registers: 16 data registers, R0 to
R15, and 2 processor status registers. The data registers are visible to the programmer as R0 to R15. Each of these
registers is 32 bits wide. R0 to R12 are user registers in that they do not have any other specific function. The
registers R13 to R15 do have special functions in the CPU.
Stack Pointer (SP):
R13 is used as the stack pointer (SP) and stores the top of stack, though this has only been defined as a
programming convention. Unusually the ARM instruction set does not have PUSH and POP instructions so stack
handling is done via a set of instructions that allow loading and storing of multiple registers in a single operation.
Link register (LR):
R14 is called the link register (LR). When a call is made to a function the return address is automatically
stored in the link register and is immediately available on return from the function. This allows quick entry and
return into a ‘leaf’ function (a function that is not going to call further functions). If the function is part of a branch
(i.e. it is going to call other functions) then the link register must be preserved on the stack (R13).
Program counter (PC):
R15 is the program counter (PC) and contains the address of the next instruction to be fetched by the
processor. Interestingly, many instructions can be performed on R13- R15 as if they were standard user registers.
Current Program Status Register (CPSR):
In addition to the register bank there is an additional 32 bit wide register called the ‘current program status
register’ (CPSR). The CPSR contains a number of flags which report and control the operation of the ARM7 CPU.

2.3 Structure of Current Program Status Register (CPSR)

In addition to the register bank there is an additional 32 bit wide register called the ‘current program status
register’ (CPSR). The CPSR, as shown in fig 2.4, contains a number of flags which report and control the operation
of the ARM7 CPU.

Bit(s)    Field      Meaning
31        N          Negative (condition code flag)
30        Z          Zero (condition code flag)
29        C          Carry (condition code flag)
28        V          oVerflow (condition code flag)
27 to 8              Not used
7         I          IRQ interrupt disable
6         F          FIQ interrupt disable
5         T          Thumb instruction set
4 to 0    M4 to M0   Operating mode (User, FIQ, IRQ, Supervisor, Abort, Undefined instruction, System)

Fig 2.4 Current Program Status Register and flags

The top four bits (28 to 31) of the CPSR contain the condition codes which are set by the CPU. The
condition codes report the result status of a data processing operation. From the condition codes you can tell if a
data processing instruction generated a negative (N), zero (Z), carry (C) or overflow (V) result. The lowest eight
bits (0 to 7) in the CPSR contain flags which may be set or cleared by the application code. Bits 7 and 6 are the
I and F bits. These bits are used to enable and disable the two interrupt sources which are external to the ARM7
CPU. You should be careful when programming these two bits because in order to disable either interrupt source
the bit must be set to ‘1’ not ‘0’ as you might expect. Bit 5 is the THUMB bit.

2.3.1 Condition Code flags: Four bits, bit 28 to bit 31, are set aside for reporting the status of a
data processing operation. The functions of these flags are as follows.
Negative Flag (Bit 31):
This flag indicates that the result of an arithmetic or logical operation is negative. It is set to 1 if bit 31 of
the result is 1 (a negative value in two's complement); otherwise it is reset to 0.
Zero Flag (Bit 30):
This flag records the zero condition. It is set to 1 if the
result of a flag-setting arithmetic or logical operation is zero; otherwise it is reset to 0.
Carry Flag (Bit 29):
This can also be called the unsigned overflow flag. It is set to 1 if a flag-setting 32-bit arithmetic operation
generates a carry out of bit 31; otherwise it is reset to 0.
oVerflow Flag (Bit 28):
This can also be called the signed overflow flag. It is set to 1 if a flag-setting 32-bit signed integer arithmetic
operation produces a result outside the signed 32-bit range; otherwise
it is reset to 0.
For example, 0x7EFFFFFF + 0x01000001 = 0x80000000 sets the overflow flag because the sum of two positive 32-bit
integers is a negative 32-bit integer.
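The way the four condition flags follow from a 32-bit addition can be checked with a short C model. This is a sketch of the flag rules for addition only, not of the ALU hardware; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define FLAG_N (1u << 31)
#define FLAG_Z (1u << 30)
#define FLAG_C (1u << 29)
#define FLAG_V (1u << 28)

/* Returns the N, Z, C and V bits (in their CPSR positions) that a
   flag-setting 32-bit addition of a and b would produce. */
static uint32_t add_flags(uint32_t a, uint32_t b)
{
    uint32_t result = a + b;            /* wraps modulo 2^32 */
    uint32_t flags = 0u;

    if (result & 0x80000000u) flags |= FLAG_N;   /* bit 31 of the result is set */
    if (result == 0u)         flags |= FLAG_Z;
    if (result < a)           flags |= FLAG_C;   /* unsigned carry out of bit 31 */
    /* Signed overflow: both operands share a sign that the result does not. */
    if (~(a ^ b) & (a ^ result) & 0x80000000u) flags |= FLAG_V;

    return flags;
}
```

For the addition 0x7EFFFFFF + 0x01000001 this model reports N and V set with Z and C clear: the result has bit 31 set, and two positive operands produced a negative result.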

2.3.2 Application code (control) flags:


Eight bits, bit 0 to bit 7, are set aside for control flags. Two bits
(7 and 6) are used as interrupt disable bits, one bit (5) selects the THUMB instruction set and five bits (4 to 0)
select the operating mode. The functions of these flags are as follows.

IRQ (Interrupt ReQuest) Bit 7:
This bit is used for disabling IRQ interrupts. When this bit is set to 1, IRQ interrupts are disabled.
FIQ (Fast Interrupt reQuest) Bit 6:
This bit is used for disabling FIQ interrupts. When this bit is set to 1, FIQ interrupts are disabled.
T (Thumb instruction) Bit 5:
This bit reflects the operating state. When set to 1, the processor is executing in Thumb state.
When reset to 0, the processor is executing in ARM state.

Operating mode Selection bits (bits 4 to 0):


These bits are used for selection of operating modes and modes are selected as follows.

CPSR Mode Bits

M4 M3 M2 M1 M0   MODE
1  0  1  1  1    Abort
1  0  0  0  1    Fast interrupt request (FIQ)
1  0  0  1  0    Interrupt request (IRQ)
1  0  0  1  1    Supervisor
1  1  1  1  1    System
1  1  0  1  1    Undefined
1  0  0  0  0    User
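The control field can be decoded in C with a mask and a switch. The sketch below uses the bit positions and mode encodings from the table above; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define CPSR_I    (1u << 7)   /* 1 = IRQ disabled */
#define CPSR_F    (1u << 6)   /* 1 = FIQ disabled */
#define CPSR_T    (1u << 5)   /* 1 = Thumb state */
#define CPSR_MODE 0x1Fu       /* mode bits M4..M0 */

/* Returns the name of the operating mode encoded in bits 4 to 0. */
static const char *cpsr_mode_name(uint32_t cpsr)
{
    switch (cpsr & CPSR_MODE) {
    case 0x10: return "User";
    case 0x11: return "FIQ";
    case 0x12: return "IRQ";
    case 0x13: return "Supervisor";
    case 0x17: return "Abort";
    case 0x1B: return "Undefined";
    case 0x1F: return "System";
    default:   return "Invalid";
    }
}

/* The I bit is a disable bit, so IRQs are enabled when it is 0. */
static int irq_enabled(uint32_t cpsr)
{
    return (cpsr & CPSR_I) == 0u;
}
```

For example, a CPSR value whose low byte is 0xD3 decodes as supervisor mode with both IRQ and FIQ disabled and the processor in ARM state.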

2.4 Different Modes of ARM


The ARM7 has seven different operating modes. Your application code will normally run in the user
mode with access to the register bank R0 to R15 and the CPSR as already discussed. However in response to an
exception such as an interrupt, memory error or software interrupt instruction the processor will change modes.
When this happens the registers R0 to R12 and R15 remain the same but R13 (SP) and R14 (LR) are replaced by a
new pair of registers unique to that mode. This means that each exception mode has its own stack pointer and link
register. In addition the fast interrupt mode (FIQ) has duplicate registers for R8 to R12. This means that you can make
a fast entry into an FIQ interrupt without the need to preserve registers onto the stack. Each mode except user and
system mode has an additional register called the “saved program status register” (SPSR). If your application is
running in user mode when an exception occurs the mode will change and the current contents of the CPSR will be
saved into the SPSR. The exception code will run and on return from the exception the contents of the CPSR will be
restored from the SPSR allowing the application code to resume execution. The operating modes are listed below.
The processor mode determines which registers are active and the access rights to the cpsr register itself.
Each processor mode is either privileged or nonprivileged: A privileged mode allows full read-write access to the
cpsr. Conversely, a nonprivileged mode only allows read access to the control field in the cpsr but still allows
read-write access to the condition flags.
There are seven processor modes in total: six privileged modes, namely Abort, Fast interrupt
request (FIQ), Interrupt request (IRQ), Supervisor, System, and Undefined, and one nonprivileged mode, User.

User:
This mode is used to run the application code. Once in user mode the control bits of the CPSR cannot be
written to, and the mode can only be changed when an exception is generated.
FIQ (Fast Interrupt reQuest):
This supports high speed interrupt handling. Generally it is used for a single critical interrupt source in a
system.
IRQ (Interrupt ReQuest):
This supports all other interrupt sources in a system.
Supervisor:
A “protected” mode for running system level code to access hardware or run OS calls. The ARM7
enters this mode after reset.
Abort:
If an instruction or data is fetched from an invalid memory region, an abort exception will be generated.
Undefined Instruction:
If a fetched opcode is not an ARM instruction, an undefined instruction exception will be generated.
System:
A privileged mode that uses the same registers as user mode.

2.5 Banked registers


Fig 2.5 ARM7 Banked registers

In addition to user mode, the ARM7 CPU has six privileged operating modes, five of which are entered to process
exceptions. The shaded registers are banked memory that is “switched in” when the operating mode changes. The
SPSR register is used to save a copy of the CPSR when the switch occurs.
Figure 2.5 shows all 37 registers in the register file. Of those, 20 registers are hidden from a program at
different times. These registers are called banked registers and are identified by the shading in the diagram. They
are available only when the processor is in a particular mode; for example, abort mode has banked registers
r13_abt, r14_abt and spsr_abt. Banked registers of a particular mode are denoted by an underscore followed by
the mode mnemonic (_mode).
Every processor mode except user mode can change mode by writing directly to the mode bits of the
cpsr. All processor modes except system mode have a set of associated banked registers that are a subset of the
main 16 registers. A banked register maps one-to-one onto a user mode register. If you change processor mode, a
banked register from the new mode will replace an existing register.
For example, when the processor is in the interrupt request mode, the instructions you execute still access
registers named r13 and r14. However, these registers are the banked registers r13_irq and r14_irq. The user mode
registers r13_usr and r14_usr are not affected by the instruction referencing these registers. A program still has
normal access to the other registers r0 to r12.
The User registers R0-R7 are common to all operating modes. However, FIQ mode has its own R8-R14
that replace the user registers when FIQ is entered. Similarly, each of the other exception modes has its own R13 and
R14 so that each operating mode has its own unique Stack pointer and Link register. The CPSR is also common to
all modes. However, in each of the exception modes an additional register, the saved program status register
(SPSR), is added. When the processor changes mode, the current value of the CPSR is stored in the SPSR so that it
can be restored on exiting the exception mode.
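
The banking rules above are enough to reproduce the register counts quoted for Figure 2.5. The sketch below is an illustrative C model only (the enum and function names are assumptions, not part of any toolchain): it maps a mode and register number to the bank that holds it and counts the banked registers.

```c
#include <assert.h>

enum arm_mode { MODE_USR, MODE_SYS, MODE_FIQ, MODE_IRQ,
                MODE_SVC, MODE_ABT, MODE_UND, MODE_COUNT };

/* Bank that holds register rN (0..15) in the given mode.
   MODE_USR stands for "the shared user-mode register". */
enum arm_mode register_bank(enum arm_mode mode, int n)
{
    assert(n >= 0 && n <= 15);
    if (mode == MODE_USR || mode == MODE_SYS)
        return MODE_USR;               /* System uses the User registers */
    if (n == 13 || n == 14)
        return mode;                   /* private SP and LR per mode */
    if (mode == MODE_FIQ && n >= 8 && n <= 12)
        return MODE_FIQ;               /* FIQ also banks R8-R12 */
    return MODE_USR;
}

/* Only the exception modes have an SPSR. */
int has_spsr(enum arm_mode mode)
{
    return mode != MODE_USR && mode != MODE_SYS;
}

/* Count every register that is private to some mode. */
int count_banked_registers(void)
{
    int count = 0;
    for (int m = 0; m < MODE_COUNT; m++) {
        for (int n = 0; n <= 15; n++)
            if (register_bank((enum arm_mode)m, n) != MODE_USR)
                count++;
        if (has_spsr((enum arm_mode)m))
            count++;
    }
    return count;
}
```

The count comes to 20 banked registers (8 for FIQ, 3 each for IRQ, Supervisor, Abort and Undefined); adding the 16 user registers and the CPSR gives the 37 registers mentioned in the text.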
2.6 3-Staged Pipelining of ARM-7
At the heart of the ARM7 CPU is the instruction pipeline. The pipeline is used to process instructions
taken from the program store. On the ARM 7 a three-stage pipeline is used.

• Typical pipeline stages: Fetch, Decode, Execute
• Pipeline design allows the effective throughput to increase to one instruction per clock cycle
• Allows the next instruction to be fetched while still decoding or executing the previous instructions
A three-stage pipeline is the simplest form of pipeline and does not suffer from hazards such as
read-before-write seen in pipelines with more stages. The pipeline has independent hardware stages that execute
one instruction while decoding a second and fetching a third. The pipeline speeds up the throughput of CPU
instructions so effectively that most ARM instructions can be executed in a single cycle. The pipeline works most
efficiently on linear code. As soon as a branch is encountered, the pipeline is flushed and must be refilled before
full execution speed can be resumed. As we shall see, the ARM instruction set has some interesting features which
help smooth out small jumps in your code in order to get the best flow of code through the pipeline. As the pipeline
is part of the CPU, the programmer does not have any exposure to it. However, it is important to remember that in
ARM state the PC is running eight bytes ahead of the current instruction being executed, so care must be taken when
calculating offsets used in PC relative addressing.
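
To make the eight-byte lead concrete, here is a small C sketch of how a PC-relative branch offset would be worked out by hand. It assumes ARM state and word-aligned addresses; the function names are hypothetical, and the division by four reflects the fact that ARM branch offsets are counted in words.

```c
#include <assert.h>
#include <stdint.h>

/* In ARM state the PC reads as the address of the executing instruction + 8. */
uint32_t pc_as_seen_by(uint32_t instr_addr)
{
    return instr_addr + 8u;
}

/* Signed word offset from a branch at instr_addr to target_addr,
   measured from the PC value the instruction actually sees. */
int32_t branch_word_offset(uint32_t instr_addr, uint32_t target_addr)
{
    assert((instr_addr & 3u) == 0 && (target_addr & 3u) == 0);
    int32_t byte_offset = (int32_t)(target_addr - pc_as_seen_by(instr_addr));
    return byte_offset / 4;            /* offsets are stored in words */
}
```

A branch to the instruction two words further on therefore has an offset of 0, and a branch to itself has an offset of -2.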

         T1       T2       T3       T4       T5

1st      FETCH    DECODE   EXECUTE

2nd               FETCH    DECODE   EXECUTE

3rd                        FETCH    DECODE   EXECUTE

Fig 2.6 The ARM7 three-stage pipeline

Pipelining can be explained with respect to fig 2.6 as follows.


For example, consider the instructions

ADD r0, r1, r2 ; instruction 1
SUB r0, r1, r2 ; instruction 2
CMP r3, r4     ; instruction 3

During time T1, instruction 1 (ADD r0, r1, r2) is fetched from memory. During time T2,
instruction 1 is decoded and instruction 2 (SUB r0, r1, r2) is fetched from memory. During time T3, instruction 1 is
executed, i.e. the contents of registers r1 and r2 are added and the sum is placed in register r0; at the same time
instruction 2 is decoded and instruction 3 (CMP r3, r4) is fetched from memory. Similarly, during time T4 a new
instruction 4 is fetched, instruction 2 is executed, i.e. the content of r2 is subtracted from r1 and the result is stored
in r0, and instruction 3 is decoded. This process continues, and at any point in time the ARM7 performs three
functions: fetch, decode and execute.
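
The walk-through above follows a simple rule: instruction k is fetched in time slot k, decoded in slot k+1 and executed in slot k+2. A minimal C sketch of that rule, assuming straight-line code with no stalls or branches (the names are illustrative only):

```c
#include <assert.h>

enum stage { STAGE_NONE, STAGE_FETCH, STAGE_DECODE, STAGE_EXECUTE };

/* Stage occupied by instruction k (1-based) during time slot t (1-based). */
enum stage stage_at(int k, int t)
{
    assert(k >= 1 && t >= 1);
    switch (t - k) {
    case 0:  return STAGE_FETCH;
    case 1:  return STAGE_DECODE;
    case 2:  return STAGE_EXECUTE;
    default: return STAGE_NONE;        /* not yet fetched, or already done */
    }
}
```

Evaluating the function for slot T3 gives execute, decode and fetch for instructions 1, 2 and 3 respectively, matching Fig 2.6.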

2.7 Introduction to Exception, Interrupts and Vector tables


Exceptions arise whenever the normal flow of a program has to be halted temporarily, for example, to
service an interrupt from a peripheral. Before attempting to handle an exception, the ARM7TDMI processor
preserves the current processor state so that the original program can resume when the handler routine has finished.
When an exception occurs, for example an IRQ exception, the following actions are taken: First the
address of the next instruction to be executed (PC + 4) is saved into the Link register (LR). Then the CPSR is copied
into the SPSR of the exception mode that is about to be entered (i.e. SPSR_irq). The PC is then filled with the
address of the exception mode interrupt vector. In the case of the IRQ mode this is 0x00000018 (see Table
2.6). At the same time the mode is changed to IRQ mode, which causes R13 and R14 to be replaced by the IRQ R13
and R14 registers. On entry to the IRQ mode, the I bit in the CPSR is set, causing the IRQ interrupt line to be
disabled. If you need to have nested IRQ interrupts, your code must manually re-enable the IRQ interrupt and push
the link register onto the stack in order to preserve the original return address. From the exception interrupt vector
your code will jump to the exception ISR. The first thing your code must do is to preserve any of the registers
R0-R12 that the ISR will use by pushing them onto the IRQ stack. Once this is done you can begin processing the
exception.
When an exception occurs the CPU will change modes and jump to the associated interrupt vector. Once
your code has finished processing the exception it must return back to the user mode and continue where it left off.
However the ARM instruction set does not contain a “return” or “return from interrupt” instruction so manipulating
the PC must be done by regular instructions. The situation is further complicated by there being a number of
different return cases. First of all, consider the SWI instruction. In this case the SWI instruction is executed, the
address of the next instruction to be executed is stored in the Link register and the exception is processed. In order
to return from the exception all that is necessary is to move the contents of the link register into the PC and
processing can continue. However in order to make the CPU switch modes back to user mode, a modified version
of the move instruction is used and this is called MOVS . Hence for a software interrupt the return instruction is

MOVS R15, R14; Move Link register into the PC and switch modes.

However, in the case of the FIQ and IRQ exceptions, when an exception occurs the current instruction
being executed is discarded and the exception is entered. When the code returns from the exception the link register
contains the address of the discarded instruction plus four. In order to resume processing at the correct point we
need to roll back the value in the Link register by four. In this case we use the subtract instruction to deduct four
from the link register and store the result in the PC. As with the move instruction, there is a form of the subtract
instruction which will also restore the operating mode. For an IRQ, FIQ or Prefetch Abort, the return instruction is:

SUBS R15, R14, #4

In the case of a data abort, the exception will occur one instruction after
execution of the instruction which caused the exception. In this case we will ideally enter the data abort ISR, sort
out the problem with the memory and return to reprocess the instruction that caused the exception. In this case we
have to roll back the PC by two instructions, i.e. the discarded instruction and the instruction that caused the
exception. In other words, subtract eight from the link register and store the result in the PC. For a data abort
exception the return instruction is:

SUBS R15, R14, #8

Once the return instruction has been executed, the modified
contents of the link register are moved into the PC, the user mode is restored and the CPSR is restored from the SPSR.
Also, in the case of the FIQ or IRQ exceptions, the relevant interrupt is enabled. This exits the privileged mode and
returns to the user code ready to continue processing.
At the end of the exception the CPU returns to user mode and the context is restored by moving the SPSR
to the CPSR.
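
The three return cases described above differ only in the amount subtracted from the link register. The following C sketch summarises them; the enum and function names are invented for illustration, and the function models only the value written to the PC, not the accompanying CPSR restore.

```c
#include <assert.h>
#include <stdint.h>

enum exception { EXC_SWI, EXC_PREFETCH_ABORT, EXC_IRQ, EXC_FIQ, EXC_DATA_ABORT };

/* Value loaded into the PC by the return instruction, given the LR
   of the exception mode. */
uint32_t return_address(enum exception e, uint32_t lr)
{
    switch (e) {
    case EXC_SWI:
        return lr;                     /* MOVS R15, R14     */
    case EXC_PREFETCH_ABORT:
    case EXC_IRQ:
    case EXC_FIQ:
        return lr - 4u;                /* SUBS R15, R14, #4 */
    case EXC_DATA_ABORT:
        return lr - 8u;                /* SUBS R15, R14, #8 */
    }
    assert(0 && "unknown exception type");
    return lr;
}
```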
2.7.1 Entering an exception:
The ARM7TDMI processor handles an exception as follows:
1. Preserves the address of the next instruction in the appropriate LR. When the exception entry is from
ARM state, the ARM7TDMI processor copies the address of the next instruction into the LR, current PC+4 or
PC+8 depending on the exception.
When the exception entry is from Thumb state, the ARM7TDMI processor writes the value of the PC into
the LR, offset by a value, current PC+4 or PC+8 depending on the exception, that causes the program to resume
from the correct place on return.
The exception handler does not have to determine the state when entering an exception. For example, in
the case of a SWI, MOVS PC, r14_svc always returns to the next instruction regardless of whether the SWI was
executed in ARM or Thumb state.
2. Copies the CPSR into the appropriate SPSR.
3. Forces the CPSR mode bits to a value that depends on the exception.
4. Forces the PC to fetch the next instruction from the relevant exception vector.
The ARM7TDMI processor can also set the interrupt disable flags to prevent otherwise
unmanageable nestings of exceptions.

Note
Exceptions are always entered in ARM state. When the processor is in Thumb state and an exception occurs, the
switch to ARM state takes place automatically when the exception vector address is loaded into the PC. An
exception handler might change to Thumb state but it must return to ARM state to allow the exception handler to
terminate correctly.

2.7.2 Leaving an exception:


When an exception is completed, the exception handler must:
1. Move the LR, minus an offset to the PC. The offset varies according to the type of exception,
2. Copy the SPSR back to the CPSR.
3. Clear the interrupt disable flags that were set on entry.

Note
The action of restoring the CPSR from the SPSR automatically resets the T bit to whatever value it held
immediately prior to the exception.
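
The entry and exit sequences of sections 2.7.1 and 2.7.2 can be mimicked with a toy model. The C sketch below covers the IRQ case only and is an assumption-laden illustration: the struct, macro and function names are hypothetical, the LR value is passed in rather than derived from the pipeline, and only the IRQ bank is represented.

```c
#include <assert.h>
#include <stdint.h>

#define CPSR_MODE_MASK 0x1Fu
#define CPSR_T         (1u << 5)
#define CPSR_I         (1u << 7)
#define MODE_BITS_IRQ  0x12u
#define VECTOR_IRQ     0x00000018u

struct cpu { uint32_t pc, cpsr, lr_irq, spsr_irq; };

/* IRQ entry, following steps 1 to 4 of section 2.7.1. */
void enter_irq(struct cpu *c, uint32_t lr_value)
{
    assert((c->cpsr & CPSR_I) == 0);   /* an IRQ is only taken when unmasked */
    c->lr_irq   = lr_value;            /* 1. preserve the return address     */
    c->spsr_irq = c->cpsr;             /* 2. copy CPSR into SPSR_irq         */
    c->cpsr = (c->cpsr & ~(CPSR_MODE_MASK | CPSR_T))
              | MODE_BITS_IRQ | CPSR_I; /* 3. IRQ mode, ARM state, IRQ masked */
    c->pc = VECTOR_IRQ;                /* 4. fetch from the IRQ vector       */
}

/* IRQ exit: the combined effect of SUBS PC, R14, #4. */
void leave_irq(struct cpu *c)
{
    c->pc   = c->lr_irq - 4u;          /* LR minus the IRQ offset            */
    c->cpsr = c->spsr_irq;             /* restores mode, T bit and I bit     */
}
```

Because the whole CPSR is restored from the SPSR in one step, the mode bits, the T bit and the interrupt disable flag all return to their pre-exception values together.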
2.8 Vector table
When an exception or interrupt occurs, the processor suspends normal execution and starts loading
instructions from the exception vector table (see Table 2.6). Each vector table entry contains a form of branch
instruction pointing to the start of a specific routine:
Reset vector is the location of the first instruction executed by the processor when power is applied. This
instruction branches to the initialization code.
Undefined instruction vector is used when the processor cannot decode an instruction.
Software interrupt vector is called when you execute a SWI instruction. The SWI instruction is frequently used as
the mechanism to invoke an operating system routine.
Prefetch abort vector occurs when the processor attempts to fetch an instruction from an address without the
correct access permissions. The actual abort occurs in the decode stage.
Data abort vector is similar to a prefetch abort but is raised when an instruction attempts to access data memory
without the correct access permissions.
Interrupt request vector is used by external hardware to interrupt the normal execution flow of the processor. It
can only be raised if IRQs are not masked in the cpsr.

Table 2.6 Vector Table

Address      Exception               Mode on entry   I state on entry   F state on entry
0x00000000   Reset                   Supervisor      Set (Masked)       Set (Masked)
0x00000004   Undefined instruction   Undefined       Set (Masked)       Unchanged
0x00000008   Software interrupt      Supervisor      Set (Masked)       Unchanged
0x0000000C   Prefetch Abort          Abort           Set (Masked)       Unchanged
0x00000010   Data Abort              Abort           Set (Masked)       Unchanged
0x00000014   Reserved                Reserved        -                  -
0x00000018   IRQ                     IRQ             Set (Masked)       Unchanged
0x0000001C   FIQ                     FIQ             Set (Masked)       Set (Masked)
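
Table 2.6 is small enough to be held as a constant lookup table, which can be handy in a simulator or in debug code. The C sketch below is one possible representation, not a standard API: the struct layout and function name are assumptions, and the mode values are the CPSR mode bit patterns for each mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vector {
    uint32_t    address;    /* vector address                       */
    const char *exception;  /* exception name as in Table 2.6       */
    uint32_t    mode_bits;  /* CPSR mode bits on entry              */
    int         sets_f;     /* 1 if the F bit is also set on entry  */
};

static const struct vector vector_table[] = {
    { 0x00u, "Reset",                 0x13u, 1 },
    { 0x04u, "Undefined instruction", 0x1Bu, 0 },
    { 0x08u, "Software interrupt",    0x13u, 0 },
    { 0x0Cu, "Prefetch Abort",        0x17u, 0 },
    { 0x10u, "Data Abort",            0x17u, 0 },
    { 0x18u, "IRQ",                   0x12u, 0 },
    { 0x1Cu, "FIQ",                   0x11u, 1 },
};

/* Look up a vector by address; returns NULL for the reserved slot
   at 0x14 and for any address outside the table. */
const struct vector *find_vector(uint32_t address)
{
    assert((address & 3u) == 0);       /* vectors are word aligned */
    for (size_t i = 0; i < sizeof vector_table / sizeof vector_table[0]; i++)
        if (vector_table[i].address == address)
            return &vector_table[i];
    return NULL;
}
```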

Chapter 3
ARM 7 Instruction Set
3.1 Introduction
Now that we have an idea of the ARM7 architecture, programmers model and operating modes we need to
take a look at its instruction set or rather sets. Since all our programming examples are written in C there is no need
to be an expert ARM7 assembly programmer. However an understanding of the underlying machine code is very
important in developing efficient programs. Before we start our overview of the ARM7 instructions it is important
to set out a few technicalities. The ARM7 CPU has two instruction sets: the ARM instruction set which has 32-bit
wide instructions and the THUMB instruction set which has 16-bit wide instructions. In the following section the
use of the word ARM means the 32-bit instruction set and ARM7 refers to the CPU.
One of the most interesting features of the ARM instruction set is that every instruction may be
conditionally executed. In a more traditional microcontroller the only conditional instructions are conditional
branches and maybe a few others like bit test and set. However, in the ARM instruction set the top four bits of the
instruction are compared to the condition codes in the CPSR. If they do not match then the instruction is not executed
and passes through the pipeline as a NOP (no operation).

3.2 Conditional Execution.


Every ARM (32-bit) instruction is conditionally executed. The top four bits are compared with the CPSR
condition codes. If they do not match, the instruction is executed as a NOP. So it is possible to perform a data
processing instruction which affects the condition codes in the CPSR. Then, depending on this result, the following
instructions may or may not be carried out. The basic assembler instructions such as MOV or ADD can be suffixed
with one of sixteen condition mnemonics, which define the condition code states to be tested for.
Each ARM (32-bit) instruction can take one of 16 condition codes as a suffix. Hence each instruction has
16 different variants.

3.2.1 Condition Codes


The condition codes and their use are shown in Table 3.1. If the condition is omitted in an instruction, the AL
(always) condition is used to specify that the instruction should always execute.

Table 3.1 Condition codes and their meanings.

Opcode    Mnemonic    Meaning                             Condition flag state
[31:28]   Extension
0000      EQ          Equal                               Z == 1
0001      NE          Not Equal                           Z == 0
0010      CS/HS       Carry Set/Unsigned Higher or Same   C == 1
0011      CC/LO       Carry Clear/Unsigned Lower          C == 0
0100      MI          Minus/Negative                      N == 1
0101      PL          Plus/Positive or Zero               N == 0
0110      VS          Overflow                            V == 1
0111      VC          No Overflow                         V == 0
1000      HI          Unsigned Higher                     (C == 1) AND (Z == 0)
1001      LS          Unsigned Lower or Same              (C == 0) OR (Z == 1)
1010      GE          Signed Greater than or Equal        N == V
1011      LT          Signed Less than                    N != V
1100      GT          Signed Greater than                 (Z == 0) AND (N == V)
1101      LE          Signed Less than or Equal           (Z == 1) OR (N != V)
1110      AL          Always (Unconditional)              Not applicable
1111      NV          Never                               Not applicable
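
The condition test can be expressed directly in C. The sketch below is illustrative rather than a description of the hardware: the function name is made up, and it assumes the usual CPSR layout with N, Z, C and V in bits 31, 30, 29 and 28. Note that LS is the logical opposite of HI, so it passes when C is clear or Z is set.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if an instruction with condition field `cond` (bits 31:28)
   would execute given the flags in `cpsr`, 0 if it acts as a NOP. */
int condition_passed(uint32_t cond, uint32_t cpsr)
{
    int n = (int)((cpsr >> 31) & 1u);
    int z = (int)((cpsr >> 30) & 1u);
    int c = (int)((cpsr >> 29) & 1u);
    int v = (int)((cpsr >> 28) & 1u);

    assert(cond <= 0xFu);
    switch (cond) {
    case 0x0: return z;                /* EQ    */
    case 0x1: return !z;               /* NE    */
    case 0x2: return c;                /* CS/HS */
    case 0x3: return !c;               /* CC/LO */
    case 0x4: return n;                /* MI    */
    case 0x5: return !n;               /* PL    */
    case 0x6: return v;                /* VS    */
    case 0x7: return !v;               /* VC    */
    case 0x8: return c && !z;          /* HI    */
    case 0x9: return !c || z;          /* LS    */
    case 0xA: return n == v;           /* GE    */
    case 0xB: return n != v;           /* LT    */
    case 0xC: return !z && n == v;     /* GT    */
    case 0xD: return z || n != v;      /* LE    */
    case 0xE: return 1;                /* AL    */
    default:  return 0;                /* NV    */
    }
}
```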

3.3 Addressing, Operands and Directives


In general, using R15 (PC) as the destination register is not appropriate for most instructions. Many instructions
will have unpredictable behavior if R15 is the destination. The ARM supports instruction set extensions by
reserving certain bit combinations in the operand fields of otherwise valid instructions. The assembler will ensure
that these bit combinations are not used, but these must be avoided when hand-coding instructions.
The notation SBZ means “should be zeros”, SBO means “should be ones”.
3.4. Instruction Set:
The main instruction groups of the ARM instruction set fall into six different categories, Branching, Data
Processing, Data Transfer, Block Transfer, Multiply and Software Interrupt.
3.4.1 Data Processing Instructions
The data processing instructions manipulate data within registers. They are move instructions, arithmetic
instructions, logical instructions, comparison instructions, and multiply instructions. Most data processing
instructions can process one of their operands using the barrel shifter.
If you use the S suffix on a data processing instruction, then it updates the flags in the cpsr. Move and
logical operations update the carry flag C, negative flag N, and zero flag Z. The carry flag is set from the result of
the barrel shift as the last bit shifted out. The N flag is set to bit 31 of the result. The Z flag is set if the result is zero.
The general form for all data processing instructions is shown in Fig 3.1 below. Each instruction has a result
register and two operands. The first operand must be a register, but the second can be a register or an immediate
value.
Fig 3.1 General structure of a data processing instruction (fields: Cond = conditional execution, Op = op code,
S = enable condition flags, R1, R2, R3 = operands, Shift = 32-bit shift)
3.4.2 BARREL SHIFTER
The general structure of the data processing instructions allows for conditional execution, a logical shift of
up to 32 bits and the data operation all in the one cycle. Operand 2 can be more than just a register or immediate
value; it can also be a register Rm that has been preprocessed by the barrel shifter prior to being used by a data
processing instruction.
Data processing instructions are processed within the arithmetic logic unit (ALU). A unique and
powerful feature of the ARM processor is the ability to shift the 32-bit binary pattern in one of the source registers
left or right by a specific number of positions before it enters the ALU. This shift increases the power and
flexibility of many data processing operations.
There are data processing instructions that do not use the barrel shifter, for example, the MUL (multiply),
CLZ (count leading zeros), and QADD (signed saturated 32-bit add) instructions.
Pre-processing or shift occurs within the cycle time of the instruction. This is particularly useful for loading
constants into a register and achieving fast multiplies or division by a power of 2.

Figure 3.2 Barrel Shifter and ALU (Rn enters the ALU with no pre-processing; Rm is pre-processed by the barrel
shifter to give the shifted operand N; the result of the ALU is written to Rd)


Figure 3.2 shows the data flow between the ALU and the barrel shifter. To illustrate the barrel shifter we
will take Example 3.3 from section 3.5.3, ADD R0, R2, R3, LSL#1. Here register R2 enters the ALU without any
pre-processing. We apply a logical shift left (LSL) to register R3 in the barrel shifter before moving it to the ALU.
This is the same as applying the standard C language shift operator << to the register. The ADD instruction adds the
shift result N to the R2 value in the ALU and moves the result into register R0. N represents the result of the LSL
operation described in Table 3.2.
Figure 3.3 illustrates a logical shift left by one. For example, the contents of bit 0 are shifted to bit 1. Bit
0 is cleared. The C flag is updated with the last bit shifted out of the register. This is bit (32 - y) of the original
value, where y is the shift amount. When y is greater than one, then a shift by y positions is the same as a shift by
one position executed y times.

Before:       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0   (0x00000800)

After LSL #1: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0   (0x00001000), Carry = 0

Figure 3.3 Logical Shift by one bit

Table 3.2 Barrel Shifter operations

Mnemonic   Description              Shift     Result                                    Shift amount y
LSL        logical shift left       x LSL y   x << y                                    #0-31 or Rs
LSR        logical shift right      x LSR y   (unsigned)x >> y                          #1-32 or Rs
ASR        arithmetic right shift   x ASR y   (signed)x >> y                            #1-32 or Rs
ROR        rotate right             x ROR y   ((unsigned)x >> y) | (x << (32 - y))      #1-31 or Rs
RRX        rotate right extended    x RRX     (c flag << 31) | ((unsigned)x >> 1)       none
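
The result expressions in Table 3.2 translate almost directly into C. The sketch below is a simplified model with invented names: it restricts the shift amount to 1..31 so that every C shift is well defined, builds the arithmetic shift from unsigned operations to avoid relying on implementation-defined behaviour, and returns the last bit shifted out as the carry.

```c
#include <assert.h>
#include <stdint.h>

struct shifted { uint32_t value; int carry; };  /* carry = last bit shifted out */

struct shifted lsl(uint32_t x, unsigned y)
{
    assert(y >= 1 && y <= 31);
    return (struct shifted){ x << y, (int)((x >> (32 - y)) & 1u) };
}

struct shifted lsr(uint32_t x, unsigned y)
{
    assert(y >= 1 && y <= 31);
    return (struct shifted){ x >> y, (int)((x >> (y - 1)) & 1u) };
}

struct shifted asr(uint32_t x, unsigned y)
{
    assert(y >= 1 && y <= 31);
    /* fill the vacated top bits with copies of the sign bit */
    uint32_t fill = (x & 0x80000000u) ? ~(UINT32_MAX >> y) : 0u;
    return (struct shifted){ (x >> y) | fill, (int)((x >> (y - 1)) & 1u) };
}

struct shifted ror(uint32_t x, unsigned y)
{
    assert(y >= 1 && y <= 31);
    return (struct shifted){ (x >> y) | (x << (32 - y)),
                             (int)((x >> (y - 1)) & 1u) };
}

/* RRX: one-bit rotate through the carry flag. */
struct shifted rrx(uint32_t x, int carry_in)
{
    return (struct shifted){ ((uint32_t)(carry_in != 0) << 31) | (x >> 1),
                             (int)(x & 1u) };
}
```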

3.5 Shifter Operands (Addressing Modes):


The shifter operand is represented by the least-significant 12 bits of the instruction. It can take one of
eleven forms, as listed below. For illustration, each form has one or more examples based on the ADD instruction
(ADD <Rd>, <Rn>, <shifter_operand>). For instructions that use shifter operands, the C flag update is dependent on
the form of the operand used. The shift operation syntax for data processing instructions is shown in Table 3.3.

Table 3.3 Barrel Shift operation Syntax for data processing instructions

Shift operations                           Syntax
Immediate                                  #immediate
Register                                   Rm
Logical Shift Left by immediate            Rm, LSL #shift_imm
Logical Shift Left by Register             Rm, LSL Rs
Logical Shift Right by immediate           Rm, LSR #shift_imm
Logical Shift Right by Register            Rm, LSR Rs
Arithmetically Shift Right by immediate    Rm, ASR #shift_imm
Arithmetically Shift Right by Register     Rm, ASR Rs
Rotate Right by immediate                  Rm, ROR #shift_imm
Rotate Right by Register                   Rm, ROR Rs
Rotate Right with extend                   Rm, RRX

3.5.1 Immediate Operand.


Immediate values are signified by a leading # symbol. The operand is actually stored in the instruction as
an 8-bit value with a 4-bit rotation code. The resultant value is the 8bit value rotated right 0-30 bits (twice the
rotation code amount), as illustrated below. Only values that can be represented in this form can be encoded as
immediate operands.
The assembler will make substitutions of comparable instructions if it makes it possible to create the
desired immediate operand. For example, CMP R0, #-1 is not a legal instruction since it is not possible to specify
-1 (0xFFFFFFFF) as an immediate value, but it can be replaced by CMN R0, #1. If the rotate value is non-zero,
the C flag is set to bit 31 of the immediate value, otherwise it is unchanged.
Syntax: #<immediate>
Example 3.1: ADD R0, R1, #256 ; R0 = R1 + 256
Here operand 2 is an immediate value (256)
Let R0 = 0x00000000
R1 = 0x00000011
Immediate Data = 256 = 0x00000100
Now R0 = R1 + Immediate Data
= 0x00000011 + 0x00000100
= 0x00000111
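
Whether a constant fits the 8-bit value plus 4-bit rotation scheme described in this section can be checked mechanically. The C sketch below tries all sixteen rotations; the function names are hypothetical, and an assembler may well pick a different but equivalent encoding where several exist.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (n taken modulo 32). */
static uint32_t rotate_right(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Value represented by an 8-bit immediate and a 4-bit rotation code. */
uint32_t decode_immediate(uint32_t imm8, uint32_t rot)
{
    assert(imm8 <= 0xFFu && rot <= 0xFu);
    return rotate_right(imm8, 2 * rot);   /* rotate by twice the code */
}

/* Returns 1 and fills *imm8 and *rot if `value` can be encoded,
   otherwise returns 0 and leaves the outputs untouched. */
int encode_immediate(uint32_t value, uint32_t *imm8, uint32_t *rot)
{
    for (uint32_t r = 0; r < 16; r++) {
        /* rotating right by (32 - 2r) undoes a rotate right by 2r */
        uint32_t candidate = rotate_right(value, (32 - 2 * r) & 31u);
        if (candidate <= 0xFFu) {
            *imm8 = candidate;
            *rot = r;
            return 1;
        }
    }
    return 0;
}
```

With this model 256 is encodable (an 8-bit value of 1 with a rotation code of 12), whereas 0xFFFFFFFF and 0x101 are not, which is why CMP R0, #-1 has to be replaced by CMN R0, #1.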
3.5.2 Register Operand.
The register value is used directly. The C flag is unchanged. Note that this is actually a form of the
Register Operand, Logical Shift Left by Immediate option (see below) with a 0-bit shift.
Syntax: <Rm>
Example 3.2: ADD R0, R1, R2 ; R0 = R1 + R2
Here operand 2 is a register R2
Let R0 = 0x0000
R1 = 0x0011
R2 = 0x0022
Now R0 = R1 + R2
= 0x0011 + 0x0022
= 0x0033
3.5.3 Logical Shift Left by Immediate Operand.
The register value is shifted left by an immediate value in the range 0-31. Note that a shift of zero is
identical to the encoding for a register operand with no shift. The C flag will be updated with the last value shifted
out of Rm unless the shift count is 0.
Syntax: <Rm>, LSL #<immediate>
Example 3.3: ADD R0, R2, R3, LSL#1 ; R0 = R2 + (R3 << 1)

Here operand 2 is a shifted register (R3, LSL#1)

Let R0=0x00000000
R2=0x00000011
R3=0x00000800

First the R3 value is logically shifted left (LSL) by one bit

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0

After LSL, shifted R3 = 0x00001000

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 1 0 0 0

Now R0 = R2 + (R3 << 1)

= 0x00000011+0x00001000
= 0x00001011

3.5.4 Logical Shift Left by Register Operand.


The register value is shifted left by a value contained in a register. The C flag will be updated with the
last value shifted out of Rm unless the value in Rs is 0.
Syntax: <Rm>, LSL <Rs>
Example 3.4: ADD R0, R2, R3, LSL R4 ; R0 = R2 + (R3 << (R4))

Here operand 2 is a shifted register (R3, LSL R4)

Let R0=0x00000000
R2=0x00000011
R3=0x00000800
R4=0x00000001

First the R3 value is logically shifted left (LSL) by one bit, the content of R4 (0x01)

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0

After LSL, shifted R3 = 0x00001000

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 1 0 0 0

Now R0 = R2 + (R3 << R4)

= 0x00000011+0x00001000
= 0x00001011

3.5.5 Logical Shift Right by Immediate Operand.


The register value is shifted right by an immediate value in the range 1-32. The C flag will be updated
with the last value shifted out of Rm.

Syntax: <Rm>, LSR #<immediate>


Example 3.5: ADD R0, R2, R3, LSR#1 ; R0 = R2 + (R3 >> 1)

Here operand 2 is a shifted register (R3, LSR#1)

Let R0=0x00000000
R2=0x00000011
R3=0x00000800

First the R3 value is logically shifted right (LSR) by one bit

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0

After LSR, shifted R3 = 0x00000400

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 4 0 0

Now R0 = R2 + (R3 >> 1)

= 0x00000011+0x00000400
= 0x00000411
3.5.6 Logical Shift Right by Register Operand.
The register value is shifted right by a value contained in a register. The C flag will be updated with the
last value shifted out of Rm unless the value in Rs is 0.

Syntax: <Rm>, LSR <Rs>


Example 3.6: ADD R0, R2, R3, LSR R4 ; R0 = R2 + (R3 >> R4)

Here operand 2 is a shifted register (R3, LSR R4)

Let R0=0x00000000
R2=0x00000011
R3=0x00000800
R4=0x00000001
First the R3 value is logically shifted right (LSR) by one bit, the content of R4 (0x01)

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0   Carry = 0

After LSR, shifted R3 = 0x00000400

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 4 0 0

Now R0 = R2 + (R3 >> R4)

= 0x00000011+0x00000400
= 0x00000411
3.5.7 Arithmetic Shift Right by Immediate Operand.
The register value is arithmetically shifted right by an immediate value in the range 1-32. The
arithmetic shift fills from the left with the sign bit, preserving the sign of the number. The C flag will be updated
with the last value shifted out of Rm.
Syntax: <Rm>, ASR #<immediate>
Example 3.7: ADD R0, R2, R3, ASR #1 ; R0 = R2 + (R3 >> 1)

Here operand 2 is a shifted register (R3, ASR #1)

Let R0=0x00000000
R2=0x00000011
R3=0x80000800
First the R3 value is arithmetically shifted right (ASR) by one bit

1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0   Carry = 0

After ASR, shifted R3 = 0xC0000400 (the sign bit is copied into bit 30)

1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
C 0 0 0 0 4 0 0

Now R0 = R2 + (R3 >> 1)

= 0x00000011+0xC0000400
= 0xC0000411

3.5.8 Arithmetic Shift Right by Register Operand.


The register value is arithmetically shifted right by a value contained in a register. The arithmetic shift
fills from the left with the sign bit, preserving the sign of the number. The C flag will be updated with the last
value shifted out of Rm unless the value in Rs is 0.
Syntax: <Rm>, ASR <Rs>
Example 3.8: ADD R0, R2, R3, ASR R4 ; R0 = R2 + (R3 >> R4)

Here operand 2 is an arithmetically shifted register (R3, ASR R4)

Let R0=0x00000000
R2=0x00000011
R3=0x80000800
R4=0x00000001

First the R3 value is arithmetically shifted right (ASR) by one bit, the content of R4 (0x01)

1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0   Carry = 0

After ASR, shifted R3 = 0xC0000400 (the sign bit is copied into bit 30)

1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
C 0 0 0 0 4 0 0

Now R0 = R2 + (R3 >> R4)

= 0x00000011+0xC0000400
= 0xC0000411

3.5.9 Rotate Right by Immediate Operand.


The register value is rotated right by an immediate value in the range 1-31. [A rotate value of 0 in this
instruction encoding will cause an RRX operation to be performed.] The C flag will be updated with the last value
shifted out of Rm.

Syntax: <Rm>, ROR #<immediate>


Example 3.9: ADD R0, R2, R3, ROR#1 ; R0 = R2 + (R3 ROR 1)

Here operand 2 is a rotated register (R3, ROR#1)

Let R0=0x00000000
R2=0x00000011
R3=0x00000800

First the R3 value is rotated right (ROR) by one bit

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0   Carry = 0

After ROR, rotated R3 = 0x00000400 (bit 0 was 0, so a 0 is rotated into bit 31)

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 4 0 0

Now R0 = R2 + (R3 ROR 1)


= 0x00000011+0x00000400
= 0x00000411
3.5.10 Rotate Right by Register Operand.
The register value is rotated right by a value contained in a register. The C flag will be updated with the
last value shifted out of Rm unless the value in Rs is 0.

Syntax: <Rm>, ROR <Rs>


Example 3.10: ADD R0, R2, R3, ROR R4 ; R0 = R2 + (R3 ROR R4)

Here operand 2 is a rotated register (R3, ROR R4)

Let R0=0x00000000
R2=0x00000011
R3=0x00000800
R4=0x00000001
First the R3 value is rotated right (ROR) by one bit, the content of R4 (0x01)

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0   Carry = 0

After ROR, rotated R3 = 0x00000400 (bit 0 was 0, so a 0 is rotated into bit 31)

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 4 0 0

Now R0 = R2 + (R3 ROR R4)

= 0x00000011+0x00000400
= 0x00000411
3.5.11 Rotate Right with Extend Operand.
The register value is rotated right by one bit through the C flag, i.e. C <- Rm[0], Rm[31] <- C,
Rm[30] <- Rm[31], etc.

Syntax: <Rm>, RRX


Example 3.11: ADD R0, R2, R3, RRX ; R0 = R2 + (R3 RRX)

Here operand 2 is a rotated register (R3, RRX)

Let R0=0x00000000
R2=0x00000011
R3=0x00000800
C flag = 0

First the R3 value is rotated right through the carry (RRX) by one bit

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0   Carry in = 0

After RRX, rotated R3 = 0x00000400 (the old C flag, 0, enters bit 31)

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 4 0 0

Now R0 = R2 + (R3 RRX)


= 0x00000011+0x00000400
= 0x00000411
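
Unlike the other shifter operands, the RRX result depends on the C flag that is set before the instruction runs. The result 0x00000411 above corresponds to the C flag being clear. The short C sketch below (function name invented for illustration) shows both cases for ADD R0, R2, R3, RRX.

```c
#include <assert.h>
#include <stdint.h>

/* Models ADD Rd, Rn, Rm, RRX: carry_in is the C flag before the instruction. */
uint32_t add_rrx(uint32_t rn, uint32_t rm, int carry_in)
{
    assert(carry_in == 0 || carry_in == 1);
    /* the old carry enters bit 31, every other bit moves right by one */
    uint32_t operand2 = ((uint32_t)carry_in << 31) | (rm >> 1);
    return rn + operand2;
}
```

With R2 = 0x00000011 and R3 = 0x00000800, a clear carry gives 0x00000411 and a set carry gives 0x80000411.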

3.6.1 MOVE instructions


MOV – Move
This instruction loads a 32 bit value into the destination register, from another register, a shifted register,
or an immediate value
Syntax: MOV{<cond>}{S} <Rd>, <shifter_operand>

dest = op_1
if(cond)
    Rd <- shifter_operand
    if(S==1 and Rd==R15)
        CPSR <- SPSR
Flags updated if S used and Rd is not R15 (PC): N, Z, C

You can specify the same register for the effect of a NOP instruction, or you can shift the same register if you
choose:
MOV R0, R0 ; R0 = R0... NOP instruction

MOV R0, R0, LSL#3 ; R0 = R0 * 8

If R15 is the destination, the program counter or flags can be modified. This is used to return to calling code, by
moving the contents of the link register into R15:
MOV PC, R14 ; Exit to caller

MOVS PC, R14 ; Return from an exception, restoring the CPSR from the SPSR
MOV performs a move to a register from another register or an immediate value.
MOV R1, R0 ; R1 <- R0
MOV R1, R0, LSL #2 ; R1 <- R0 * 4
MOV R1, #1 ; R1 <- 0x00000001

If the S bit is set and the destination is R15 (the PC), the SPSR is also copied to the CPSR. This form of the
instruction is used to return from an exception mode.

Examples 3.12:
MOV R1, R0 ; R1 <- R0
This instruction will move (copy) the content of operand two, R0, into operand one, R1, without changing the
content of operand two.
Let
R1 = 0x00000000
R0 = 0x00000011
After the Execution of this instruction
R1 = 0x00000011
R0 = 0x00000011
Example 3.13:
MOV R1, R0, LSL #2 ; R1 <- R0 << 2
This instruction copies the value of R0, logically shifted left by two bits, into the destination register R1
without changing the content of R0.
Let
R1 = 0x00000000
R0 = 0x00000018

First the R0 value is logically shifted left (LSL) by two bits

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0

After LSL #2, the shifted value is 0x00000060

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0
0 0 0 0 0 0 6 0
After the execution of this instruction
R1 = 0x00000060
R0 = 0x00000018

Example 3.14:
MOV R1, #1 ; R1 <- 0x00000001
This instruction moves the immediate value 0x00000001 into the destination register R1.
Let
R1 = 0x00000000
After the Execution of this instruction
R1 = 0x00000001

MVN: Move NOT

Syntax: MVN{<cond>}{S} <Rd>, <shifter_operand>

if(cond)
    Rd <- NOT shifter_operand
    if(S==1 and Rd==R15)
        CPSR <- SPSR
Flags updated if S used and Rd is not R15 (PC): N, Z, C
This instruction loads a value into the destination register from another register, a shifted register, or an
immediate value. The difference from MOV is that the bits are inverted before the move, so you can load a negative
value into a register. Because negative numbers are held in two's complement form, NOT n equals -(n + 1), so to
load -n you specify one less than the required number:
MVN R1, R0 ; R1 <- NOT R0
MVN R1, R0, LSL #2 ; R1 <- NOT (R0 * 4)
MVN R0, #4 ; R0 = -5
Example 3.15:
MVN R1, R0 ; R1 <- NOT R0
This instruction moves the complemented value of operand two (R0) into the destination register R1 without
changing the content of operand two.
Let
R1 = 0x00000000
R0 = 0x00000008
After the Execution of this instruction
R1 = 0xFFFFFFF7 (the two's complement representation of -9)
R0 = 0x00000008
Example 3.16:
MVN R1, R0, LSL #2 ; R1 <- NOT (R0 << 2)
This instruction copies the complemented value of R0, logically shifted left by two bits, into the destination
register R1 without changing the content of R0.
Let
R1 = 0x00000000
R0 = 0x00000018

First the R0 value is logically shifted left (LSL) by two bits

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0

After LSL #2, the shifted value is 0x00000060

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0

0 0 0 0 0 0 6 0

Now this shifted value is complemented bitwise

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1
F F F F F F 9 F

After the execution of this instruction

R1 = 0xFFFFFF9F (the two's complement representation of -0x61)
R0 = 0x00000018
Example 3.17:
MVN R1, #1 ; R1 <- NOT (0x00000001)
This instruction moves the complemented value of the immediate data 0x00000001 into the destination register R1.
Let
R1 = 0x00000000
After the execution of this instruction
R1 = 0xFFFFFFFE (the two's complement representation of -2)
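
The MOV and MVN examples can be reproduced with a few lines of Python. This is only a sketch of the data path, assuming 32-bit registers; the function names lsl, mov, mvn and to_signed are hypothetical helpers and flag updates are not modelled.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def lsl(value, amount):
    """Logical shift left, discarding bits shifted out of the 32-bit word."""
    return (value << amount) & MASK32


def mov(operand2):
    """MOV Rd, <shifter_operand>: copy the operand."""
    return operand2 & MASK32


def mvn(operand2):
    """MVN Rd, <shifter_operand>: copy the bitwise complement of the operand."""
    return ~operand2 & MASK32


def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement number."""
    return value - (1 << 32) if value & 0x80000000 else value


print(hex(mov(lsl(0x18, 2))))   # MOV R1, R0, LSL #2 with R0 = 0x18
print(hex(mvn(lsl(0x18, 2))))   # MVN R1, R0, LSL #2 with R0 = 0x18
print(to_signed(mvn(4)))        # MVN R0, #4
```

The to_signed helper shows why MVN with the value n - 1 yields -n: inverting all bits of a two's complement number is the same as negating it and subtracting one.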

MRS and MSR instructions.


As noted in the ARM7 architecture section, the CPSR and the SPSR are CPU registers, but are not part of
the main register bank. Only two ARM instructions can operate on these registers directly. The MSR and MRS
instructions support moving the contents of the CPSR or SPSR to and from a selected register. For example, in
order to disable the IRQ interrupts the contents of the CPSR must be moved to a register, the “I” bit must be set by
ORing the contents with 0x00000080 to disable the interrupt, and then the CPSR must be reprogrammed with
the new value.

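[Figure: MRS copies the CPSR or SPSR into a general-purpose register (R0 to R15); MSR copies a general-purpose
register into the CPSR or SPSR.]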
The CPSR and SPSR are not memory-mapped and are not part of the central register file. The only instructions
which operate on them directly are the MSR and MRS instructions. In USER mode their effect is restricted: the
CPSR can still be read and MSR can change the condition flags, but the control bits (interrupt masks, T bit and
mode bits) are write-protected and there is no SPSR. So it is only possible to change the operating mode of the
processor, or to enable or disable interrupts, from a privileged mode. Once you have entered the USER mode you
cannot leave it, except through an exception, reset, FIQ, IRQ or SWI instruction.

MRS – Move PSR into General-Purpose Register


Syntax:
MRS{<cond>} <Rd>, CPSR
MRS{<cond>} <Rd>, SPSR

if(cond)
    Rd <- CPSR/SPSR
Flags updated: None
Usage and Examples:
Moves the value of the CPSR or the current SPSR into a general-purpose register.

MRS R0, CPSR

MSR – Move to Status Register from General-Purpose Register


Syntax:
MSR{<cond>} CPSR_<fields>, #<immediate>
MSR{<cond>} CPSR_<fields>, <Rm>
MSR{<cond>} SPSR_<fields>, #<immediate>
MSR{<cond>} SPSR_<fields>, <Rm>

if(cond)
    CPSR/SPSR <- immediate/register value
Flags updated: N/A

Usage and Examples:

Moves the value of a register or immediate operand into the CPSR or the current SPSR. This instruction
is typically used by privileged-mode code. The <fields> suffix indicates which fields of the CPSR/SPSR are allowed
to be changed by the write. This limits any changes to just the fields intended by the programmer. The
allowed fields are:
c sets the control field mask bit (bit 16)
x sets the extension field mask bit (bit 17)
s sets the status field mask bit (bit 18)
f sets the flags field mask bit (bit 19)
One or more fields may be specified.
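
The read-modify-write sequence for disabling IRQ interrupts, and the effect of the field suffixes, can be sketched as a Python model. The names msr, disable_irq, I_BIT and FIELD_MASKS are hypothetical. The model assumes the usual ARM layout in which the c, x, s and f fields cover PSR bits 7:0, 15:8, 23:16 and 31:24 respectively, and it does not model the User mode restrictions.

```python
MASK32 = 0xFFFFFFFF
I_BIT = 0x00000080  # IRQ disable bit of the CPSR (bit 7)

# Byte of the PSR selected by each field letter (assumed standard ARM layout)
FIELD_MASKS = {"c": 0x000000FF, "x": 0x0000FF00, "s": 0x00FF0000, "f": 0xFF000000}


def msr(psr, value, fields):
    """Model of MSR xPSR_<fields>, Rm: only the named fields are written."""
    mask = 0
    for field in fields:
        mask |= FIELD_MASKS[field]
    return (psr & ~mask & MASK32) | (value & mask)


def disable_irq(cpsr):
    """MRS R0, CPSR / ORR R0, R0, #0x80 / MSR CPSR_c, R0."""
    temp = cpsr                   # MRS R0, CPSR
    temp |= I_BIT                 # ORR R0, R0, #0x80
    return msr(cpsr, temp, "c")   # MSR CPSR_c, R0


print(hex(disable_irq(0x00000013)))  # Supervisor mode, IRQ enabled on entry
```

Writing only the c field leaves the condition flags untouched, which is why the field suffix is preferred over rewriting the whole register.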

3.2 Arithmetic instructions


Mnemonic Meaning
ADD Add
ADC Add with carry
SUB Subtract
RSB Reverse Subtract
SBC Subtract with carry
RSC Reverse Subtract with carry

3.2.1 ADD - Addition


Syntax:
ADD{<cond>}{S} <Rd>, <Rn>, <shifter_operand>

if (cond)
    Rd <- Rn + shifter_operand
Flags updated if S used: N, Z, V, C
This instruction will add the two operands, placing the result in the destination register. Operand 1 is a register,
operand 2 can be a register, shifted register, or an immediate value:

ADD R0, R1, R2 ; R0 = R1 + R2


ADD R0, R1, #256 ; R0 = R1 + 256
ADD R0, R2, R3, LSL#1 ; R0 = R2 + (R3 << 1)
The addition may be performed on signed or unsigned numbers.
Example 3.18:
ADD R0, R1, R2 ; R0 = R1 + R2
Here operand 2 is a register R2
Let R0 = 0X0000
R1 = 0x0011
R2 = 0x0022
Now R0 = R1 + R2
= 0x0011 + 0x0022
= 0x0033
Example 3.19:
ADD R0, R1, #256 ; R0 = R1 + 256

Here operand 2 is an immediate value (256 = 0x0100)

Let R0 = 0x0000
R1 = 0x0011
Immediate Data = 256 = 0x0100
Now R0 = R1 + Immediate Data
= 0x0011 + 0x0100
= 0x0111
Examples 3.20:
ADD R0, R2, R3, LSL#1 ; R0 = R2 + (R3 << 1)

Here operand 2 is a shifted register (R3,LSL#1)

Let R0=0x00000000
R2=0x00000011
R3=0x00000800

First the R3 value is logically shifted left (LSL) by one bit

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0

After LSL R3 = 0x00001000

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0

Now R0 = R2 + (R3 << 1)

= 0x00000011+0x00001000
= 0x00001011

3.2.2 ADC: Addition with Carry

Syntax:
ADC{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
if(cond)
    Rd <- Rn + shifter_operand + C
Flags updated if S used: N, Z, V, C
This instruction adds the two operands and the carry flag, placing the result in the destination register. Because
it includes the carry left by a previous addition, it can be used to add numbers larger than 32 bits. Operand 1 is a
register; operand 2 can be a register, shifted register, or an immediate value.
Example 3.21:
If 64-bit numbers are stored in R1:R0 and R3:R2, their sum can be stored in R5:R4 as shown below.

ADDS R4, R2, R0 ; add least significant words and set the carry flag

ADC R5, R3, R1 ; add most significant words plus carry

Let R5 = 0x00000000    destination, most significant word
R4 = 0x00000000    destination, least significant word
R3 = 0x00000800    64-bit data 1, most significant word
R2 = 0x00000310    64-bit data 1, least significant word
R1 = 0x00000300    64-bit data 2, most significant word
R0 = 0x00000250    64-bit data 2, least significant word
After ADDS R4, R2, R0
R4 = R2 + R0 = 0x00000310 + 0x00000250 = 0x00000560 with Carry = 0
After ADC R5, R3, R1
R5 = R3 + R1 + Carry = 0x00000800 + 0x00000300 + 0 = 0x00000B00
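
The ADDS/ADC pair can be modelled in Python to see how the carry links the two 32-bit additions. This is a sketch under the assumption of unsigned 32-bit register values; adds, adc and add64 are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF


def adds(rn, operand2):
    """ADDS: returns (32-bit result, carry flag)."""
    total = rn + operand2
    return total & MASK32, total >> 32


def adc(rn, operand2, carry):
    """ADC: adds the carry flag left by an earlier instruction."""
    return (rn + operand2 + carry) & MASK32


def add64(a_hi, a_lo, b_hi, b_lo):
    """64-bit addition built from ADDS (low words) and ADC (high words)."""
    lo, carry = adds(a_lo, b_lo)
    hi = adc(a_hi, b_hi, carry)
    return hi, lo


# Register values of Example 3.21: R3:R2 + R1:R0
hi, lo = add64(0x00000800, 0x00000310, 0x00000300, 0x00000250)
print(hex(hi), hex(lo))
```

Calling add64(0, 0xFFFFFFFF, 0, 1) exercises the case the worked example does not reach: the low words overflow, the carry is set, and the high word becomes 1.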

3.2.3 SUB: Subtraction

Syntax:
SUB{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
if(cond)
    Rd <- Rn - shifter_operand
Flags updated if S used: N, Z, V, C
This instruction will subtract operand two from operand one, placing the result in the destination register.
Operand 1 is a register, operand 2 can be a register, shifted register, or an immediate value:

The subtraction may be performed on signed or unsigned numbers.


SUB R0, R1, R2 ; R0 = R1 - R2
SUB R0, R1, #0x00008000 ; R0 = R1 - 0x00008000
SUB R0, R2, R3, LSL#1 ; R0 = R2 - (R3 << 1)

Examples 3.22:

SUB R0, R1, R2 ; R0 = R1 - R2


Here operand 2 is a register R2
Let R0 = 0x00000000
R1 = 0x00000033
R2 = 0x00000011
Now R0 = R1 - R2
= 0x0000 0033 - 0x0000 0011
= 0x0000 0022
Examples 3.23:
SUB R0, R1, #0x00008000;
Here operand 2 is an immediate data 0x8000
Let R0 = 0x00000000
R1 = 0x0000FFFF
Immediate Data = 0x00008000
Now R0 = R1- 0x00008000
= 0x0000FFFF - 0x00008000
= 0x00007FFF
Examples 3.24:
SUB R0, R2, R3, LSL#1 ; R0 = R2 - (R3 << 1)

Here operand 2 is a shifted register (R3, LSL#1)

Let R0 = 0x00000000
R2 = 0x000015F1
R3 = 0x000008E0

First the R3 value is logically shifted left (LSL) by one bit

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 0 0 0 0 0

After LSL R3 = 0x000011C0


0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 0 0 0 0 0 0
0 0 0 0 1 1 C 0

Now R0 = R2 - (R3 << 1)


= 0x000015F1 - 0x000011C0
= 0x00000431

3.2.4 SBC: Subtraction with carry

Syntax:
SBC{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
if(cond)
    Rd <- Rn - shifter_operand - NOT C
Flags updated if S used: N, Z, V, C
This instruction subtracts operand two from operand one, placing the result in the destination register. It uses
the carry flag to represent 'borrow', so it can subtract numbers larger than 32 bits. On the ARM the carry flag
works as an inverted borrow: SUBS and SBCS clear C when a borrow is required and set it when no borrow occurs.
SBC therefore subtracts NOT C, inverting the flag automatically during the instruction. The subtraction may be
performed on signed or unsigned numbers.
SBC R0, R1, R2 ; R0 = R1 - R2- NOT C
SBC R0, R1, #0x00008000 ; R0 = R1 - 0x00008000 - NOT C
SBC R0, R2, R3, LSL#1 ; R0 = R2 - (R3 << 1) – NOT C

Examples 3.25:

SBC R0, R1, R2 ; R0 = R1 - R2- NOT C


Here operand 2 is a register R2
Let R0 = 0x00000000
R1 = 0x00000033
R2 = 0x00000011
C =0
Now R0 = R1 - R2 – NOT C
= 0x0000 0033 - 0x0000 0011-1
= 0x0000 0021
Examples 3.26:
SBC R0, R1, #0x00008000 ; R0 = R1 - 0x00008000 - NOT C
Here operand 2 is the immediate data 0x00008000
Let R0 = 0x00000000
R1 = 0x0000FFFF
C = 1
Immediate Data = 0x00008000
Now R0 = R1 - 0x00008000 - NOT C
= 0x0000FFFF - 0x00008000 - 0
= 0x00007FFF
Examples 3.27:
SBC R0, R2, R3, LSL#1 ; R0 = R2 - (R3 << 1) – NOT C

Here operand 2 is a shifted register (R3, LSL#1)

Let R0 = 0X00000000
R2 = 0x000015F1
R3 = 0x000008E0
C = 0
First the R3 value is logically shifted left (LSL) by one bit

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 0 0 0 0 0
After LSL R3 = 0x000011C0

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 0 0 0 0 0 0
0 0 0 0 1 1 C 0

Now R0 = R2 - (R3 << 1)- NOT C


= 0x000015F1 - 0x000011C0-1
= 0x00000430
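
The inverted-borrow convention is easiest to see in a model. The Python sketch below assumes unsigned 32-bit register values; subs, sbc and sub64 are hypothetical helper names, and only the carry flag is modelled.

```python
MASK32 = 0xFFFFFFFF


def subs(rn, operand2):
    """SUBS: returns (32-bit result, carry flag). C = 1 means no borrow."""
    result = (rn - operand2) & MASK32
    carry = 1 if rn >= operand2 else 0
    return result, carry


def sbc(rn, operand2, carry):
    """SBC: Rd = Rn - operand2 - NOT C."""
    return (rn - operand2 - (1 - carry)) & MASK32


def sub64(a_hi, a_lo, b_hi, b_lo):
    """64-bit subtraction built from SUBS (low words) and SBC (high words)."""
    lo, carry = subs(a_lo, b_lo)
    hi = sbc(a_hi, b_hi, carry)
    return hi, lo


print(hex(sbc(0x00000033, 0x00000011, 0)))   # C = 0: one extra is subtracted
print(hex(sbc(0x0000FFFF, 0x00008000, 1)))   # C = 1: plain subtraction
```

In sub64 the low-word SUBS leaves C = 0 exactly when it had to borrow, and SBC then takes that borrow out of the high word.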

3.2.5 RSB: Reverse Subtract

Syntax:
RSB{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
if(cond)
    Rd <- shifter_operand - Rn
Flags updated if S used: N, Z, V, C
This instruction subtracts operand one from operand two, placing the result in the destination register. (In a
normal subtraction, operand two is subtracted from operand one.) Operand 1 is a register; operand 2 can be a
register, shifted register, or an immediate value. This instruction is commonly used to negate a number, that is,
to form its two's complement by subtracting it from zero.

RSB R0, R1, R2 ; R0 = R2 - R1


RSB R0, R1, #256 ; R0 = 256-R1
RSB R0, R2, R3,LSL#1 ; R0 = (R3 << 1) - R2
The subtraction may be performed on signed or unsigned numbers.
Examples 3.28:

RSB R0, R1, R2 ; R0 = R2 – R1


Here operand 2 is a register R2
Let R0 = 0x00000000
R1 = 0x00000033
R2 = 0x00000000
Now R0 = R2 – R1
= 0x00000000 - 0x00000033
= 0xFFFFFFCD = -R1 = -0x33

3.2.6 RSC: Reverse Subtract with Carry

Syntax:
RSC{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
if(cond)
    Rd <- shifter_operand - Rn - NOT C
Flags updated if S used: N, Z, V, C
This instruction subtracts operand one and the complemented value of the carry flag from operand two, placing
the result in the destination register. (In a normal subtraction, operand two is subtracted from operand one.)
Operand 1 is a register; operand 2 can be a register, shifted register, or an immediate value:

RSC R0, R1, R2 ; R0 = R2 - R1- NOT C


RSC R0, R1, #256 ; R0 = 256-R1 - NOT C
RSC R0, R2, R3,LSL#1 ; R0 = (R3 << 1) - R2 - NOT C
The subtraction may be performed on signed or unsigned numbers.
Examples 3.29:

RSC R0, R1, R2 ; R0 = R2 – R1- NOT C


Here operand 2 is a register R2
Let R0 = 0x00000000
R1 = 0x00000033
R2 = 0x00000000
Carry = 1
Now R0 = R2 – R1 – NOT Carry
= 0x00000000 - 0x00000033- 0
= 0xFFFFFFCD = -R1 = -0x33
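
A typical use of the reverse subtract pair is negating a 64-bit value held in two registers. The following Python sketch models that idiom under the assumption of unsigned 32-bit register values; rsbs, rsc and negate64 are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF


def rsbs(rn, operand2):
    """RSBS: Rd = operand2 - Rn, returns (result, carry). C = 1 means no borrow."""
    result = (operand2 - rn) & MASK32
    carry = 1 if operand2 >= rn else 0
    return result, carry


def rsc(rn, operand2, carry):
    """RSC: Rd = operand2 - Rn - NOT C."""
    return (operand2 - rn - (1 - carry)) & MASK32


def negate64(hi, lo):
    """Negate the 64-bit value hi:lo with RSBS lo, lo, #0 then RSC hi, hi, #0."""
    new_lo, carry = rsbs(lo, 0)
    new_hi = rsc(hi, 0, carry)
    return new_hi, new_lo


print(hex(rsbs(0x00000033, 0)[0]))    # RSB R0, R1, #0 with R1 = 0x33
print(hex(rsc(0x00000033, 0, 1)))     # RSC with C = 1 gives the same result
```

negate64(0, 1) returns both words as 0xFFFFFFFF, the 64-bit two's complement representation of -1.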

3.3 Logical instructions

3.3.1 AND : LOGICAL AND

Syntax:
AND{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
if(cond)
    Rd <- Rn AND shifter_operand
Flags updated if S used: N, Z, C

This instruction will perform a logical AND operation between the two operands, placing the result in the
destination register; this is useful for masking the bits you wish to work on. Operand 1 is a register; operand 2 can
be a register, shifted register, or an immediate value:
AND R0, R1, R2 ; R0 = R1 AND R2
AND R0, R1, #0x00008000 ; R0 = R1 AND 0x8000
AND R0, R2, R3, LSL#1 ; R0 = R2 AND (R3 << 1)

Example: 3.30

AND R0, R1, R2 ; R0 = R1 AND R2


Here operand 2 is a register R2
Let R0 = 0x00000000
R1 = 0x00000011
R2 = 0x00000033
Now R0 = R1 AND R2                   R1 = 0000 0000 0000 0000 0000 0000 0001 0001
= 0x0000 0011 AND 0x0000 0033        R2 = 0000 0000 0000 0000 0000 0000 0011 0011
= 0x0000 0011                        R0 = 0000 0000 0000 0000 0000 0000 0001 0001
Examples 3.31:
AND R0, R1, #0x00008000 ; isolate bit D15 of R1
Here operand 2 is the immediate data 0x8000
Let R0 = 0x00000000
R1 = 0x0000FFFF
Immediate Data = 0x00008000
Now R0 = R1 AND 0x00008000
= 0x0000FFFF AND 0x00008000
= 0x00008000
Examples 3.32:
AND R0, R2, R3, LSL#1 ; R0 = R2 AND (R3 << 1)
Here operand 2 is a shifted register (R3, LSL#1)
Let R0 = 0X00000000
R2 = 0x000000F1
R3 = 0x000008E0

First the R3 value is logically shifted left (LSL) by one bit

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 0 0 0 0 0

After LSL #1, the shifted value of R3 = 0x000011C0

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 0 0 0 0 0 0
0 0 0 0 1 1 C 0

Now R0 = R2 AND (R3 << 1) R2 = 0000 0000 0000 0000 0000 0000 1111 0001
= 0x000000F1 AND 0x000011C0 R3 = 0000 0000 0000 0000 0001 0001 1100 0000
= 0x000000C0 R0 = 0000 0000 0000 0000 0000 0000 1100 0000

3.3.2 ORR: Logical OR

Syntax:
ORR{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
if(cond)
    Rd <- Rn OR shifter_operand
Flags updated if S used: N, Z, C
This instruction performs a logical OR between the two operands, placing the result in the destination register;
this is useful for setting selected bits. Operand 1 is a register; operand 2 can be a register, shifted register,
or an immediate value:
ORR R0, R1, R2 ; R0 = R1 OR R2
ORR R0, R1, #0x8000 ; R0 = R1 OR 0x8000
ORR R0, R2, R3, LSL#1 ; R0 = R2 OR (R3 << 1)

Example 3.33:
ORR R0, R1, R2 ; R0 = R1 OR R2
Here operand 2 is a register R2
Let R0 = 0X0000
R1 = 0x0011
R2 = 0x0033
Now R0 = R1 OR R2          R1 = 0000 0000 0001 0001
= 0x0011 OR 0x0033         R2 = 0000 0000 0011 0011
= 0x0033                   R0 = 0000 0000 0011 0011
Examples 3.34:

ORR R0, R1, #0x8000; sets bit D15 of R1


Here operand 2 is an immediate data 0x8000
Let R0 = 0X0000
R1 = 0x0111
Immediate Data = 0x8000
Now R0 = R1 OR 0x8000 R1 = 0000 0001 0001 0001
= 0x0111 OR 0x8000 Data = 1000 0000 0000 0000
= 0x8111 R0 = 1000 0001 0001 0001
Examples 3.35:

ORR R0, R2, R3, LSL#1 ; R0 = R2 OR (R3 << 1)


Here operand 2 is a shifted register (R3, LSL#1)
Let R0 = 0X0000
R2 = 0x00F1
R3 = 0x08E0

First the R3 value is logically shifted left (LSL) by one bit

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 0 0 0 0 0

After LSL #1, the shifted value of R3 = 0x11C0

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 0 0 0 0 0 0
0 0 0 0 1 1 C 0
Now R0 = R2 OR (R3 << 1) R2 = 0000 0000 1111 0001
= 0x00F1 OR 0x11C0 R3 = 0001 0001 1100 0000
= 0x11F1 R0 = 0001 0001 1111 0001

3.3.3 EOR : Logical Exclusive-OR


Syntax:
EOR{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
if(cond)
    Rd <- Rn X-OR shifter_operand
Flags updated if S used: N, Z, C
This instruction performs a logical Exclusive OR between the two operands, placing the result in the
destination register; this is useful for inverting selected bits. Operand 1 is a register; operand 2 can be a register,
shifted register, or an immediate value:
EOR R0, R1, R2 ; R0 = R1 X-OR R2
EOR R0, R1, #0x8000 ; R0 = R1 X-OR 0x8000
EOR R0, R2, R3, LSL#1 ; R0 = R2 X-OR (R3 << 1)

Example 3.36:

EOR R0, R1, R2 ; R0 = R1 X-OR R2


Here operand 2 is a register R2
Let R0 = 0X00000000
R1 = 0x00000011
R2 = 0x00000033
Now R0 = R1 X-OR R2        R1 = 0000 0000 0001 0001
= 0x0011 X-OR 0x0033       R2 = 0000 0000 0011 0011
= 0x0022                   R0 = 0000 0000 0010 0010
Examples 3.37:

EOR R0, R1, #0x8000; Toggles bit D15 of R1


Here operand 2 is an immediate data 0x8000
Let R0 = 0X00000000
R1 = 0x00000111
Immediate Data = 0x8000
Now R0 = R1 X-OR 0x8000    R1   = 0000 0001 0001 0001
= 0x0111 X-OR 0x8000       Data = 1000 0000 0000 0000
= 0x8111                   R0   = 1000 0001 0001 0001
Examples 3.38:

EOR R0, R2, R3, LSL#1 ; R0 = R2 X-OR (R3 << 1)


Here operand 2 is a shifted register (R3, LSL#1)
Let R0 = 0X0000
R2 = 0x00F1
R3 = 0x08E0

First the R3 value is logically shifted left (LSL) by one bit

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 0 0 0 0 0

After LSL #1, the shifted value of R3 = 0x11C0

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 0 0 0 0 0 0
0 0 0 0 1 1 C 0
Now R0 = R2 X-OR (R3 << 1)     R2 = 0000 0000 1111 0001
= 0x00F1 X-OR 0x11C0           R3 = 0001 0001 1100 0000
= 0x1131                       R0 = 0001 0001 0011 0001

3.3.4 BIC : Bit Clear


Syntax:
BIC{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
if(cond)
    Rd <- Rn AND NOT shifter_operand
Flags updated if S used: N, Z, C
BIC is a way to clear bits within a word, a sort of reverse OR. Operand two is a 32-bit mask. If a bit is set in the
mask, the corresponding bit of the result is cleared. Mask bits that are clear leave the corresponding bits
unchanged. This instruction is particularly useful when clearing status bits and is frequently used to change
interrupt masks in the CPSR. It performs the logical AND operation between operand 1 and the complemented
value of operand 2.
Operand 1 is a register; operand 2 can be a register, shifted register, or an immediate value:
BIC R0, R1, R2 ; R0 = R1 & ~ R2
BIC R0, R1, #256 ; R0 = R1 & ~256
BIC R0, R1, R2, LSL#1 ; R0 = R1 & ~ (R2 << 1)

Example 3.39:
BIC R0, R1, R2 ; R0 = R1 & ~ R2
Here operand 2 is a register R2
Let R0 = 0X00000000
R1 = 0x0000000F
R2 = 0x00000005
Now R0 = R1 & ~ R2 R1= 0000 0000 0000 0000 0000 0000 0000 1111
= 0x0000000F & ~ 0x00000005 ~R2 = 1111 1111 1111 1111 1111 1111 1111 1010
= 0x0000000A R0 = 0000 0000 0000 0000 0000 0000 0000 1010

Examples 3.40:
BIC R0, R1, #0x00000005 ; R0 = R1 & ~0x00000005

Here operand 2 is an immediate value (0x00000005)

Let R0 = 0x00000000
R1 = 0x00000011
Immediate Data = 0x00000005
Now R0 = R1 & ~Immediate Data      R1    = 0000 0000 0000 0000 0000 0000 0001 0001
= 0x00000011 & ~0x00000005         ~Data = 1111 1111 1111 1111 1111 1111 1111 1010
= 0x00000010                       R0    = 0000 0000 0000 0000 0000 0000 0001 0000
Examples 3.41:
BIC R0, R1, R2, LSL#1 ; R0 = R1 & ~ (R2 << 1)

Here operand 2 is a shifted register (R2,LSL#1)

Let R0=0x00000000
R1=0x00000011
R2=0x00000800

First the R2 value is logically shifted left (LSL) by one bit

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0

After LSL R2 = 0x00001000


0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0

Now the complement of the shifted value is 0xFFFFEFFF


Now R0 = R1 & ~(R2 << 1)

= 0x00000011 & 0xFFFFEFFF
= 0x00000011
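
The four logical instructions map directly onto bitwise operators, which makes the worked examples easy to cross-check. The Python sketch below assumes 32-bit registers; and_, orr, eor, bic and lsl are hypothetical helper names and the flag updates are not modelled.

```python
MASK32 = 0xFFFFFFFF


def lsl(value, amount):
    """Logical shift left within a 32-bit word."""
    return (value << amount) & MASK32


def and_(rn, operand2):
    """AND: keep only the bits selected by the mask."""
    return rn & operand2 & MASK32


def orr(rn, operand2):
    """ORR: set the bits selected by the mask."""
    return (rn | operand2) & MASK32


def eor(rn, operand2):
    """EOR: toggle the bits selected by the mask."""
    return (rn ^ operand2) & MASK32


def bic(rn, operand2):
    """BIC: clear the bits selected by the mask."""
    return rn & ~operand2 & MASK32


shifted = lsl(0x000008E0, 1)          # R3, LSL #1
print(hex(and_(0x000000F1, shifted)))
print(hex(orr(0x000000F1, shifted)))
print(hex(eor(0x000000F1, shifted)))
print(hex(bic(0x0000000F, 0x00000005)))
```

Each function takes the already-shifted operand, mirroring the way the barrel shifter prepares operand 2 before the ALU sees it.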

3.4 Comparison instructions


The comparison instructions are used to compare or test a register with a 32-bit value. They update the
CPSR flag bits according to the result, but do not affect other registers. After the bits have been set, the information
can then be used to change program flow by using conditional execution. You do not need to apply the
S suffix for comparison instructions to update the flags.
3.4.1 CMP - Compare
Syntax:
CMP{<cond>} <Rn>, <shifter_operand>
if(cond)
    Rn - shifter_operand
Flags updated: N, Z, V, C
CMP compares the contents of a register with another register, a shifted register, or an immediate value. It
performs a subtraction but does not store the result anywhere; instead, the status flags are updated so that
conditional execution can take place. The flags refer to operand one compared against operand two. Thus, an
instruction with the GT suffix will be executed if operand one is greater than operand two.

You do not need to specify the S suffix explicitly, as the whole point of the instruction is to update the status
flags. If you do specify it, it will be ignored. Operand 1 is a register; operand 2 can be a register, shifted register,
or an immediate value:

CMP R0, #1 ;Z=1 if R0=1, N=0 if R0>1


CMP R0, R1 ;Z=1 if R0=R1, N=0 if R0>R1
CMP R0, R1,LSL#2 ;Z=1 if R0= (R1,LSL#2), N=0 if R0> (R1,LSL#2)
3.4.2 CMN – Compare Negated
Syntax:
CMN{<cond>} <Rn>, <shifter_operand>
if(cond)
    Rn + shifter_operand
Flags updated: N, Z, V, C
CMN is the same as CMP, except that it compares against the arithmetic negative of operand two. This allows
comparisons with small negative values that would otherwise be hard to encode as an immediate, such as -1 to
mark the end of a list. The CMN instruction performs an addition of the operands (equivalent to a subtraction of
the negative), but does not store the result. The flags are always updated. To compare with -1 we would use
CMN R0, #1.
Examples 3.41:
CMN R0, #1 ;Z=1 if R0 = -1
CMN R0, R1 ;Z=1 if R0 = -R1
CMN R0, R1,LSL#2 ;Z=1 if R0= - (R1,LSL#2)

3.4.3 TST - Test


Syntax:
TST{<cond>} <Rn>, <shifter_operand>
if(cond)
    Rn AND shifter_operand
Flags updated: N, Z, C
This instruction performs a non-destructive AND: it computes the logical AND of operand 1 and operand 2
without storing the result, and only the flags are updated. The most common use for this instruction is to
determine the value of an individual bit of a register. Operand 1 is a register; operand 2 can be a register, shifted
register, or an immediate value. Operand one is the data word to test and operand two is a bit mask. After testing,
the Zero flag is set if none of the masked bits is set in operand one (the AND result is zero), and cleared if at least
one of them is set.
Examples 3.42:

TST R0, R1 ; Z = 1 if (R0 AND R1) = 0, otherwise Z = 0


TST R0, #0x8000 ; Z = 1 if bit 15 of R0 = 0, otherwise Z = 0

3.4.4 TEQ – Test Equivalence


Syntax:
TEQ{<cond>} <Rn>, <shifter_operand>
if(cond)
    Rn XOR shifter_operand
Flags updated: N, Z, C
This instruction performs a non-destructive bitwise XOR (the result is not stored). The flags are always
updated. The most common use for this instruction is to determine if two operands are equal without affecting the
V flag. It can also be used to tell if two values have the same sign, since the N flag will be the logical XOR of the
two sign bits. Operand 1 is a register; operand 2 can be a register, shifted register, or an immediate value:

TEQ R0, #0x8000 ;sets Z = 1 if R0 contains the value 0x00008000


TEQ R0, R1 ;sets N = 1 if signs are different
Examples 3.43:
TEQ R0, #0x8000

Let R0 = 0x00000033
Then the R0 value is X-ORed with the immediate data 0x00008000

R0     = 0x00000033 = 0000 0000 0000 0000 0000 0000 0011 0011
Data   = 0x00008000 = 0000 0000 0000 0000 1000 0000 0000 0000
Result = 0x00008033 = 0000 0000 0000 0000 1000 0000 0011 0011

Since the result is not zero, Z = 0; since bit 31 of the result is 0, N = 0
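
The flag behaviour of the four comparison instructions can be summarised in a short Python model. It is a sketch that assumes unsigned 32-bit register values; cmp, cmn, tst and teq are hypothetical function names. For TST and TEQ only N and Z are modelled, because their C flag comes from the barrel shifter.

```python
MASK32 = 0xFFFFFFFF


def _nz(result):
    """N is bit 31 of the result, Z is set when the result is zero."""
    return {"N": result >> 31, "Z": int(result == 0)}


def cmp(rn, operand2):
    """CMP: flags of Rn - operand2."""
    result = (rn - operand2) & MASK32
    flags = _nz(result)
    flags["C"] = int(rn >= operand2)  # C = 1 means no borrow
    flags["V"] = ((rn ^ operand2) & (rn ^ result)) >> 31
    return flags


def cmn(rn, operand2):
    """CMN: flags of Rn + operand2."""
    total = rn + operand2
    result = total & MASK32
    flags = _nz(result)
    flags["C"] = total >> 32
    flags["V"] = (~(rn ^ operand2) & (rn ^ result) & MASK32) >> 31
    return flags


def tst(rn, operand2):
    """TST: N and Z of Rn AND operand2."""
    return _nz(rn & operand2 & MASK32)


def teq(rn, operand2):
    """TEQ: N and Z of Rn XOR operand2."""
    return _nz((rn ^ operand2) & MASK32)


print(cmp(5, 1))
print(cmn(0xFFFFFFFF, 1))      # R0 = -1 compared with -1
print(teq(0x00000033, 0x8000))
```

A conditional instruction such as MOVEQ then simply tests Z, and BGT tests Z, N and V together.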

3.5 Multiply instructions:


The multiply instructions multiply the contents of a pair of registers and, depending upon the instruction,
accumulate the results in with another register. The long multiplies accumulate onto a pair of registers
representing a 64-bit value. The final result is placed in a destination register or a pair of registers. The number of
cycles taken to execute a multiply instruction depends on the processor implementation. For some
implementations the cycle timing also depends on the value in Rs.
In addition to the barrel shifter, the ARM7 has a built-in Multiply Accumulate Unit (MAC). The MAC
supports integer and long integer multiplication. The integer multiplication instructions multiply
two 32-bit registers and place the least significant 32 bits of the product in a third 32-bit register (the result is
modulo 2^32). A multiply-accumulate instruction takes the same product and adds it to a running total. Long
integer multiplication allows two 32-bit quantities to be multiplied together, with the 64-bit result placed in two
registers. Similarly, a long multiply and accumulate is also available.
Mnemonic Meaning Resolution
MUL Multiply 32 bit result
MLA Multiply and accumulate 32 bit result
These two instructions are different from the normal arithmetical instructions in that there are restrictions on
the operands, namely:

1. All operands, and the destination, must be given as simple registers.


2. You cannot use immediate values or shifted registers for operand two.
3. The destination and operand one must be different registers.
4. Lastly, you cannot specify R15 as the destination.

3.5.1 MUL – Multiply


Syntax:
MUL{<cond>}{S} <Rd>, <Rm>, <Rs>
if(cond)
    Rd <- Rm * Rs
Flags updated if S used: N, Z (C is unpredictable)
This instruction performs a 32x32 multiply operation, and stores a 32-bit result. Since only the least
significant 32-bits are stored, the result is the same for signed and unsigned numbers.
Examples 3.44:
MUL R2,R1,R0
Let R0 = 0x00000002
R1 = 0x00000004
R2 = 0x00000000
After the execution of this instruction
R2 = R1 x R0
= (0x00000004) X (0x00000002)
= 0x00000008

MLA : Multiply and Accumulate


MLA behaves the same as MUL, except that the value of operand three is added to the result. This is
useful for running totals.
Syntax:
MLA{<cond>}{S} <Rd>, <Rm>, <Rs>, <Rn>
if(cond)
    Rd <- (Rm * Rs) + Rn
Flags updated if S used: N, Z (C is unpredictable)

This instruction performs a 32x32 multiply operation, then stores the sum of Rn and the 32-bit multiplication
result to Rd. Since only the least significant 32-bits of the multiplication are used, the result is the same for signed
and unsigned numbers.

Examples 3.44:
MLA R3,R2,R1,R0
The instruction above adds the product of R1 and R2 to R0 and stores the result in R3

Let R0 = 0x00000002
R1 = 0x00000004
R2 = 0x00000006
R3 = 0x00000000

After the execution of this instruction


R3 = (R1 x R2) + R0
= (0x00000004 X 0x00000006) + 0x00000002
= 0x0000001A
MAC Unit
Mnemonic   Meaning                        Resolution
MUL        Multiply                       32 bit result
MLA        Multiply accumulate            32 bit result
UMULL      Unsigned multiply              64 bit result
UMLAL      Unsigned multiply accumulate   64 bit result
SMULL      Signed multiply                64 bit result
SMLAL      Signed multiply accumulate     64 bit result
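
The difference between the 32-bit and 64-bit multiplies, and between the signed and unsigned long forms, can be shown with a small Python model. It is a sketch assuming 32-bit register values; mul, mla, umull, smull and to_signed are hypothetical helper names, and the long multiplies return the result as a (high word, low word) pair.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement number."""
    return value - (1 << 32) if value & 0x80000000 else value


def mul(rm, rs):
    """MUL: least significant 32 bits of the product."""
    return (rm * rs) & MASK32


def mla(rm, rs, rn):
    """MLA: least significant 32 bits of the product plus the accumulator."""
    return (rm * rs + rn) & MASK32


def umull(rm, rs):
    """UMULL: unsigned 64-bit product as (high word, low word)."""
    product = rm * rs
    return product >> 32, product & MASK32


def smull(rm, rs):
    """SMULL: signed 64-bit product as (high word, low word)."""
    product = (to_signed(rm) * to_signed(rs)) & MASK64
    return product >> 32, product & MASK32


print(hex(mul(4, 2)))                 # Example for MUL R2, R1, R0
print(hex(mla(6, 4, 2)))              # Example for MLA R3, R2, R1, R0
print(umull(0xFFFFFFFF, 2))           # 4294967295 * 2
print(smull(0xFFFFFFFF, 2))           # -1 * 2
```

The last two calls use the same register contents: the low words agree, which is why MUL needs no signed variant, while the high words differ.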

Branching Instructions:
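The basic branch instruction (as its name implies) allows a jump forwards or backwards of up to 32 MB.
A modified version of the branch instruction, the branch link, allows the same jump but also stores the address of
the instruction that follows the branch (the address of the branch plus four bytes) in the link register.

[Figure: B 0x8000 executed at address 0x0400 sets PC = 0x8000, where execution continues with LDR R2,#10.]

[Figure: BL 0x8000 executed at address 0x0400 sets PC = 0x8000 and R14 = 0x0400 + 4; execution continues at
0x8000 with LDR R2,#10.]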

The branch instruction has several forms. The branch instruction jumps to a destination address.
The branch link instruction jumps to the destination and stores a return address in R14. So the branch link
instruction is used as a call to a function, storing the return address in the link register, and a branch on the
contents of the link register makes the return at the end of the function. By using the
condition codes we can perform conditional branching and conditional calling of functions. The branch
instructions have two other variants called “branch exchange” and “branch link exchange”. These two instructions
perform the same branch operation but can also switch the instruction set from ARM to THUMB and vice versa.

B : Branch

B<suffix> <address>
B is the simplest branch. Upon encountering a B instruction, the ARM processor will jump immediately
to the address given, and resume execution from there.
Note that the actual value stored in the branch instruction is an offset from the current value of R15; rather than an
absolute address.
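
How an assembler might turn a target address into the stored offset can be sketched in Python. The sketch assumes the ARM-state convention that R15 reads as the address of the branch plus 8 and that the instruction holds a signed 24-bit word offset; encode_branch_offset, decode_branch_target and bl_return_address are hypothetical names.

```python
MASK32 = 0xFFFFFFFF


def encode_branch_offset(branch_addr, target_addr):
    """Return the 24-bit offset field stored in a B or BL instruction."""
    offset = target_addr - (branch_addr + 8)  # R15 reads as branch address + 8
    if offset % 4:
        raise ValueError("ARM branch targets must be word-aligned")
    words = offset // 4
    if not -(1 << 23) <= words < (1 << 23):
        raise ValueError("target is outside the +/-32 MB branch range")
    return words & 0x00FFFFFF


def decode_branch_target(branch_addr, field):
    """Recover the target address from the 24-bit offset field."""
    if field & 0x00800000:      # sign-extend the 24-bit field
        field -= 1 << 24
    return (branch_addr + 8 + field * 4) & MASK32


def bl_return_address(branch_addr):
    """Address that BL writes to R14: the instruction after the branch."""
    return branch_addr + 4


field = encode_branch_offset(0x0400, 0x8000)
print(hex(field), hex(decode_branch_target(0x0400, field)))
print(hex(bl_return_address(0x0400)))
```

Because the field is a word offset, 24 bits are enough to cover the ±32 MB range quoted for the branch instruction.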

Example 3.45:

This example shows a forward and a backward branch. Because these loops are address specific, we do not
include the pre- and post-conditions. The forward branch skips three instructions. The backward branch creates an
infinite loop.
B Forward
ADD R2, R3, #4
ADD R0, R6, #2
ADD R3, R7, #4
Forward:
SUB R1, R2, #4
Backward:
ADD R1, R2, #4
SUB R1, R2, #4
ADD R4, R6, R7
B Backward

Branches are used to change execution flow. Most assemblers hide the details of a branch instruction
encoding by using labels. In this example, Forward and Backward are the labels. The branch labels are placed at
the beginning of the line and are used to mark an address that can be used later by the assembler to calculate the
branch offset.

BL : Branch with Link

BL<suffix> <address>
BL is another branch instruction. This time, the link register R14 is loaded with the address of the
instruction that follows the BL just before the branch is taken. You can reload R14 into R15 to return to the
instruction after the branch - a primitive but powerful implementation of a subroutine.
The branch with link, or BL, instruction is similar to the B instruction but overwrites the link register LR with a
return address. It performs a subroutine call. This example shows a simple fragment of code that branches to a
subroutine using the BL instruction. To return from a subroutine, you copy the link register to the PC.

BL ADDITION ; Branch to subroutine named ADDITION


CMP R2,#8 ; Compare R2 with 8
MOVEQ R2,#0 ; If (R2 == 8) then R2 = 0

ADDITION:
ADD R1, R2, #4
SUB R1, R2, #4
ADD R4, R6, R7
MOV PC,LR ; Return from subroutine by moving LR into PC

BX: Branch, and optionally exchange instruction set.

Syntax

BX{cond} Rm

where:
cond is an optional condition code (see Conditional execution).
Rm is an ARM register containing the address to branch to. Bit 0 of Rm is not used as part of the address. If bit
0 of Rm is set, the instruction sets the T flag in the CPSR, and the code at the destination is interpreted as Thumb
code. If bit 0 of Rm is clear, bit 1 must not be set.
The BX instruction is used to branch to a target address stored in a register, based on an optional condition. If bit
0 of the register is set to 1, then the processor will switch to Thumb execution. (Bit 0 is forced to 0 before the
branch address is stored in the PC.) The sample code below shows a call to a Thumb subroutine.

Examples 3.46:
ADR R0, sub ; get subroutine address
ORR R0, R0, #1 ; set bit 0 of R0 to select Thumb state
MOV LR, PC ; load link register with PC (this address + 8)
BX R0 ; branch to Thumb subroutine
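
The role of bit 0 in a BX can be illustrated with a short Python model. This is a sketch only; bx and call_thumb are hypothetical names, and call_thumb assumes that the MOV LR, PC instruction sits at the address passed as mov_addr with the BX directly after it.

```python
MASK32 = 0xFFFFFFFF


def bx(rm):
    """Model of BX Rm: returns (new PC, T flag)."""
    thumb = rm & 1           # bit 0 selects Thumb state
    pc = rm & ~1 & MASK32    # bit 0 is forced to 0 in the PC
    return pc, thumb


def call_thumb(mov_addr, sub_addr):
    """Model of ORR R0, R0, #1 / MOV LR, PC / BX R0 calling a Thumb routine."""
    r0 = sub_addr | 1        # ORR R0, R0, #1
    lr = mov_addr + 8        # MOV LR, PC: PC reads as this address + 8
    pc, thumb = bx(r0)       # BX R0
    return pc, thumb, lr


print(bx(0x00008001))
print(call_thumb(0x00001008, 0x00002000))
```

The link value is the address of the instruction after the BX, so the Thumb routine can come back to ARM code with its own BX LR.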

BLX: Branch with Link, and optionally exchange instruction set.

This instruction has two alternative forms:

• an unconditional branch with link to a program-relative address

• a conditional branch with link to an absolute address held in a register.

Syntax

BLX{cond} Rm

BLX label

where:
cond is an optional condition code (see Conditional execution).
Rm is an ARM register containing the address to branch to. Bit 0 of Rm is not used as part of the address.
If bit 0 of Rm is set, the instruction sets the T flag in the CPSR, and the code at the destination is
interpreted as Thumb code. If bit 0 of Rm is clear, bit 1 must not be set.
label is a program-relative expression. See Register-relative and program-relative expressions for more
information.

Note

BLX label cannot be conditional. BLX label always causes a change to Thumb state.

Usage

The BLX instruction:

• copies the address of the next instruction into R14 (LR, the link register)

• causes a branch to label, or to the address held in Rm

• changes instruction set to Thumb if either:

o bit 0 of Rm is set

o the BLX label form is used.

The machine-level BLX label instruction cannot branch to an address outside ±32 MB of the current instruction.
When necessary, the ARM linker adds code to allow longer branches (see The ARM linker chapter in ADS Linker
and Utilities Guide). The added code is called a veneer.

Examples 3.47:

BLX r2

BLXNE r0

BLX thumbsub

Incorrect example

BLXMI thumbsub ; BLX label cannot be conditional

3.6 Load and Store Instructions (Single Register Transfer)


The next group of instructions are the data transfer instructions. The ARM7 CPU has load-and-store
register instructions that can move signed and unsigned Word, Half Word and Byte quantities to and from a
selected register.
Mnemonic   Meaning
LDR        Load Word
LDRH       Load Half Word
LDRSH      Load Signed Half Word
LDRB       Load Byte
LDRSB      Load Signed Byte
STR        Store Word
STRH       Store Half Word
STRB       Store Byte
There are no signed store instructions: a store writes the low-order bits of the register unchanged, so STRH and
STRB serve for both signed and unsigned values.
Since the register set is fully orthogonal it is possible to load a 32-bit value into the PC, forcing a program
jump anywhere within the processor address space. If the target address is beyond the range of a branch instruction,
a stored constant can be loaded into the PC.

LDR – Load Register


Syntax:
LDR{<cond>} <Rd>, <addressing_mode>
if(cond)
    Rd ← memory[memory_address]
if(writeback)
    Rn ← end_address
Flags updated: None
Usage and Examples:
The LDR instruction reads a word from memory and writes it to the destination register. See the section
Load/Store Register Addressing Modes for a description of the available addressing modes.
Examples 3.48:
LDR R0, [R1] ; R0 = memory[R1]
If the memory address is not word-aligned, the value read is rotated right by 8 times the value of bits
[1:0] of the memory address. If R15 is specified as the destination, the value is loaded from memory and written to
the PC, effecting a branch.

Memory Locations

Address: 00 01 02 03 04 05 06 07
Byte:    0A 0B 06 09 80 00 88 56

Let R0 = 0x00000002
    R1 = 0x00000004

After LDR R0,[R1]

R0 = 0x56880080 (word at addresses 04-07, least significant byte at the lowest address)
R1 = 0x00000004
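
The rotation rule for non-word-aligned addresses can be checked with a small model. The following is a Python sketch (not ARM code) that assumes a little-endian memory holding the bytes of the example above; the names ldr_word and MEMORY are illustrative only.

```python
def ldr_word(memory: bytes, address: int) -> int:
    """Model of a little-endian ARM7 LDR, including the unaligned-address rotation."""
    aligned = address & ~0x3                      # the word is fetched from the aligned address
    word = int.from_bytes(memory[aligned:aligned + 4], "little")
    rotation = 8 * (address & 0x3)                # rotate right by 8 x address bits [1:0]
    return ((word >> rotation) | (word << (32 - rotation))) & 0xFFFFFFFF


# Bytes at addresses 0x00..0x07, as in Example 3.48
MEMORY = bytes([0x0A, 0x0B, 0x06, 0x09, 0x80, 0x00, 0x88, 0x56])

print(hex(ldr_word(MEMORY, 0x04)))   # aligned load
print(hex(ldr_word(MEMORY, 0x05)))   # same word, rotated right by 8 bits
```

For an aligned address the rotation is zero and the function returns the word unchanged, which reproduces the R0 value of the example.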

LDRH Load Half Word


Syntax:
LDRH{<cond>} <Rd>, <addressing_mode>
if(cond)
    Rd[15:0] ← memory[memory_address], Rd[31:16] ← 0
if(writeback)
    Rn ← end_address
Flags updated: None

This instruction reads a half word from memory and zero-extends it to 32 bits in the register. See the
section Miscellaneous Load/Store Addressing Modes for a description of the available addressing modes.

Examples 3.49:
LDRH R0, [R1] ; R0 = zero-extended memory[R1]

Memory Locations

Address: 00 01 02 03 04 05 06 07
Byte:    0A 0B 06 09 80 22 88 56

Let R0 = 0x00000002
    R1 = 0x00000004

After LDRH R0,[R1]

R0 = 0x00002280 (half word at addresses 04-05; bits [31:16] are zero-extended)
R1 = 0x00000004

LDRB – Load Register Byte


Syntax:
LDRB{<cond>} <Rd>, <addressing_mode>
if(cond)
    Rd[7:0] ← memory[memory_address], Rd[31:8] ← 0
if(writeback)
    Rn ← end_address
Flags updated: None
The LDRB instruction reads a byte from memory and zero-extends it into the destination register. See the section
Load/Store Register Addressing Modes for a description of the available addressing modes.

Example 3.50:
LDRB R0, [R1] ; R0 = memory[R1] (zero-extended)

Memory Locations

Address: 00 01 02 03 04 05 06 07
Byte:    0A 0B 06 09 80 22 88 56

Let R0 = 0x00000002
    R1 = 0x00000004

After LDRB R0,[R1]

R0 = 0x00000080 (byte at address 04; bits [31:8] are zero-extended)
R1 = 0x00000004

LDRSB – Load Register Signed Byte


Syntax:
LDRSB{<cond>} <Rd>, <addressing_mode>
if(cond)
    Rd[7:0] ← memory[memory_address], Rd[31:8] ← Rd[7] (sign-extension)
if(writeback)
    Rn ← end_address
Flags updated: None

The LDRSB instruction reads a byte from memory and sign-extends it to 32 bits in the register. See the section
Miscellaneous Load/Store Addressing Modes for a description of the available addressing modes.

Example 3.51:
LDRSB R0, [R1] ; R0 = sign-extended memory[R1]

Memory Locations

Address: 00 01 02 03 04 05 06 07
Byte:    0A 0B 06 09 80 22 88 56

Let R0 = 0x00000002
    R1 = 0x00000004

After LDRSB R0,[R1]

R0 = 0xFFFFFF80 (byte at address 04; bits [31:8] are sign-extended)
R1 = 0x00000004

Bit pattern of R0: 1111 1111 1111 1111 1111 1111 1000 0000
The sign bit (bit 7 of the loaded byte) is 1, so it is copied into bits [31:8].

LDRSH – Load Register Signed Halfword


Syntax:
LDRSH{<cond>} <Rd>, <addressing_mode>
if(cond)
    Rd[15:0] ← memory[memory_address], Rd[31:16] ← Rd[15] (sign-extension)
if(writeback)
    Rn ← end_address
Flags updated: None

The LDRSH instruction reads a halfword from memory and sign-extends it to 32 bits in the register. See the
section Miscellaneous Load/Store Addressing Modes for a description of the available addressing modes.

Example 3.52:
LDRSH R0, [R1] ; R0 = sign-extended memory[R1]

Memory Locations

Address: 00 01 02 03 04 05 06 07
Byte:    0A 0B 06 09 80 22 88 56

Let R0 = 0x00000002
    R1 = 0x00000004

After LDRSH R0,[R1]

R0 = 0x00002280 (halfword at addresses 04-05; bits [31:16] are sign-extended)
R1 = 0x00000004

Bit pattern of R0: 0000 0000 0000 0000 0010 0010 1000 0000
The sign bit (bit 15 of the loaded halfword) is 0, so bits [31:16] are filled with zeros.
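
The difference between the zero-extending loads (LDRB, LDRH) and the sign-extending loads (LDRSB, LDRSH) can be sketched in a few lines. This is a Python model under the assumption of 32-bit registers; zero_extend and sign_extend are illustrative helper names, not ARM mnemonics.

```python
def zero_extend(value: int, bits: int) -> int:
    """Keep the low `bits` bits; the upper register bits become 0 (LDRB, LDRH)."""
    return value & ((1 << bits) - 1)


def sign_extend(value: int, bits: int) -> int:
    """Copy the top bit of the `bits`-wide value into the upper register bits (LDRSB, LDRSH)."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):                      # sign bit set?
        value |= 0xFFFFFFFF & ~((1 << bits) - 1)       # fill bits [31:bits] with ones
    return value


print(hex(zero_extend(0x80, 8)))     # byte 0x80 loaded with LDRB
print(hex(sign_extend(0x80, 8)))     # byte 0x80 loaded with LDRSB
print(hex(sign_extend(0x2280, 16)))  # halfword 0x2280 loaded with LDRSH
```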

STR – Store Register


Syntax:
STR{<cond>} <Rd>, <addressing_mode>
if(cond)
    memory[memory_address] ← Rd
if(writeback)
    Rn ← end_address
Flags updated: None
The STR instruction stores a single register to memory. See the section Load/Store Register Addressing Modes
for a description of the available addressing modes.

Example 3.53:
STR R0, [R1, #4] ; memory[R1+4] = R0, R1 unchanged
STR R0, [R1], -R2, LSL #2 ; memory[R1] = R0, then R1 = R1 - (R2 * 4)

Memory Locations (before)

Address: 00 01 02 03 04 05 06 07
Byte:    0A 0B 06 09 80 00 88 56

Let R0 = 0x11223344
    R1 = 0x00000000

After STR R0,[R1,#4]

Memory address = R1 + 4 = 0x00000000 + 4 = 0x00000004

Address: 00 01 02 03 04 05 06 07
Byte:    0A 0B 06 09 44 33 22 11

R0 = 0x11223344
R1 = 0x00000000

STRB – Store Register Byte


Syntax:
STRB{<cond>} <Rd>, <addressing_mode>
if(cond)
    memory[memory_address] ← Rd[7:0]
if(writeback)
    Rn ← end_address
Flags updated: None
The STRB instruction stores the least significant byte of a register to memory. See the section Load/Store Register
Addressing Modes for a description of the available addressing modes.

Example 3.54:
STRB R0, [R1, #4] ; memory[R1+4] = R0[7:0], R1 unchanged

Memory Locations (before)

Address: 00 01 02 03 04 05 06 07
Byte:    0A 0B 06 09 80 00 88 56

Let R0 = 0x11223344
    R1 = 0x00000000

After STRB R0,[R1,#4]

Memory address = R1 + 4 = 0x00000000 + 4 = 0x00000004

Address: 00 01 02 03 04 05 06 07
Byte:    0A 0B 06 09 44 00 88 56

R0 = 0x11223344 (only the least significant byte, 0x44, is stored)
R1 = 0x00000000

STRH – Store Register Halfword


Syntax:
STRH{<cond>} <Rd>, <addressing_mode>
if(cond)
    memory[memory_address] ← Rd[15:0]
if(writeback)
    Rn ← end_address
Flags updated: None
The STRH instruction stores the least significant halfword (2 bytes) of a register to memory. See the section
Miscellaneous Load/Store Addressing Modes for a description of the available addressing modes.

Example 3.55:
STRH R0, [R1, #2] ; memory[R1+2] = R0[15:0], R1 unchanged

Memory Locations (before)

Address: 00 01 02 03 04 05 06 07
Byte:    0A 0B 06 09 80 00 88 56

Let R0 = 0x11223344
    R1 = 0x00000000

After STRH R0,[R1,#2]

Memory address = R1 + 2 = 0x00000000 + 2 = 0x00000002

Address: 00 01 02 03 04 05 06 07
Byte:    0A 0B 44 33 80 00 88 56

R0 = 0x11223344 (only the least significant halfword, 0x3344, is stored)
R1 = 0x00000000
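
The three store examples differ only in how many low-order bytes of R0 reach memory. A minimal Python sketch of that behaviour, assuming little-endian memory and using the hypothetical helper names store and initial_memory:

```python
def store(memory: bytearray, address: int, value: int, size: int) -> None:
    """Write the low `size` bytes of `value` at `address`, least significant byte first.

    size = 4 models STR, size = 2 models STRH, size = 1 models STRB.
    """
    mask = (1 << (8 * size)) - 1
    memory[address:address + size] = (value & mask).to_bytes(size, "little")


def initial_memory() -> bytearray:
    """Bytes at addresses 0x00..0x07 used in Examples 3.53 to 3.55."""
    return bytearray([0x0A, 0x0B, 0x06, 0x09, 0x80, 0x00, 0x88, 0x56])


R0, R1 = 0x11223344, 0x00000000

word_case = initial_memory()
store(word_case, R1 + 4, R0, 4)      # STR  R0, [R1, #4]

byte_case = initial_memory()
store(byte_case, R1 + 4, R0, 1)      # STRB R0, [R1, #4]

half_case = initial_memory()
store(half_case, R1 + 2, R0, 2)      # STRH R0, [R1, #2]

print(word_case.hex(" "), byte_case.hex(" "), half_case.hex(" "), sep="\n")
```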

Copying Multiple Registers


In addition to loading and storing single register values, the ARM has instructions to load and store multiple
registers. With a single instruction, the whole register bank or any selected subset can be copied to memory,
and it can be restored with a second instruction.

LDM – Load Multiple


There are three distinct variants of the LDM instruction. Two of them are for use in conjunction with
exception processing, and are not described here. Further information can be obtained in the ARM Architecture
Reference Manual.
Syntax:
LDM{<cond>}<addressing_mode> <Rn>{!}, <registers>
if(cond)
    start_address ← Rn
    for i = 0 to 14
        if(register_list[i] == 1)
            Ri ← memory[next_address]
    if(register_list[15] == 1)
        PC ← memory[next_address] & 0xFFFFFFFC
    if(writeback)
        Rn ← end_address
Flags updated: None
The LDM instruction permits block moves of memory to the registers and enables efficient stack
operations. The registers may be listed in any order, but they are always loaded in order, with the lowest
numbered register getting the value from the lowest memory address. If Rn is also listed in the register list and
register writeback (W bit) is set, the final value in Rn is unpredictable.
The addressing_mode field (instruction bits P and U) determines how next_address is calculated, that is,
how the address is updated in conjunction with each register load. The
four addressing_mode values are:
IA - increment address by 4 after each load (post-increment)
IB - increment address by 4 before each load (pre-increment)
DA - decrement address by 4 after each load (post-decrement)
DB - decrement address by 4 before each load (pre-decrement)

NOTE: The "!" following Rn controls the value of the write-back bit (bit W), and signifies that Rn should be
updated with the ending address at the end of the instruction. If the "!" is not present (W=0), the value of Rn
will be unchanged at the end of the instruction.
Example 3.56:
LDMIA R7, {R0, R2-R4} ; R0 ← memory[R7]
                      ; R2 ← memory[R7+4]
                      ; R3 ← memory[R7+8]
                      ; R4 ← memory[R7+12]
                      ; R7 is unchanged

Let R7 = 0x00000004
    R0 = 0x00000000
    R2 = 0x00000000
    R3 = 0x00000000
    R4 = 0x00000000

Memory Locations

Address   Bytes (lowest address first)
00-03     0A 0B 06 09
04-07     80 00 88 56
08-0B     11 22 33 44
0C-0F     55 66 77 00
10-13     15 32 18 1B

After LDMIA R7, {R0, R2-R4}

R7 = 0x00000004
R0 = 0x56880080 (from [R7], addresses 04-07)
R2 = 0x44332211 (from [R7+4], addresses 08-0B)
R3 = 0x00776655 (from [R7+8], addresses 0C-0F)
R4 = 0x1B183215 (from [R7+12], addresses 10-13)
Examples 3.57:
LDMDB R7!, {R0, R2-R4} ; R0 ← memory[R7-16]
                       ; R2 ← memory[R7-12]
                       ; R3 ← memory[R7-8]
                       ; R4 ← memory[R7-4]
                       ; R7 ← R7 - 16

Let R7 = 0x00000013
    R0 = 0x00000000
    R2 = 0x00000000
    R3 = 0x00000000
    R4 = 0x00000000

Memory Locations

Address   Bytes (lowest address first)
00-03     23 09 00 45
04-07     80 00 88 56
08-0B     11 22 33 44
0C-0F     55 66 77 00
10-13     15 32 18 1B

R7 is not word-aligned in this example. The load multiple instruction ignores bits [1:0] of the address, so the
four words are read from the word-aligned addresses 0x00, 0x04, 0x08 and 0x0C.

After LDMDB R7!, {R0, R2-R4}

R0 = 0x45000923 (from [R7-16], addresses 00-03)
R2 = 0x56880080 (from [R7-12], addresses 04-07)
R3 = 0x44332211 (from [R7-8], addresses 08-0B)
R4 = 0x00776655 (from [R7-4], addresses 0C-0F)
R7 = 0x00000003 (R7 - 16, written back because of the "!")
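
The four addressing modes differ only in the first address used and in the value written back to Rn. The following Python sketch models that calculation for word-aligned base addresses; block_addresses, ldm and MEMORY are illustrative names, and the memory contents are those of Example 3.56.

```python
def block_addresses(mode: str, base: int, count: int):
    """Return (word addresses in ascending order, value written back to Rn)."""
    if mode == "IA":        # increment after
        start, new_base = base, base + 4 * count
    elif mode == "IB":      # increment before
        start, new_base = base + 4, base + 4 * count
    elif mode == "DA":      # decrement after
        start, new_base = base - 4 * count + 4, base - 4 * count
    elif mode == "DB":      # decrement before
        start, new_base = base - 4 * count, base - 4 * count
    else:
        raise ValueError("mode must be IA, IB, DA or DB")
    return [start + 4 * i for i in range(count)], new_base


def ldm(memory: bytes, mode: str, base: int, reg_list, writeback: bool = False):
    """Model LDM: the lowest-numbered register is loaded from the lowest address."""
    regs = sorted(reg_list)                 # the order written in the source does not matter
    addresses, new_base = block_addresses(mode, base, len(regs))
    loaded = {}
    for reg, address in zip(regs, addresses):
        address &= ~0x3                     # address bits [1:0] are ignored
        loaded[reg] = int.from_bytes(memory[address:address + 4], "little")
    return loaded, (new_base if writeback else base)


# Bytes at addresses 0x00..0x13
MEMORY = bytes.fromhex("0A0B0609" "80008856" "11223344" "55667700" "1532181B")

regs, r7 = ldm(MEMORY, "IA", 0x04, [0, 2, 3, 4])          # LDMIA R7, {R0, R2-R4}
print({f"R{n}": hex(v) for n, v in regs.items()}, hex(r7))
```

Calling ldm with mode "DB", a base of 0x14 and writeback=True loads the same four words and returns 0x04 as the new base, which is the pop half of a push/pop pair.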
STM – Store Multiple
There are two distinct variants of the STM instruction. One of them is for use in conjunction with exception
processing, and is not described here. Further information can be obtained in the ARM Architecture Reference
Manual.
Syntax:
STM{<cond>}<addressing_mode> <Rn>{!}, <registers>
if(cond)
    start_address ← Rn
    for i = 0 to 15
        if(register_list[i] == 1)
            memory[next_address] ← Ri
    if(writeback)
        Rn ← end_address
Flags updated: None
The STM instruction permits block moves of registers to memory and enables efficient
stack operations. The registers may be listed in any order, but they are always
stored in order, with the lowest numbered register going to the lowest memory address.
If Rn is also listed in the register list and register writeback (W bit) is set, the final value
stored for Rn can be unpredictable.
The addressing_mode field (instruction bits P and U) determines how next_address is calculated, that is,
how the address is updated in conjunction with each register store. The
four addressing_mode values are:
IA - increment address by 4 after each store (post-increment)
IB - increment address by 4 before each store (pre-increment)
DA - decrement address by 4 after each store (post-decrement)
DB - decrement address by 4 before each store (pre-decrement)

NOTE: The "!" following Rn controls the value of the write-back bit (bit W), and signifies
that Rn should be updated with the ending address at the end of the instruction. If the "!"
is not present (W=0), the value of Rn will be unchanged at the end of the instruction.

Example 3.58:
STMIA R7, {R0, R2-R4} ; memory[R7] ← R0
                      ; memory[R7+4] ← R2
                      ; memory[R7+8] ← R3
                      ; memory[R7+12] ← R4
                      ; R7 is unchanged

Let R7 = 0x00000004
    R0 = 0x000509A6
    R2 = 0x0580C900
    R3 = 0x02030405
    R4 = 0x12131415

Memory Locations (before)

Address   Bytes (lowest address first)
04-07     80 00 88 56
08-0B     11 22 33 44
0C-0F     55 66 77 00
10-13     15 32 18 1B

After STMIA R7, {R0, R2-R4}

Memory Locations

Address   Bytes (lowest address first)   Source
04-07     A6 09 05 00                    R0 = 0x000509A6, stored at [R7]
08-0B     00 C9 80 05                    R2 = 0x0580C900, stored at [R7+4]
0C-0F     05 04 03 02                    R3 = 0x02030405, stored at [R7+8]
10-13     15 14 13 12                    R4 = 0x12131415, stored at [R7+12]

R7 = 0x00000004

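Examples 3.59:
STMDB R7!, {R0, R2-R4} ; memory[R7-16] ← R0
                       ; memory[R7-12] ← R2
                       ; memory[R7-8] ← R3
                       ; memory[R7-4] ← R4
                       ; R7 ← R7 - 16

Let R7 = 0x00000013
    R0 = 0x000509A6
    R2 = 0x0580C900
    R3 = 0x02030405
    R4 = 0x12131415

Memory Locations (before)

Address   Bytes (lowest address first)
03        00
04-07     A6 09 05 00
08-0B     00 C9 80 05
0C-0F     05 04 03 02
10-13     15 14 13 12

The lowest numbered register always goes to the lowest address. R7 is not word-aligned in this example; the
store multiple instruction ignores bits [1:0] of the address, so the four words are written to the word-aligned
addresses 0x00, 0x04, 0x08 and 0x0C.

After STMDB R7!, {R0, R2-R4}

Memory Locations

Address   Bytes (lowest address first)   Source
00-03     A6 09 05 00                    R0 = 0x000509A6, stored at [R7-16]
04-07     00 C9 80 05                    R2 = 0x0580C900, stored at [R7-12]
08-0B     05 04 03 02                    R3 = 0x02030405, stored at [R7-8]
0C-0F     15 14 13 12                    R4 = 0x12131415, stored at [R7-4]
10-13     15 14 13 12                    unchanged

R7 = 0x00000003 (R7 - 16, written back because of the "!")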

Stack operations using STM and LDM instructions.


For use in conjunction with stack addressing, four alternative names can be used for the addressing
modes. These names are based on the type of stack being used instead of the addressing mode being used. This
eliminates confusion in coding stack push and pop operations, since the type of stack will be the same for both the
LDM and STM instructions. In ARM syntax, a full stack is one where the stack pointer points to the last used
(full) location. An empty stack is one where the stack pointer points to the next available (empty) stack location.
A stack can also grow upward through increasing memory addresses (ascending) or downward through decreasing
memory addresses (descending). The four stack types map onto the addressing modes as follows:
Stack type              Push (STM)              Pop (LDM)
FA (full ascending)     pre-increment (IB)      post-decrement (DA)
FD (full descending)    pre-decrement (DB)      post-increment (IA)
EA (empty ascending)    post-increment (IA)     pre-decrement (DB)
ED (empty descending)   post-decrement (DA)     pre-increment (IB)
The instructions below demonstrate a push operation followed by a pop operation assuming an
empty-ascending stack. Note that by including the link register (R14) in the push operation, and the PC in the pop
operation, a subroutine will return to the caller as part of the context save/restore.

NOTE: When using a stack you have to decide whether the stack will grow up or down in memory. A stack is either
ascending(A) or descending(D). Ascending stacks grow towards higher memory addresses; in contrast, descending stacks
grow towards lower memory addresses.
When you use a full stack(F), the stack pointer SP points to an address that is the last used or full location (i.e., SP points to
the last item on the stack). In contrast, if you use an empty stack(E) the SP points to an address that is the first unused or
empty location (i.e., it points after the last item on the stack).

Push operation
Examples 3.60:
STMEA R13!, {R0, R2-R3, LR} ; memory[R13] ← R0
                            ; memory[R13+4] ← R2
                            ; memory[R13+8] ← R3
                            ; memory[R13+12] ← R14
                            ; R13 (SP) ← R13 + 16

Let R13 (SP) = 0x00000004
    R0 = 0x12131415
    R2 = 0x02030405
    R3 = 0x0580C900
    LR (R14) = 0x000509A6

Memory Locations (before)

Address   Bytes (lowest address first)
04-07     A6 09 05 00
08-0B     00 C9 80 05
0C-0F     05 04 03 02
10-13     15 14 13 12

After STMEA R13!, {R0, R2-R3, LR}

Memory Locations

Address   Bytes (lowest address first)   Source
04-07     15 14 13 12                    R0 = 0x12131415, stored at [R13]
08-0B     05 04 03 02                    R2 = 0x02030405, stored at [R13+4]
0C-0F     00 C9 80 05                    R3 = 0x0580C900, stored at [R13+8]
10-13     A6 09 05 00                    LR = 0x000509A6, stored at [R13+12]
14        empty                          SP = 0x00000014

POP operation
Examples 3.61:
LDMEA R13!, {R0, R2-R3, PC} ; R0 ← memory[R13-16]
                            ; R2 ← memory[R13-12]
                            ; R3 ← memory[R13-8]
                            ; PC ← memory[R13-4]
                            ; R13 (SP) ← R13 - 16

Let R13 (SP) = 0x00000014
    R0 = 0x00000000
    R2 = 0x00000000
    R3 = 0x00000000
    PC = 0x00000000

Memory Locations

Address   Bytes (lowest address first)
04-07     80 00 88 56
08-0B     11 22 33 44
0C-0F     55 66 77 00
10-13     15 32 18 1B
14        empty

After LDMEA R13!, {R0, R2-R3, PC}

R0 = 0x56880080 (from [R13-16], addresses 04-07)
R2 = 0x44332211 (from [R13-12], addresses 08-0B)
R3 = 0x00776655 (from [R13-8], addresses 0C-0F)
PC = 0x1B183214 (from [R13-4], addresses 10-13: 0x1B183215 & 0xFFFFFFFC)
R13 (SP) = 0x00000004

The lowest numbered register is always loaded from the lowest address, so the pop restores the registers in the
same layout that the matching STMEA push created.
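
The push and pop of an empty-ascending stack can be traced with a short model. This is a Python sketch assuming little-endian memory and a word-aligned stack pointer; STACK_MODES, push_ea and pop_ea are hypothetical names chosen for the illustration.

```python
# Stack type -> (STM addressing mode used to push, LDM addressing mode used to pop)
STACK_MODES = {
    "FA": ("IB", "DA"),
    "FD": ("DB", "IA"),
    "EA": ("IA", "DB"),
    "ED": ("DA", "IB"),
}


def push_ea(memory: bytearray, sp: int, values):
    """STMEA SP!, {...}: store at SP, then increment. `values` is in register-number order."""
    for value in values:
        memory[sp:sp + 4] = value.to_bytes(4, "little")
        sp += 4
    return sp                      # SP points at the next empty word


def pop_ea(memory: bytearray, sp: int, count: int):
    """LDMEA SP!, {...}: the block starts `count` words below SP."""
    start = sp - 4 * count
    values = [int.from_bytes(memory[start + 4 * i:start + 4 * i + 4], "little")
              for i in range(count)]
    return values, start           # SP is written back to the start of the block


memory = bytearray(0x20)
saved = [0x12131415, 0x02030405, 0x0580C900, 0x000509A6]    # R0, R2, R3, LR
sp_after_push = push_ea(memory, 0x04, saved)
restored, sp_after_pop = pop_ea(memory, sp_after_push, len(saved))
print(hex(sp_after_push), [hex(v) for v in restored], hex(sp_after_pop))
```

Because the last value pushed was LR and the last register popped is the PC, the pop returns to the caller as part of the restore.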

3.8 SWP: Swap Instruction


The ARM instruction set supports real-time semaphores with a swap instruction. The SWP instruction
exchanges a word between a register and memory. The exchange consists of a memory read followed by a memory
write, but it is treated as a single atomic operation, so a semaphore update cannot be corrupted by an
interrupt or other exception occurring between the two transfers.

NOTE: This instruction is not reachable from the C language and is supported by intrinsic functions within
the compiler library.

Syntax:
SWP{<cond>} <Rd>, <Rm>, [<Rn>]
if(cond)
    temp ← [Rn]
    [Rn] ← Rm
    Rd ← temp
Flags updated: None

Examples 3.62:
SWP R0, R1, [R2] ; R0 = [R2]
                 ; [R2] = R1
                 ; R1 = unchanged

Before swap:

R0 = 0x00000000
R1 = 0x11112222
R2 = 0x00008500
Memory[0x00008500] = 0x12345678

After swap:

R0 = 0x12345678
R1 = 0x11112222
R2 = 0x00008500
Memory[0x00008500] = 0x11112222
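
The semaphore use of SWP can be illustrated with a small model. This is a Python sketch, not a real lock: Python gives no atomicity guarantee here, the dictionary stands in for memory, and the convention that 0 means free and 1 means taken is an assumption made for the example. The names swp and try_lock are illustrative.

```python
def swp(memory: dict, rm_value: int, address: int) -> int:
    """Model SWP Rd, Rm, [Rn]: return the old memory word and replace it with Rm."""
    old = memory[address]
    memory[address] = rm_value
    return old


def try_lock(memory: dict, lock_address: int) -> bool:
    """Take the semaphore if it was free (assumed convention: 0 = free, 1 = taken)."""
    return swp(memory, 1, lock_address) == 0


# The register values of Example 3.62
memory = {0x00008500: 0x12345678}
r0 = swp(memory, 0x11112222, 0x00008500)
print(hex(r0), hex(memory[0x00008500]))

# A semaphore word at a hypothetical address
locks = {0x00009000: 0}
first_attempt = try_lock(locks, 0x00009000)
second_attempt = try_lock(locks, 0x00009000)
print(first_attempt, second_attempt)
```

The second attempt fails because the swap returned the value 1 written by the first attempt, which is how a task discovers that the resource is already owned.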
SWI – Software Interrupt
The Software Interrupt instruction generates an exception on execution, forces the processor into
Supervisor mode and sets the PC to the SWI vector at 0x00000008. As with all other ARM instructions, the SWI
instruction contains the condition code in the top four bits, followed by the opcode. The remaining 24 bits
(bits 0-23) are ignored by the processor, so a user-defined number can be encoded there. On entering the
software interrupt, the handler can examine these bits and decide which code to run. This makes it possible to
use the SWI instruction to make calls into the protected mode, in order to run privileged code or make
operating system calls.

Syntax:
SWI{<cond>} <immediate_24>
if(cond)
    R14_svc ← address of next instruction after SWI instruction
    SPSR_svc ← CPSR ; save current CPSR
    CPSR[4:0] ← 10011b ; supervisor mode
    CPSR[5] ← 0 ; ARM execution
    CPSR[7] ← 1 ; disable IRQ interrupts
    PC ← 0x00000008 ; jump to exception vector
Flags updated: N/A

The SWI instruction causes a SWI exception. The processor disables interrupts, switches to ARM
execution if previously in Thumb, and enters Supervisor mode. Execution starts at the SWI exception address.
The 24-bit immediate value is ignored by the instruction, but the value in the instruction can be
determined by the exception handler if desired. Parameters can also be passed to the SWI handler in
general-purpose registers or memory locations.

Examples 3.63:
Here we have a simple example of an SWI call with SWI number 0x123456, used by ARM toolkits as a
debugging SWI. Typically the SWI instruction is executed in user mode.

PRE
CPSR = nzcVqift_USER
PC = 0x00008600
LR = 0x003FFFFF ; LR = R14
R0 = 0x12

0x00008600 SWI 0x123456

POST
CPSR = nzcVqIft_SVC
SPSR = nzcVqift_USER
PC = 0x00000008
LR = 0x00008604 ; LR = R14_svc
R0 = 0x12

Since SWI instructions are used to call operating system routines, you need some form of parameter
passing. This is achieved using registers. In this example, register R0 is used to pass the parameter 0x12. The
return values are also passed back via registers.
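
How a handler recovers the SWI number can be sketched as follows. This Python model assumes the SWI was executed in ARM state, so the instruction word sits 4 bytes below the return address in R14_svc; encode_swi, swi_number and handler_swi_number are illustrative names, and the dictionary stands in for word-addressed code memory.

```python
def encode_swi(number: int, cond: int = 0xE) -> int:
    """Build an ARM SWI instruction word: cond in bits [31:28], 1111 in [27:24], number in [23:0]."""
    return (cond << 28) | (0xF << 24) | (number & 0x00FFFFFF)


def swi_number(instruction: int) -> int:
    """Extract the 24-bit number that the processor itself ignores."""
    return instruction & 0x00FFFFFF


def handler_swi_number(code_memory: dict, lr_svc: int) -> int:
    """R14_svc holds the address after the SWI, so the SWI itself is at lr_svc - 4."""
    return swi_number(code_memory[lr_svc - 4])


code_memory = {0x00008600: encode_swi(0x123456)}    # SWI 0x123456 at 0x00008600
print(hex(code_memory[0x00008600]))
print(hex(handler_swi_number(code_memory, 0x00008604)))
```

A real handler would perform the same mask in assembly after loading the word at [LR, #-4], then branch through a table indexed by the number.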

Loading constants with Pseudo instructions

ADR – Load Address (short-range)


Syntax:
ADR{cond} <Rd>, <label>

Description:
The ADR pseudo-instruction assembles to a single ADD or SUB instruction, normally with the PC as an
operand. This produces position-independent code. The assembler will report an error if it cannot create a valid
instruction to load the address. If the label is program-relative, it must be in the same assembler area as the ADR
instruction. (The ADRL pseudo-instruction can reach a wider address range.)

ADRL – Load Address (medium-range)


Syntax:
ADRL{cond} <Rd>, <label>

Description:
The ADRL pseudo-instruction will always generate a two-instruction sequence to load the address of the
given label into the destination register, giving it a wider target range than the ADR instruction. The code is
position-independent. The assembler will report an error if it cannot create a valid instruction sequence. (The
LDR pseudo-instruction with a label argument can reach any address.)

ASR – Arithmetic Shift Right


Syntax:
ASR{cond}{S} <Rd>, <Rm>, <Rs>
ASR{cond}{S} <Rd>, <Rm>, <#shift_count>

Description:
ASR is a synonym for the MOV instruction with an ASR-shifted register operand. If an immediate shift
count is used, it is limited to the range 1-32. If Rm is not included, the assembler will assume it is the same as Rd.

ASR R0, R1 is equivalent to MOV R0, R0, ASR R1


ASR R0, R1, R2 is equivalent to MOV R0, R1, ASR R2

LDR – Load Register


Syntax:
LDR{cond} <Rd>, =<expression>
LDR{cond} <Rd>, =<label-expression>

Description:
The LDR pseudo-instruction will generate an instruction to load the destination register with the desired
value.
The <expression> field must evaluate to a numeric constant. If the constant is an allowable immediate
expression (or the complement of one), a MOV or MVN instruction will be generated. If it is not, the assembler
will place the value in a memory location, and generate a PC-relative load instruction to load it from that memory
location. If a label is specified, the assembler will generate a local memory location to store the label address, and
include the appropriate linker directives so that the correct address will be in that location after linking.
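
The choice the assembler makes between MOV, MVN and a literal-pool load follows from the ARM immediate format, an 8-bit value rotated right by an even number of bits. The following Python sketch models that decision; it is an illustration of the rule, not the code of any particular assembler, and the function names are hypothetical.

```python
def rotate_left32(value: int, amount: int) -> int:
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def is_arm_immediate(value: int) -> bool:
    """True if value equals an 8-bit constant rotated right by an even amount."""
    value &= 0xFFFFFFFF
    # Undo each possible even rotation and see whether 8 bits or fewer remain.
    return any(rotate_left32(value, rotation) <= 0xFF for rotation in range(0, 32, 2))


def ldr_pseudo_choice(value: int) -> str:
    """Which instruction LDR Rd, =value would plausibly become."""
    value &= 0xFFFFFFFF
    if is_arm_immediate(value):
        return "MOV"
    if is_arm_immediate(~value & 0xFFFFFFFF):
        return "MVN"
    return "LDR from literal pool"


for constant in (0xFF, 0xFF000000, 0xFFFFFFFF, 0x12345678):
    print(hex(constant), ldr_pseudo_choice(constant))
```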

LSL – Logical Shift Left


Syntax:
LSL{cond}{S} <Rd>, <Rm>, <Rs>
LSL{cond}{S} <Rd>, <Rm>, <#shift_count>
Description:
LSL is a synonym for the MOV instruction with an LSL shifter operand. If an immediate shift count is
used, it is limited to the range 0-31. If Rm is not included, the assembler will assume it is the same as Rd.

LSL R0, R1 is equivalent to MOV R0, R0, LSL R1


LSL R0, R1, R2 is equivalent to MOV R0, R1, LSL R2

LSR – Logical Shift Right


Syntax:
LSR{cond}{S} <Rd>, <Rm>, <Rs>
LSR{cond}{S} <Rd>, <Rm>, <#shift_count>

Description:
LSR is a synonym for the MOV instruction with an LSR shifter operand. If an immediate shift count is
used, it is limited to the range 1-32. If Rm is not included, the assembler will assume it is the same as Rd.

LSR R0, R1 is equivalent to MOV R0, R0, LSR R1


LSR R0, R1, R2 is equivalent to MOV R0, R1, LSR R2

NOP – No Operation
Syntax:
NOP

Description:
There are numerous ways to encode a NOP (no operation) instruction for the ARM7TDMI processor,
such as adding 0 to a register, ORing a register with 0, branching to the next instruction, etc. The actual encoding
of the NOP is assembler-dependent.

POP - Pop
Syntax:
POP {cond} reg_list

Description:
POP is a synonym for the LDMIA instruction, with R13! specified for the base register (Rn). The
PUSH/POP instructions assume a full-descending (FD) stack organization.

PUSH - Push
Syntax:
PUSH {cond} reg_list

Description:
PUSH is a synonym for the STMDB instruction, with R13! specified for the base register (Rn). The
PUSH/POP instructions assume a full-descending (FD) stack organization.

ROR – Rotate Right


Syntax:
ROR {cond} {S} <Rd>, <Rm>, <Rs>
ROR {cond} {S} <Rd>, <Rm>, <#shift_count>
Description:
ROR is a synonym for the MOV instruction with an ROR shifter operand. If an immediate shift count is
used, it is limited to the range 1-31. If Rm is not included, the assembler will assume it is the same as Rd.

ROR R0, R1 is equivalent to MOV R0, R0, ROR R1


ROR R0, R1, R2 is equivalent to MOV R0, R1, ROR R2

RRX – Rotate Right with Extend


Syntax:
RRX {cond}{S} <Rd>, <Rm>
Description:
RRX is a synonym for the MOV instruction with an RRX shifter operand. If Rm is not included, the
assembler will assume it is the same as Rd.

RRX R0 is equivalent to MOV R0, R0, RRX


RRX R0, R1 is equivalent to MOV R0, R1, RRX
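
The effect of the five shift operations on a 32-bit register can be compared with a short model. This Python sketch covers the result value only (and, for RRX, the carry out); shift counts are assumed to lie between 1 and 31, and the function names simply mirror the mnemonics.

```python
MASK32 = 0xFFFFFFFF


def lsl(value: int, count: int) -> int:
    """Logical shift left: zeros enter at bit 0."""
    return (value << count) & MASK32


def lsr(value: int, count: int) -> int:
    """Logical shift right: zeros enter at bit 31."""
    return (value & MASK32) >> count


def asr(value: int, count: int) -> int:
    """Arithmetic shift right: copies of the sign bit enter at bit 31."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32                 # reinterpret as a negative number
    return (value >> count) & MASK32


def ror(value: int, count: int) -> int:
    """Rotate right: bits leaving bit 0 re-enter at bit 31."""
    count %= 32
    value &= MASK32
    return ((value >> count) | (value << (32 - count))) & MASK32


def rrx(value: int, carry_in: int):
    """Rotate right by one through the carry flag; returns (result, carry_out)."""
    value &= MASK32
    return ((carry_in & 1) << 31) | (value >> 1), value & 1


print(hex(lsr(0x80000000, 4)), hex(asr(0x80000000, 4)), hex(ror(0x00000001, 1)))
```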

Programming Examples:

Program 3.1: — 16 bit binary multiplication

; Multiplication of two 16 bit binary numbers

        TTL     16bitBMUL
        AREA    Program, CODE, READONLY
        ENTRY

Main
        LDR     R0, Number1     ; load first number
        LDR     R1, Number2     ; load second number
        MUL     R0, R1, R0      ; R0 = R1 x R0
        STR     R0, Result      ; store the result

        SWI     &11             ; all done

        AREA    Data1, DATA
Number1
        DCD     &706F           ; first 16 bit number
Number2
        DCD     &0161           ; second 16 bit number
        ALIGN

        AREA    Data2, DATA
Result  DCD     0               ; multiplication output
        ALIGN

        END

Program 3.2: — Divide a 32 bit binary no by a 16 bit binary no store the quotient and remainder there is no ’DIV’
instruction in ARM!

; Divide a 32 bit binary number by a 16 bit binary number
; and store the quotient and remainder.
; There is no 'DIV' instruction in ARM!

        TTL     Division
        AREA    Program, CODE, READONLY
        ENTRY

Main
        LDR     R0, Number1     ; load first number (dividend)
        LDR     R1, Number2     ; and second (divisor)
        MOV     R3, #0          ; clear register for quotient
Loop
        CMP     R1, #0          ; test for divide by 0
        BEQ     Err
        CMP     R0, R1          ; is the dividend less than the divisor?
        BLT     Done            ; if so, finished
        ADD     R3, R3, #1      ; add one to quotient
        SUB     R0, R0, R1      ; subtract the divisor from what is left
        B       Loop            ; and loop
Err
        MOV     R3, #0xFFFFFFFF ; error flag (-1)
Done
        STR     R0, Remain      ; store the remainder
        STR     R3, Quotient    ; and the quotient
        SWI     &11             ; all done

        AREA    Data1, DATA
Number1 DCD     &0075CBB1       ; a 32 bit binary number
Number2 DCD     &0141           ; a 16 bit binary number
        ALIGN

        AREA    Data2, DATA
Quotient DCD    0               ; storage for result
Remain  DCD     0               ; storage for remainder
        ALIGN

        END
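
The repeated-subtraction method used by the program can be cross-checked against an ordinary division. The following is a Python sketch of the same loop, assuming non-negative operands; divide_by_subtraction is an illustrative name and the error value mirrors the -1 flag of the listing.

```python
def divide_by_subtraction(dividend: int, divisor: int):
    """Return (quotient, remainder) by subtracting the divisor until it no longer fits."""
    if divisor == 0:
        return 0xFFFFFFFF, dividend      # error flag (-1), as in the Err branch
    quotient = 0
    while dividend >= divisor:           # the listing exits when dividend < divisor
        dividend -= divisor
        quotient += 1
    return quotient, dividend


quotient, remainder = divide_by_subtraction(0x0075CBB1, 0x0141)
print(hex(quotient), hex(remainder))
```

The loop runs once per unit of quotient, so it is simple but slow for large quotients; a shift-and-subtract routine would need at most 32 iterations.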

Program 3.3: — Add two packed BCD numbers to give a packed BCD result

; Add two packed BCD numbers to give a packed BCD result

        TTL     AddBCD
        AREA    Program, CODE, READONLY
        ENTRY

Mask    EQU     0x0000000F

Main
        LDR     R0, =Result     ; address for storage
        LDR     R1, BCDNum1     ; load the first BCD number
        LDR     R2, BCDNum2     ; and the second
        LDRB    R8, Length      ; init counter
        ADD     R0, R0, #3      ; adjust for offset
        MOV     R5, #0          ; carry

Loop
        MOV     R3, R1          ; copy what is left in the data register
        MOV     R4, R2          ; and the other number
        AND     R3, R3, #Mask   ; mask out everything except low order nibble
        AND     R4, R4, #Mask   ; mask out everything except low order nibble
        MOV     R1, R1, LSR #4  ; shift the original number one nibble
        MOV     R2, R2, LSR #4  ; shift the original number one nibble
        ADD     R6, R3, R4      ; add the digits
        ADD     R6, R6, R5      ; and the carry
        CMP     R6, #0xA        ; is it 10 or more?
        BLT     RCarry1         ; if not, reset the carry to 0
        MOV     R5, #1          ; otherwise set the carry
        SUB     R6, R6, #0xA    ; and subtract 10
        B       Next
RCarry1
        MOV     R5, #0          ; carry reset to 0

Next
        MOV     R3, R1          ; copy what is left in the data register
        MOV     R4, R2          ; and the other number
        AND     R3, R3, #Mask   ; mask out everything except low order nibble
        AND     R4, R4, #Mask   ; mask out everything except low order nibble
        MOV     R1, R1, LSR #4  ; shift the original number one nibble
        MOV     R2, R2, LSR #4  ; shift the original number one nibble
        ADD     R7, R3, R4      ; add the digits
        ADD     R7, R7, R5      ; and the carry
        CMP     R7, #0xA        ; is it 10 or more?
        BLT     RCarry2         ; if not, reset the carry to 0
        MOV     R5, #1          ; otherwise set the carry
        SUB     R7, R7, #0xA    ; and subtract 10
        B       Loopend

RCarry2
        MOV     R5, #0          ; carry reset to 0
Loopend
        MOV     R7, R7, LSL #4  ; shift the second digit processed to the left
        ORR     R6, R6, R7      ; and OR in the first digit to the ls nibble
        STRB    R6, [R0], #-1   ; store the byte, and decrement address
        SUBS    R8, R8, #1      ; decrement loop counter
        BNE     Loop            ; loop while > 0
        SWI     &11

        AREA    Data1, DATA
Length  DCB     &04
        ALIGN
BCDNum1 DCB     &36, &70, &19, &85 ; 1st 8 digit packed BCD number

        AREA    Data2, DATA
BCDNum2 DCB     &12, &66, &34, &59 ; 2nd 8 digit packed BCD number

        AREA    Data3, DATA
Result  DCD     0               ; storage for result

        END

Program 3.4: — Multiply two 32 bit number to give a 64 bit result

; Multiply two 32 bit numbers to give a 64 bit result

        TTL     Mul32bit
        AREA    Program, CODE, READONLY
        ENTRY

Main
        LDR     R0, Number1     ; load first number
        LDR     R1, Number2     ; and second
        LDR     R6, =Result     ; load the address of result
        MOV     R5, R0, LSR #16 ; top half of R0
        MOV     R3, R1, LSR #16 ; top half of R1
        BIC     R0, R0, R5, LSL #16 ; bottom half of R0
        BIC     R1, R1, R3, LSL #16 ; bottom half of R1
        MUL     R2, R0, R1      ; partial result: low x low
        MUL     R0, R3, R0      ; partial result: middle part
        MUL     R1, R5, R1      ; partial result: middle part
        MUL     R3, R5, R3      ; partial result: high x high
        ADDS    R0, R1, R0      ; add middle parts
        ADDCS   R3, R3, #&10000 ; add in any carry from above
        ADDS    R2, R2, R0, LSL #16 ; LSB 32 bits
        ADC     R3, R3, R0, LSR #16 ; MSB 32 bits

        STR     R2, [R6]        ; store LSB
        ADD     R6, R6, #4      ; increment pointer
        STR     R3, [R6]        ; store MSB
        SWI     &11             ; all done

        AREA    Data1, DATA
Number1 DCD     &12345678       ; a 32 bit binary number
Number2 DCD     &ABCDEF01       ; another
        ALIGN

        AREA    Data2, DATA
Result  DCD     0, 0            ; storage for the 64 bit result (two words)
        ALIGN

        END
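
The partial-product scheme of the listing can be verified against a direct multiplication. This Python sketch follows the same steps with 16-bit halves and explicit 32-bit wrap-around; mul32x32 is an illustrative name, and the variables are named for their role rather than for the registers of the listing.

```python
MASK32 = 0xFFFFFFFF


def mul32x32(a: int, b: int):
    """Return (high word, low word) of the 64-bit product of two 32-bit values."""
    a_hi, a_lo = a >> 16, a & 0xFFFF
    b_hi, b_lo = b >> 16, b & 0xFFFF

    low = a_lo * b_lo                      # weight 2^0
    high = a_hi * b_hi                     # weight 2^32
    middle = a_lo * b_hi + a_hi * b_lo     # weight 2^16, may overflow 32 bits

    if middle > MASK32:                    # carry out of the 32-bit middle sum (ADDCS)
        middle &= MASK32
        high += 0x10000                    # that carry is worth 2^48

    low += (middle << 16) & MASK32         # low half of the middle term (ADDS)
    carry = low >> 32
    low &= MASK32
    high = (high + (middle >> 16) + carry) & MASK32   # high half plus carry (ADC)
    return high, low


high, low = mul32x32(0x12345678, 0xABCDEF01)
print(hex(high), hex(low))
```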

Program 3.4: — Sort a list of values - simple bubble sort

; Sort a list of values - simple bubble sort

        TTL     Bubble sort
        AREA    Program, CODE, READONLY
        ENTRY

Main
        LDR     R6, List        ; pointer to start of list
        MOV     R0, #0          ; clear register
        LDRB    R0, [R6]        ; get the length of list
        MOV     R8, R6          ; make a copy of start of list
Sort
        ADD     R7, R6, R0      ; get address of last element
        MOV     R1, #0          ; zero flag for changes
        ADD     R8, R8, #1      ; move 1 byte up the list each iteration
Next
        LDRB    R2, [R7], #-1   ; load the first byte
        LDRB    R3, [R7]        ; and the second
        CMP     R2, R3          ; compare them
        BCC     NoSwitch        ; branch if R2 less than R3

        STRB    R2, [R7], #1    ; otherwise swap the bytes
        STRB    R3, [R7]        ; like this
        ADD     R1, R1, #1      ; flag that changes made
        SUB     R7, R7, #1      ; decrement address to check
NoSwitch
        CMP     R7, R8          ; have we checked enough bytes?
        BHI     Next            ; if not, do inner loop
        CMP     R1, #0          ; did we make changes?
        BNE     Sort            ; if so check again - outer loop

Done    SWI     &11             ; all done

        AREA    Data1, DATA
Start   DCB     6
        DCB     &2A, &5B, &60, &3F, &D1, &19

        AREA    Data2, DATA
List    DCD     Start

        END

Program 3.5: — add a series of 16 bit numbers by using a table address

; Add a series of 16 bit numbers by using a table address look-up

        TTL     Add16bit
        AREA    Program, CODE, READONLY
        ENTRY

Main
        LDR     R0, =Data1      ; load the address of the lookup table
        EOR     R1, R1, R1      ; clear R1 to store sum
        LDR     R2, Length      ; init element count
Loop
        LDR     R3, [R0]        ; get the data
        ADD     R1, R1, R3      ; add it to R1
        ADD     R0, R0, #+4     ; increment pointer
        SUBS    R2, R2, #0x1    ; decrement count with zero set
        BNE     Loop            ; if zero flag is not set, loop
        STR     R1, Result      ; otherwise done - store result
        SWI     &11

        AREA    Data1, DATA

Table   DCW     &2040           ; table of values to be added
        ALIGN                   ; 32 bit aligned
        DCW     &1C22
        ALIGN
        DCW     &0242
        ALIGN
TablEnd DCD     0

        AREA    Data2, DATA
Length  DCW     (TablEnd - Table) / 4 ; because we're having to align
        ALIGN                   ; gives the loop count
Result  DCW     0               ; storage for result

        END

Program 3.6:— Scan a series of 32 bit numbers to find how many are negative

; Scan a series of 32 bit numbers to find how many are negative

        TTL     Scan negative
        AREA    Program, CODE, READONLY
        ENTRY

Main
        LDR     R0, =Data1      ; load the address of the lookup table
        EOR     R1, R1, R1      ; clear R1 to store count
        LDR     R2, Length      ; init element count
        CMP     R2, #0
        BEQ     Done            ; if table is empty
Loop
        LDR     R3, [R0]        ; get the data
        CMP     R3, #0
        BPL     Looptest        ; skip next line if +ve or zero
        ADD     R1, R1, #1      ; increment -ve number count
Looptest
        ADD     R0, R0, #+4     ; increment pointer
        SUBS    R2, R2, #0x1    ; decrement count with zero set
        BNE     Loop            ; if zero flag is not set, loop
Done
        STR     R1, Result      ; otherwise done - store result
        SWI     &11

        AREA    Data1, DATA

Table   DCD     &F1522040       ; table of values to be tested
        DCD     &7F611C22
        DCD     &80000242
TablEnd DCD     0

        AREA    Data2, DATA
Length  DCW     (TablEnd - Table) / 4 ; number of 32 bit entries
        ALIGN                   ; gives the loop count
Result  DCW     0               ; storage for result

        END

Program 3.7: Scan a series of 16 bit numbers to find how many are negative

; Scan a series of 16 bit numbers to find how many are negative

        TTL     Scan_negative
        AREA    Program, CODE, READONLY
        ENTRY

Main
        LDR     R0, =Data1          ; load the address of the lookup table
        EOR     R1, R1, R1          ; clear R1 to store count
        LDR     R2, Length          ; init element count
        CMP     R2, #0
        BEQ     Done                ; if table is empty
Loop
        LDR     R3, [R0]            ; get the data
        AND     R3, R3, #0x8000     ; isolate bit 15, the sign bit of a 16 bit value
        CMP     R3, #0x8000         ; result is negative only if bit 15 was 0
        BMI     Looptest            ; skip next line if the sign bit is zero
        ADD     R1, R1, #1          ; increment -ve number count
Looptest
        ADD     R0, R0, #+4         ; increment pointer
        SUBS    R2, R2, #0x1        ; decrement count with zero set
        BNE     Loop                ; if zero flag is not set, loop
Done
        STR     R1, Result          ; otherwise done - store result
        SWI     &11

        AREA    Data1, DATA

Table   DCW     &F152               ; table of values to be tested
        ALIGN
        DCW     &7F61
        ALIGN
        DCW     &8000
        ALIGN
TablEnd DCD     0

        AREA    Data2, DATA
Length  DCW     (TablEnd - Table) / 4   ; loop count: the aligned entries
        ALIGN                           ; are 4 bytes apart
Result  DCD     0                   ; storage for result (STR writes a full word)

        END

Program 3.8: Scan a series of 16 bit numbers to find the largest

; Scan a series of 16 bit numbers to find the largest

        TTL     Largest_16bit
        AREA    Program, CODE, READONLY
        ENTRY

Main
        LDR     R0, =Data1          ; load the address of the lookup table
        EOR     R1, R1, R1          ; clear R1 to store largest
        LDR     R2, Length          ; init element count
        CMP     R2, #0
        BEQ     Done                ; if table is empty
Loop
        LDR     R3, [R0]            ; get the data
        CMP     R3, R1              ; compare with the largest value so far
        BCC     Looptest            ; skip next line if R3 is lower (unsigned)
        MOV     R1, R3              ; otherwise R3 is the new largest
Looptest
        ADD     R0, R0, #+4         ; increment pointer
        SUBS    R2, R2, #0x1        ; decrement count with zero set
        BNE     Loop                ; if zero flag is not set, loop
Done
        STR     R1, Result          ; otherwise done - store result
        SWI     &11

        AREA    Data1, DATA

Table   DCW     &A152               ; table of values to be tested
        ALIGN
        DCW     &7F61
        ALIGN
        DCW     &F123
        ALIGN
        DCW     &8000
        ALIGN
TablEnd DCD     0

        AREA    Data2, DATA

Length  DCW     (TablEnd - Table) / 4   ; loop count: the aligned entries
        ALIGN                           ; are 4 bytes apart
Result  DCD     0                   ; storage for result (STR writes a full word)

        END

Program 3.9: Find the length of a string

; find the length of a string

        TTL     String Length

CR      EQU     0x0D

        AREA    Program, CODE, READONLY
        ENTRY

Main
        ADR     R0, Data1           ; load the address of the string
        EOR     R1, R1, R1          ; clear R1 to store count
Loop
        LDRB    R2, [R0], #1        ; load the next byte into R2
        CMP     R2, #CR             ; is it the terminator
        BEQ     Done                ; if so, we are done
        ADD     R1, R1, #1          ; otherwise increment count
        B       Loop                ; and check the next byte
Done
        STR     R1, CharCount       ; store result
        SWI     &11

        AREA    Data1, DATA

Table
        DCB     "Hello, World", CR
        ALIGN

        AREA    Result, DATA
CharCount
        DCD     0                   ; storage for count (STR writes a full word)

        END

Program 3.10: Compare two counted strings for equality

; compare two counted strings for equality

        TTL     Compare_equality

        AREA    Program, CODE, READONLY
        ENTRY

Main
        LDR     R0, =Data1          ; load the address of the first string
        LDR     R1, =Data2          ; load the address of the second string
        LDR     R2, Match           ; assume strings not equal - set to -1
        LDR     R3, [R0], #4        ; load the first string length into R3
        LDR     R4, [R1], #4        ; load the second string length into R4
        CMP     R3, R4
        BNE     Done                ; if they are different lengths,
                                    ; they can't be equal
        CMP     R3, #0              ; test for zero length: if both are
        BEQ     Same                ; zero length, nothing else to do

; if we got this far, we now need to check the string char by char
Loop
        LDRB    R5, [R0], #1        ; character of first string
        LDRB    R6, [R1], #1        ; character of second string
        CMP     R5, R6              ; are they the same
        BNE     Done                ; if not the strings are different
        SUBS    R3, R3, #1          ; use the string length as a counter
        BEQ     Same                ; if we reach the end of the count
                                    ; the strings are the same
        B       Loop                ; not done, loop

Same    MOV     R2, #0              ; clear the -1 from match (0 = match)
Done    STR     R2, Match           ; store the result
        SWI     &11

        AREA    Data1, DATA
Table1  DCD     3                   ; data table starts with the
                                    ; length of the string (one word)
        DCB     "CAT"               ; the string

        AREA    Data2, DATA
Table2  DCD     3                   ; data table starts with the
                                    ; length of the string (one word)
        DCB     "CAT"               ; the string

        AREA    Result, DATA
        ALIGN
Match   DCD     &FFFFFFFF           ; storage for result (-1 = no match)

        END

Program 3.11: Convert a single hex digit to its ASCII equivalent

; convert a single hex digit to its ASCII equivalent

        TTL     Hex_ASCII

        AREA    Program, CODE, READONLY
        ENTRY

Main
        LDR     R0, Digit           ; load the digit
        LDR     R1, =Result         ; load the address for the result
        CMP     R0, #0xA            ; is the number < 10 decimal
        BLT     Add_0               ; then branch

        ADD     R0, R0, #"A"-"0"-0xA    ; add offset for 'A' to 'F'
Add_0
        ADD     R0, R0, #"0"        ; convert to ASCII
        STR     R0, [R1]            ; store the result
        SWI     &11

        AREA    Data1, DATA
Digit
        DCD     &0C                 ; the hex digit

        AREA    Data2, DATA
Result  DCD     0                   ; storage for result

        END

Program 3.12: Convert a 32 bit hexadecimal number to an ASCII string and output to the terminal

; now something a little more adventurous - convert a 32 bit
; hexadecimal number to an ASCII string and output to the terminal

        TTL     32bitHex_ASCII

        AREA    Program, CODE, READONLY
        ENTRY
Mask    EQU     0x0000000F

start
        LDR     R1, Digit           ; load the digit
        MOV     R4, #8              ; init counter
        MOV     R5, #28             ; control right shift
MainLoop
        MOV     R3, R1              ; copy original word
        MOV     R3, R3, LSR R5      ; right shift the correct number of bits
        SUB     R5, R5, #4          ; reduce the bit shift
        AND     R3, R3, #Mask       ; mask out all but the ls nibble
        CMP     R3, #0xA            ; is the number < 10 decimal
        BLT     Add_0               ; then branch

        ADD     R3, R3, #"A"-"0"-0xA    ; add offset for 'A' to 'F'

Add_0   ADD     R3, R3, #"0"        ; convert to ASCII
        MOV     R0, R3              ; prepare to output
        SWI     &0                  ; output to console
        SUBS    R4, R4, #1          ; decrement counter
        BNE     MainLoop

        MOV     R0, #&0D            ; add a CR character
        SWI     &0                  ; output it
        SWI     &11                 ; all done

        AREA    Data1, DATA
Digit   DCD     &DEADBEEF           ; the hex word

        END
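
The shift-and-mask loop of Program 3.12 can also be followed in C. The sketch below is only an illustration: the function name word_to_hex and the output buffer are assumptions made here, it writes the digits to a buffer instead of calling SWI &0, and it assumes an ASCII character set.

```c
#include <stdint.h>

/* Convert a 32-bit word to 8 upper-case hex digits, most significant
   nibble first. out needs room for 9 bytes (8 digits plus a NUL). */
void word_to_hex(uint32_t word, char out[9])
{
    int shift = 28;                          /* plays the role of R5 */
    for (int i = 0; i < 8; i++) {            /* R4 counts the 8 nibbles */
        uint32_t nibble = (word >> shift) & 0xFu;
        char c = (char)(nibble + '0');       /* '0'..'9' */
        if (nibble >= 10)
            c = (char)(c + ('A' - '0' - 10));    /* offset for 'A'..'F' */
        out[i] = c;
        shift -= 4;                          /* next nibble down */
    }
    out[8] = '\0';
}
```

Called with the word 0xDEADBEEF from the listing, the buffer receives the characters D E A D B E E F.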

Program 3.13: Convert a decimal number to seven segment binary

; convert a decimal number to seven segment binary

        TTL     Sevenseg

        AREA    Program, CODE, READONLY
        ENTRY

Main
        LDR     R0, =Data1          ; load the start address of the table
        EOR     R1, R1, R1          ; clear register for the code
        LDRB    R2, Digit           ; get the digit to encode
        CMP     R2, #9              ; is it a valid digit?
        BHI     Done                ; if not, the result stays zero

        ADD     R0, R0, R2          ; advance the pointer
        LDRB    R1, [R0]            ; and get the segment code
Done
        STR     R1, Result          ; store the result
        SWI     &11                 ; all done

        AREA    Data1, DATA
Table   DCB     &3F                 ; the binary conversions table
        DCB     &06
        DCB     &5B
        DCB     &4F
        DCB     &66
        DCB     &6D
        DCB     &7D
        DCB     &07
        DCB     &7F
        DCB     &6F
        ALIGN

        AREA    Data2, DATA
Digit   DCB     &05                 ; the number to convert
        ALIGN

        AREA    Data3, DATA
Result  DCD     0                   ; storage for result

        END
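
For readers more at home in C, the table look-up of Program 3.13 might be sketched as follows. The names seven_seg_table and seven_seg_encode are illustrative assumptions; the table values are the ones from the listing.

```c
#include <stdint.h>

/* Segment codes for the digits 0..9, copied from the listing. */
static const uint8_t seven_seg_table[10] = {
    0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07, 0x7F, 0x6F
};

/* Return the segment code for digit, or 0 when digit is not 0..9,
   matching the BHI Done path that leaves the result cleared. */
uint8_t seven_seg_encode(uint8_t digit)
{
    if (digit > 9)
        return 0;
    return seven_seg_table[digit];
}
```

With the digit 5 used in the listing, the function returns 0x6D.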

Program 3.14: Convert an ASCII numeric character to decimal

; convert an ASCII numeric character to decimal

        TTL     ASCII_Dec

        AREA    Program, CODE, READONLY
        ENTRY

Main
        MOV     R1, #-1             ; set -1 as error flag
        LDRB    R0, Char            ; get the character
        SUBS    R0, R0, #"0"        ; convert and check if character is < '0'
        BCC     Done                ; if so do nothing
        CMP     R0, #9              ; check if character is > '9'
        BHI     Done                ; if so do nothing
        MOV     R1, R0              ; otherwise keep the converted digit
Done
        STR     R1, Result          ; store the decimal no
        SWI     &11                 ; all done

        AREA    Data1, DATA
Char    DCB     &37                 ; ASCII representation of 7
        ALIGN

        AREA    Data2, DATA
Result  DCD     0                   ; storage for result

        END

Program 3.15: Convert an unpacked BCD number to binary

; convert an unpacked BCD number to binary

        TTL     BCD_Binary

        AREA    Program, CODE, READONLY
        ENTRY

Main
        ADR     R0, BCDNum          ; load address of BCD number
        MOV     R5, #4              ; init counter
        MOV     R1, #0              ; clear result register
        MOV     R2, #0              ; and final register

Loop
        ADD     R1, R1, R1          ; multiply by 2
        MOV     R3, R1
        MOV     R3, R3, LSL #2      ; mult by 8 (2 x 4)
        ADD     R1, R1, R3          ; x2 + x8 = mult by 10
NoMult
        LDRB    R4, [R0], #1        ; load digit and incr address
        ADD     R1, R1, R4          ; add the next digit
        SUBS    R5, R5, #1          ; decr counter
        BNE     Loop                ; if counter != 0, loop

        STR     R1, Result          ; store the result
        SWI     &11                 ; all done

        AREA    Data1, DATA
BCDNum  DCB     &02,&09,&07,&01     ; an unpacked BCD number
        ALIGN

        AREA    Data2, DATA
Result  DCD     0                   ; storage for result

        END

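
The multiply-by-ten trick in Program 3.15 (x2 plus x8) is perhaps easier to check in C. This is a sketch under the assumption that each input byte holds one decimal digit, most significant first; the function name bcd_to_binary is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert count unpacked BCD digits (one digit per byte, most
   significant digit first) to a binary value. */
uint32_t bcd_to_binary(const uint8_t *digits, size_t count)
{
    uint32_t value = 0;
    for (size_t i = 0; i < count; i++) {
        uint32_t twice = value + value;   /* ADD R1, R1, R1 : value x 2 */
        uint32_t eight = twice << 2;      /* LSL #2         : value x 8 */
        value = twice + eight;            /* x2 + x8 = x10 */
        value += digits[i];               /* add the next digit */
    }
    return value;
}
```

For the digits 2, 9, 7, 1 from the listing the result is 2971 decimal.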
Program 3.16: Convert an unpacked BCD number to binary using MUL

; convert an unpacked BCD number to binary using MUL

        TTL     BCD to Decimal

        AREA    Program, CODE, READONLY
        ENTRY

Main
        ADR     R0, BCDNum          ; load address of BCD number
        MOV     R5, #4              ; init counter
        MOV     R1, #0              ; clear result register
        MOV     R2, #0              ; and final register
        MOV     R7, #10             ; multiplication constant

Loop
        MOV     R6, R1              ; MUL needs Rd and Rm to be different registers
        MUL     R1, R6, R7          ; mult by 10
        LDRB    R4, [R0], #1        ; load digit and incr address
        ADD     R1, R1, R4          ; add the next digit
        SUBS    R5, R5, #1          ; decr counter
        BNE     Loop                ; if count != 0, loop

        STR     R1, Result          ; store the result
        SWI     &11                 ; all done

        AREA    Data1, DATA
BCDNum  DCB     &02,&09,&07,&01     ; an unpacked BCD number
        ALIGN

        AREA    Data2, DATA
Result  DCD     0                   ; storage for result

        END

Program 3.17: Find the length of a null terminated string

; find the length of a null terminated string

        TTL     Length of a String
        AREA    Program, CODE, READONLY
        ENTRY

Main
        ADR     R0, Data1           ; load the address of the string
        MOV     R1, #-1             ; start count at -1
Loop
        ADD     R1, R1, #1          ; increment count
        LDRB    R2, [R0], #1        ; load the next byte into R2
        CMP     R2, #0              ; is it the terminator
        BNE     Loop                ; if not, keep counting

        STR     R1, CharCount       ; otherwise done - store result
        SWI     &11

        AREA    Data1, DATA

Table
        DCB     "Hello, World", 0
        ALIGN

        AREA    Result, DATA
CharCount
        DCD     0                   ; storage for count (STR writes a full word)

        END

Program 6.8: factorial.s - Look up the factorial from a table by using the address of the memory location

; Look up the factorial from a table by
; using the address of the memory location

        TTL     factorial
        AREA    Program, CODE, READONLY
        ENTRY

Main
        LDR     R0, =DataTable      ; load the address of the lookup table
        LDR     R1, Value           ; offset of value to be looked up
        MOV     R1, R1, LSL#0x2     ; data is declared as 32 bit - need
                                    ; to quadruple the offset to point at the
                                    ; correct memory location
        ADD     R0, R0, R1          ; R0 now holds the address of the table entry
        LDR     R2, [R0]            ; fetch the factorial
        ADR     R3, Result          ; the address to store the answer
        STR     R2, [R3]            ; store the answer

        SWI     &11

        AREA    DataTable, DATA

        DCD     1                   ; 0! = 1; the data table containing the factorials
        DCD     1                   ; 1! = 1
        DCD     2                   ; 2! = 2
        DCD     6                   ; 3! = 6
        DCD     24                  ; 4! = 24
        DCD     120                 ; 5! = 120
        DCD     720                 ; 6! = 720
        DCD     5040                ; 7! = 5040
Value   DCB     5
        ALIGN
Result  DCD     0                   ; storage for result (STR writes a full word)

        END


Divide the 8-bit variable Value into two 4-bit nibbles and store one nibble in each byte of the 16-bit
variable Result. The low-order four bits of the byte will be stored in the low-order four bits of the
least significant byte of Result. The high-order four bits of the byte will be stored in the low-order
four bits of the most significant byte of Result.
Sample Problems
Input: Value = 5F
Output: Result = 050F
Program 6.5: splitbyte.s - Disassemble a byte into its high and low order nibbles

; Disassemble a byte into its high and low order nibbles

        TTL     split byte
        AREA    Program, CODE, READONLY
        ENTRY

Main
        LDR     R1, Value           ; Load the value to be disassembled
        LDR     R2, Mask            ; Load the bitmask
        MOV     R3, R1, LSR#0x4     ; Copy just the high order nibble into R3
        MOV     R3, R3, LSL#0x8     ; now left shift it one byte
        AND     R1, R1, R2          ; AND the number with the bitmask
        ADD     R1, R1, R3          ; Add the result of that to
                                    ; what we moved into R3
        STR     R1, Result          ; Store the result
        SWI     &11

Value   DCB     &FB                 ; Value to be split (gives Result = 0F0B)
        ALIGN                       ; keep the memory boundaries
Mask    DCW     &000F               ; bitmask = %0000000000001111
        ALIGN
Result  DCD     0                   ; Space to store result
        END
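
The same nibble split can be written as a short C function, which makes the sample problem easy to check. The name split_byte is an assumption introduced for this sketch.

```c
#include <stdint.h>

/* Place the high nibble of value in the low four bits of the upper
   result byte and the low nibble in the low four bits of the lower
   result byte. */
uint16_t split_byte(uint8_t value)
{
    uint16_t high = (uint16_t)((value >> 4) << 8);   /* LSR #4 then LSL #8 */
    uint16_t low  = (uint16_t)(value & 0x0F);        /* AND with the mask */
    return (uint16_t)(high | low);
}
```

The sample input 5F gives 050F, and the value FB stored in the listing gives 0F0B.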

64-Bit Addition
Add the contents of two 64-bit variables Value1 and Value2. Store the result in Result.
Sample Problems
Input: Value1 = 12A2E640, F2100123
Value2 = 001019BF, 40023F51
Output: Result = 12B30000, 32124074
Program 6.7: 64bitadd.s - 64 bit addition

; 64 bit addition

        TTL     64bitadd
        AREA    Program, CODE, READONLY
        ENTRY

Main
        ADR     R0, Value1          ; Pointer to first value
        LDMIA   R0, {R1, R2}        ; Load the value to be added
        ADR     R0, Value2          ; Pointer to second value
        LDMIA   R0, {R3, R4}        ; Load the value to be added
        ADDS    R6, R2, R4          ; Add lower 4 bytes and set carry flag
        ADC     R5, R1, R3          ; Add upper 4 bytes including carry
        ADR     R0, Result          ; Pointer to Result
        STMIA   R0, {R5, R6}        ; Store the result
        SWI     &5
        SWI     &11

Value1  DCD     &12A2E640, &F2100123    ; Value to be added
Value2  DCD     &001019BF, &40023F51    ; Value to be added
Result  DCD     0, 0                ; Space to store result (two words)
        END
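
To see how the ADDS/ADC pair propagates the carry, the addition can be modelled in C on two 32-bit halves. This is a sketch only; the u64_pair type and the add64 function are names invented for the illustration.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit words, high word first,
   in the same order as Value1 and Value2 in the listing. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} u64_pair;

u64_pair add64(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.lo = a.lo + b.lo;                 /* ADDS: wraps modulo 2^32 */
    uint32_t carry = (r.lo < a.lo);     /* 1 if the low add overflowed */
    r.hi = a.hi + b.hi + carry;         /* ADC: add the carry in */
    return r;
}
```

With the sample values the low words produce a carry, and the result is 12B30000, 32124074 as stated in the sample problem.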
Chapter 4:
Introduction to THUMB Instruction Set
Introduction:

• Thumb is a compressed, 16-bit representation of a subset of the ARM instruction set
– primarily to increase code density
– also increases performance in some cases
• It is not a complete architecture
• All ‘Thumb-aware’ cores also support the ARM instruction set
– therefore the Thumb architecture need only support common functions

The Thumb bit

• The ‘T’ bit in the CPSR controls the interpretation of the instruction stream
• Switch from ARM to Thumb (and back) by executing a BX instruction
• Exceptions also cause a switch to ARM code
• Exception return goes back symmetrically to ARM or Thumb code
• Note: do not change the T bit with MSR

4.1 Difference between ARM and THUMB Instruction Set


Although the ARM7 is a 32-bit processor, it has a second 16-bit instruction set called THUMB. The THUMB
instruction set is really a compressed form of the ARM instruction set.

Fig. 4.1 Thumb Instruction Processing


The THUMB instruction set is essential for achieving the code density needed to make small single-chip ARM7
micros usable. Instructions are stored in a 16-bit format, expanded into ARM instructions and then
executed. Although THUMB instructions result in lower code performance than ARM instructions, they
achieve a much higher code density. So, in order to build a reasonably sized application that will fit on a
small single-chip microcontroller, it is vital to compile your code as a mixture of ARM and THUMB
functions. This process is called interworking and is supported by all ARM compilers. By compiling code in
the THUMB instruction set you can get a space saving of 30%, while the same code compiled as ARM code
will run 40% faster.
The THUMB instruction set is much more like a traditional microcontroller instruction set. Unlike ARM
instructions, THUMB instructions are not conditionally executed (except for conditional branches). The data
processing instructions have a two-address format, where the destination register is one of the source registers:

ARM Instruction        THUMB Instruction      Operation

ADD R0, R0, R1         ADD R0, R1             R0 = R0 + R1

The THUMB instruction set does not contain MSR and MRS instructions, so you can only indirectly
affect the CPSR and SPSR. If you need to modify any user bits in the CPSR you must change to the ARM
instruction set. You can change the instruction set at any time by using the BX instruction (BLX is available
only on ARMv5T and later cores, not on the ARM7TDMI). After reset the ARM7 executes ARM (32-bit)
instructions, and whenever an exception occurs execution is automatically forced to the ARM (32-bit)
instruction set.
The THUMB instruction set has the more traditional PUSH and POP instructions for stack manipulation.
They implement a full descending stack, hardwired to R13 as the stack pointer.

Fig 4.3 Thumb Push and pop Processing


Finally, the THUMB instruction set does contain a SWI instruction which works in the same way as in the
ARM instruction set, but it has only an 8-bit number field, giving a maximum of 256 SWI numbers (0-255).
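
As an illustration of the difference in field width, a SWI handler has to mask the opcode differently depending on which instruction set the caller used. The sketch below assumes the standard encodings (Thumb SWI is 0xDF followed by an 8-bit number, ARM SWI carries a 24-bit number in its low bits); the function names are hypothetical.

```c
#include <stdint.h>

/* Thumb SWI opcode layout: 1101 1111 nnnn nnnn (8-bit number). */
uint32_t thumb_swi_number(uint16_t opcode)
{
    return opcode & 0x00FFu;
}

/* ARM SWI opcode layout: cond 1111 followed by a 24-bit number. */
uint32_t arm_swi_number(uint32_t opcode)
{
    return opcode & 0x00FFFFFFu;
}
```

For example, the Thumb opcode 0xDF11 and the ARM opcode 0xEF000011 both decode to SWI number 0x11.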
The Thumb instruction set has load and store multiple instructions like those of ARM and, in addition, has
a modified version of these instructions in the form of PUSH and POP that implement a full descending stack in
the conventional manner. When the processor is executing Thumb code and an exception occurs, it will switch to
the ARM instruction set in order to process the exception. When the CPSR is restored on return, the Thumb bit
is set again and the processor continues to run Thumb instructions.

Fig 4.3 Thumb Exception Processing

Table 4.1 ARM and Thumb instruction set features.

Particular                     ARM                                        Thumb

Instruction size               32-bit                                     16-bit
Core instructions              58                                         30
Conditional execution          most                                       only branch instructions
Data processing instructions   access to barrel shifter and ALU           separate barrel shifter and ALU instructions
Program status register        read-write in privileged mode              no direct access
Register usage                 15 general-purpose registers + pc          8 general-purpose registers + 7 high registers + pc
Addressing formats             most instructions use a 3-address format   most instructions use a 2-address format
Shift opcodes                  implements shifts as operand modifiers     explicit shift opcodes

4.2 Register usage in THUMB.
The THUMB instruction set does not have full access to all registers in the register file. All data
processing instructions have access to R0 –R7 (these are called the “low registers”.)

Fig.4.2 Thumb programmers model


In the THUMB programmer's model all instructions have access to R0-R7. Access to R8-R12 (the “high
registers”) is restricted to a few instructions: MOV, ADD and CMP.

4.3 ARM –THUMB interworking with BX and BLX instruction

One of the most important features of the ARM7 CPU is its ability to run 16-bit THUMB code and 32-bit
ARM code. In order to get a reasonably complex application to fit into the on-chip FLASH memory, it is very
important to interwork these two instruction sets so that most of the application code is encoded in the THUMB
instruction set and is effectively compressed to take minimal space in the on-chip FLASH memory. Any
time-critical routines where the full processing power of the ARM7 is required need to be encoded in the ARM
32-bit instruction set. When generating the code, the compiler must be enabled to allow interworking. This is
achieved with the following switch:
-mthumb-interwork
The GCC compiler is designed to compile a given C module in either the THUMB or ARM instruction
set. Therefore you must lay out your source code so that each module only contains functions that will be encoded
as ARM or THUMB functions. By default the compiler will encode all source code in the ARM instruction set. To
force a module to be encoded in the THUMB instruction set, use the following switch when you compile the
code:
-mthumb
The linker can then take both ARM and THUMB object files and produce a final executable that
interworks both instruction sets.
• The T (Thumb) extension shrinks the ARM instruction set to a 16-bit word length, giving a 35-40% saving
in the amount of memory compared to the 32-bit instruction set
• The extension enables a simpler and significantly cheaper realization of a processor system. Instructions
take only half the memory of the 32-bit instruction set, without a significant decrease in performance or
increase in code size.
• The extension is made to the instruction decoder in the processor pipeline
• Registers are preserved as 32-bit but only half of them are

Thumb extension
• The Thumb instruction decoder is placed in the pipeline
• The change to Thumb mode happens by switching the state of the multiplexers feeding the instruction
decoders and data bus
• A1 selects the 16-bit half word from the 32-bit bus

Example of instruction conversion


• The Thumb instruction ADD Rd, #constant is converted to the unconditionally executed ARM instruction
ADDS Rd, Rd, #constant
• Only the lower register set is in use, so the upper register bit is fixed to zero and the source and
destination are equal. The constant is also 8 bits wide instead of the 12 bits available in ARM mode
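
The conversion described above can be imitated in software. The following C sketch expands the 16-bit Thumb encoding of ADD Rd, #constant into the 32-bit ARM encoding of ADDS Rd, Rd, #constant. It is only a model of what the decoder does in hardware; the function name is hypothetical and the bit layouts used are the standard Thumb "move/compare/add/subtract immediate" and ARM data processing formats.

```c
#include <stdint.h>

/* Expand Thumb "ADD Rd, #imm8" (bits 15..11 = 00110, bits 10..8 = Rd,
   bits 7..0 = imm8) into ARM "ADDS Rd, Rd, #imm8".
   Returns 0 if the opcode is not a Thumb ADD-immediate. */
uint32_t expand_thumb_add_imm8(uint16_t thumb)
{
    if ((thumb & 0xF800u) != 0x3000u)
        return 0;                          /* some other Thumb instruction */

    uint32_t rd   = (thumb >> 8) & 0x7u;   /* low register only: R0-R7 */
    uint32_t imm8 = thumb & 0xFFu;

    /* 0xE29 = condition AL, immediate operand, opcode ADD, S bit set.
       Rd is used as both first operand (bits 19..16) and destination
       (bits 15..12); the top bit of each register field stays zero. */
    return 0xE2900000u | (rd << 16) | (rd << 12) | imm8;
}
```

For instance, the Thumb opcode 0x3001 (ADD R0, #1) expands to 0xE2900001 (ADDS R0, R0, #1).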
Changing the mode
• Execute BX (Branch and eXchange) to the address at which the Thumb code begins, with bit 0 of the target
address set; the BX instruction itself sets the T flag in the CPSR
• The same memory space can contain mixed native ARM code and Thumb code
• The execution speed of 32-bit ARM code decreases significantly if the system uses only a 16-bit data bus
• If native ARM code is used, it is typically contained in a separate ROM area as a part of an ASIC (ASSP)
chip
• Return from Thumb code to native ARM code is made by executing BX to the desired address with bit 0
clear, which resets the T flag
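
The effect of BX on the T flag and the program counter can be summarised with a small C model. This is a sketch, not a description of the hardware: the cpu_state type and the bx function are names made up for the example, and ARM-state targets are assumed to be word-aligned.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t pc;      /* address of the next instruction */
    bool     thumb;   /* models the T bit of the CPSR */
} cpu_state;

/* Model of BX Rn: bit 0 of the register value selects the
   instruction set and is not part of the branch address. */
cpu_state bx(uint32_t target)
{
    cpu_state s;
    s.thumb = (target & 1u) != 0;   /* bit 0 set -> Thumb, clear -> ARM */
    s.pc    = target & ~1u;         /* branch address with bit 0 cleared */
    return s;
}
```

A branch to 0x8001 therefore continues at address 0x8000 executing Thumb instructions, while a branch to 0x4000 continues at 0x4000 executing ARM instructions.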

THUMB instruction set


The following table summarizes the THUMB instruction set.

Thumb Instruction        Description                      ARM Equivalent Instruction

ADC Rd, Rs               Add with Carry                   ADCS Rd, Rd, Rs
ADD Rd, Rs, Rn           Add                              ADDS Rd, Rs, Rn
AND Rd, Rs               Logical AND                      ANDS Rd, Rd, Rs
ASR Rd, Rs               Arithmetic Shift Right           MOVS Rd, Rd, ASR Rs
B label                  Unconditional Branch             BAL label (half word offset)
Bxx label                Conditional Branch               Bxx label (xx = condition code)
BIC Rd, Rs               Bit Clear                        BICS Rd, Rd, Rs
BL label                 Branch and Link                  none
BX Rs                    Branch and Exchange              BX Rs
CMN Rd, Rs               Compare Negative                 CMN Rd, Rs
CMP Rd, #Offset8         Compare                          CMP Rd, #Offset8
CMP Rd, Rs               Compare                          CMP Rd, Rs
EOR Rd, Rs               Exclusive OR                     EORS Rd, Rd, Rs
LDMIA Rb!, { Rlist }     Load multiple                    LDMIA Rb!, { Rlist }
LDR Rd, [Rb, Ro]         Load word                        LDR Rd, [Rb, Ro]
LDRB Rd, [Rb, Ro]        Load byte                        LDRB Rd, [Rb, Ro]
LDRH Rd, [Rb, Ro]        Load half word                   LDRH Rd, [Rb, Ro]
LSL Rd, Rs               Logical Shift Left               MOVS Rd, Rd, LSL Rs
LDSB Rd, [Rb, Ro]        Load sign-extended byte          LDRSB Rd, [Rb, Ro]
LDSH Rd, [Rb, Ro]        Load sign-extended half word     LDRSH Rd, [Rb, Ro]
LSR Rd, Rs               Logical Shift Right              MOVS Rd, Rd, LSR Rs
MOV Rd, #Offset8         Move immediate                   MOVS Rd, #Offset8
MUL Rd, Rs               Multiply                         MULS Rd, Rs, Rd
MVN Rd, Rs               Move Negative register           MVNS Rd, Rs
NEG Rd, Rs               Negate                           RSBS Rd, Rs, #0
ORR Rd, Rs               Logical OR                       ORRS Rd, Rd, Rs
POP { Rlist }            Pop registers                    LDMIA R13!, { Rlist }
POP { Rlist, PC }        Pop registers and PC             LDMIA R13!, { Rlist, R15 }
PUSH { Rlist }           Push registers                   STMDB R13!, { Rlist }
PUSH { Rlist, LR }       Push registers and LR            STMDB R13!, { Rlist, R14 }
ROR Rd, Rs               Rotate Right                     MOVS Rd, Rd, ROR Rs
SBC Rd, Rs               Subtract with Carry              SBCS Rd, Rd, Rs
STMIA Rb!, { Rlist }     Store Multiple                   STMIA Rb!, { Rlist }
STR Rd, [Rb, Ro]         Store word                       STR Rd, [Rb, Ro]
STRB Rd, [Rb, Ro]        Store byte                       STRB Rd, [Rb, Ro]
STRH Rd, [Rb, Ro]        Store half word                  STRH Rd, [Rb, Ro]
SWI Value8               Software Interrupt               SWI Value8
SUB Rd, #Offset8         Subtract                         SUBS Rd, Rd, #Offset8
SUB Rd, Rs, Rn           Subtract                         SUBS Rd, Rs, Rn
TST Rd, Rs               Test bits                        TST Rd, Rs

Data processing instructions:


The data processing instructions work on data held in the registers. They include arithmetic, shift,
logical, move, multiply and comparison instructions. The Thumb data processing instructions are a subset
of the ARM data processing instructions.

Arithmetic instructions:

These include Addition, Subtraction and multiply instructions.

Addition instructions.
This group consists of the following instructions.
1. ADC Rd, Rs ; Rd := Rd + Rs + Carry bit
This instruction adds the content of source register Rs and the carry bit to destination register Rd
and stores the result in Rd. Flags affected are N Z C V.
2. ADD Rd, Rs, Rn ; Rd := Rs + Rn
This instruction adds the contents of source registers Rs and Rn and stores the result in
destination register Rd. Flags affected are N Z C V.
3. ADD Rd, Rn, #<immed> ; Rd := Rn + immed
This instruction adds the immediate value to the content of source register Rn and stores the
result in destination register Rd. Flags affected are N Z C V. Immediate range 0-7.
4. ADD Rd, Rm ; Rd := Rd + Rm
This instruction adds the content of source register Rm to destination register Rd and stores the
result in Rd. At least one of the two registers must be a high register (Lo to Lo is not allowed in this
form). Flags not affected.
5. ADD Rd, #<immed> ; Rd := Rd + immed
This instruction adds the immediate value to the content of destination register Rd and stores the
result in Rd. Flags affected are N Z C V. Immediate range 0-255.
6. ADD SP, #<immed> ; R13 := R13 + immed
This instruction adds the immediate value to the content of the stack pointer (R13) and stores the
result in SP (R13). Flags are not affected. Immediate range 0-508 (word-aligned).
7. ADD Rd, SP, #<immed> ; Rd := R13 + immed
This instruction adds the content of the stack pointer (R13) and the immediate value and stores the
result in destination register Rd. Flags are not affected. Immediate range 0-1020 (word-aligned).
8. ADD Rd, PC, #<immed> ; Rd := (R15 AND 0xFFFFFFFC) + immed
This instruction adds the word-aligned content of the PC (R15) and the immediate value and stores the
result in destination register Rd. Flags are not affected. Immediate range 0-1020 (word-aligned).
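
How the N, Z, C and V flags follow from an addition can be modelled in a few lines of C. The sketch below imitates ADC on 32-bit operands; the alu_out type and the adc32 function are assumptions introduced for illustration and are not part of any toolchain.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t result;
    bool n, z, c, v;    /* condition flags after the operation */
} alu_out;

/* Model of ADC: result = a + b + carry_in, with flags. */
alu_out adc32(uint32_t a, uint32_t b, bool carry_in)
{
    alu_out o;
    uint64_t wide = (uint64_t)a + b + (carry_in ? 1u : 0u);

    o.result = (uint32_t)wide;
    o.n = (o.result >> 31) != 0;     /* bit 31 of the result */
    o.z = (o.result == 0);
    o.c = (wide >> 32) != 0;         /* carry out of bit 31 */
    /* Signed overflow: both operands have the same sign and the
       sign of the result differs from it. */
    o.v = ((~(a ^ b) & (a ^ o.result)) >> 31) != 0;
    return o;
}
```

Adding 1 to 0xFFFFFFFF gives a zero result with Z and C set, while adding 1 to 0x7FFFFFFF gives 0x80000000 with N and V set.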

Subtraction instructions.
This group consists of the following instructions.
1. SUB Rd, Rn, Rm ; Rd := Rn - Rm
This instruction subtracts the content of Rm from register Rn and stores the result in destination
register Rd. Flags affected are N Z C V.
2. SUB Rd, Rn, #<immed> ; Rd := Rn - immed
This instruction subtracts the immediate value from register Rn and stores the result in
destination register Rd. Flags affected are N Z C V. Immediate range 0-7.
3. SUB Rd, #<immed> ; Rd := Rd - immed
This instruction subtracts the immediate value from register Rd and stores the result in
destination register Rd. Flags affected are N Z C V. Immediate range 0-255.
4. SBC Rd, Rm ; Rd := Rd - Rm - NOT C-bit
This instruction subtracts the content of register Rm and the complemented value of the carry bit
(borrow) from register Rd and stores the result in destination register Rd. Flags affected are N Z C V.
5. SUB SP, #<immed> ; R13 := R13 - immed
This instruction subtracts the immediate value from the content of the stack pointer (SP) register
R13 and stores the result in SP (R13). Immediate range 0-508 (word-aligned). Flags not affected.
6. NEG Rd, Rm ; Rd := - Rm
This instruction forms the two's complement of the content of source register Rm and stores
the result in Rd. Flags affected are N Z C V.
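
The "NOT C-bit" term in SBC is a common source of confusion: on ARM the carry flag after a subtraction means "no borrow". A C model makes this concrete. The sub_out type and sbc32 function below are hypothetical names used only for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t result;
    bool n, z, c, v;    /* condition flags after the operation */
} sub_out;

/* Model of SBC: result = a - b - NOT(carry_in).
   Computed as a + NOT(b) + carry_in, as an adder would do it. */
sub_out sbc32(uint32_t a, uint32_t b, bool carry_in)
{
    sub_out o;
    uint64_t wide = (uint64_t)a + (uint32_t)~b + (carry_in ? 1u : 0u);

    o.result = (uint32_t)wide;
    o.n = (o.result >> 31) != 0;
    o.z = (o.result == 0);
    o.c = (wide >> 32) != 0;         /* 1 = no borrow, 0 = borrow */
    /* Signed overflow: operands differ in sign and the result's
       sign differs from the first operand. */
    o.v = (((a ^ b) & (a ^ o.result)) >> 31) != 0;
    return o;
}
```

With the carry set, 5 - 3 gives 2 and leaves C set (no borrow). With the carry clear the same operands give 1, and 3 - 5 with the carry set gives 0xFFFFFFFE with C clear (borrow) and N set.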

Multiplication instruction.
This group consists of the following instruction.

1. MUL Rd, Rm ; Rd := Rm * Rd
This instruction multiplies the content of Rd by the content of Rm and stores the result in Rd.
Flags affected are N Z. C and V are unpredictable.

Compare instructions.
This group consists of the following instructions.

1. CMP Rn, Rm ; update CPSR flags on Rn - Rm
This instruction subtracts the content of Rm from Rn and updates the flags, without altering the
content of Rn or Rm. Flags affected are N Z C V.
2. CMN Rn, Rm ; update CPSR flags on Rn - (- Rm)
This instruction subtracts the negated value of Rm from Rn (that is, it forms Rn + Rm) and updates the
flags, without altering the content of Rn or Rm. Flags affected are N Z C V.
3. CMP Rn, #<immed> ; update CPSR flags on Rn - immed
This instruction subtracts the immediate value from the content of Rn and updates the flags, without
altering the content of Rn. Flags affected are N Z C V. Immediate range 0-255.
4. NOP ; No operation
This instruction performs no operation. Flags are not affected.

Logical instructions:

These include the AND, OR, NOT, Exclusive-OR, Bit Clear and Test instructions.

1. AND Rd, Rm ; Rd: = Rd AND Rm


This instruction performs the logical AND operation between the content of Source register Rm
and destination register Rd and stores the result in destination register Rd. Flags affected are N and Z.
2. EOR Rd, Rm ; Rd: = Rd XOR Rm
This instruction performs the logical X-OR operation between the content of Source register
Rm and destination register Rd and stores the result in destination register Rd. Flags affected are N and Z.
3. ORR Rd, Rm ; Rd: = Rd OR Rm
This instruction performs the logical OR operation between the content of Source register Rm
and destination register Rd and stores the result in destination register Rd. Flags affected are N and Z.
4. BIC Rd, Rm ; Rd: = Rd AND NOT Rm
This instruction performs the logical AND operation between the complemented value of
Source register Rm and destination register Rd and stores the result in destination register Rd. Flags
affected are N and Z.
5. MVN Rd, Rm ; Rd: = NOT Rm
This instruction moves the complemented value of Source register Rm to destination register Rd
Flags affected are N and Z.
6. TST Rn, Rm ; update CPSR flags on Rn AND Rm
This instruction performs a logical AND operation between registers Rn and Rm and updates the
flags, without changing the content of either register. Flags affected are N and Z.
Shift/Rotate instructions:

These include the logical shift left, logical shift right, arithmetic shift right and rotate instructions.
1. LSL Rd, Rm, #<shift> ; Rd := Rm << shift
This instruction performs a logical shift left by the immediate shift value on the content of
Rm and stores the result in destination register Rd. Flags affected are N Z and C. Immediate shift value is
between 0-31. C flag unaffected if shift is 0.
2. LSL Rd, Rs ; Rd := Rd << Rs[7:0]
This instruction performs a logical shift left on the content of Rd by the shift value stored in
register Rs and stores the result in destination register Rd. Flags affected are N Z and C.
C flag unaffected if shift is 0.
3. LSR Rd, Rm, #<shift> ; Rd := Rm >> shift
This instruction performs a logical shift right by the immediate shift value on the content of
Rm and stores the result in destination register Rd. Flags affected are N Z and C. Immediate shift value is
between 1-32.
4. LSR Rd, Rs ; Rd := Rd >> Rs[7:0]
This instruction performs a logical shift right on the content of Rd by the shift value stored in
register Rs and stores the result in destination register Rd. Flags affected are N Z and C. C flag
unaffected if shift is 0.
5. ASR Rd, Rm, #<shift> ; Rd := Rm ASR shift
This instruction performs an arithmetic shift right by the immediate shift value on the
content of register Rm and stores the result in destination register Rd. Flags affected are N Z and C.
Immediate shift value is between 1-32.
6. ASR Rd, Rs ; Rd := Rd ASR Rs[7:0]
This instruction performs an arithmetic shift right on the content of register Rd by the shift value
stored in register Rs and stores the result in destination register Rd. Flags affected are N Z and
C. C flag unaffected if shift is 0.
7. ROR Rd, Rs ; Rd := Rd ROR Rs[7:0]
This instruction rotates the content of register Rd to the right by the value stored in
Rs and stores the result in destination register Rd. Flags affected are N Z and C. C flag unaffected if
rotate is 0.
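
The difference between the logical, arithmetic and rotate forms is easiest to see on a value with bit 31 set. The C helpers below are a sketch of the three right-going operations on 32-bit values (results only, flags are not modelled); the function names are assumptions made for this example.

```c
#include <stdint.h>

/* Logical shift right: zeros enter from the left. */
uint32_t lsr32(uint32_t x, unsigned n)
{
    return (n >= 32) ? 0u : (x >> n);
}

/* Arithmetic shift right: copies of the sign bit enter from the left. */
uint32_t asr32(uint32_t x, unsigned n)
{
    uint32_t sign = (x >> 31) ? 0xFFFFFFFFu : 0u;
    if (n == 0)
        return x;
    if (n >= 32)
        return sign;                      /* every bit becomes the sign bit */
    return (x >> n) | (sign << (32 - n));
}

/* Rotate right: bits leaving on the right re-enter on the left. */
uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (n == 0) ? x : ((x >> n) | (x << (32 - n)));
}
```

Shifting 0x80000000 right by 4 gives 0x08000000 with the logical shift but 0xF8000000 with the arithmetic shift, and rotating 0x00000001 right by 1 gives 0x80000000.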

Move instructions:

1. MOV Rd, #<immed> ; Rd := immed
This instruction moves (copies) the immediate value to destination register Rd. Flags affected are
N Z. Immediate value is in the range 0-255.
2. MOV Rd, Rm ; Rd := Rm
This instruction moves (copies) the content of register Rm to destination register Rd. Hi
(R8-R15) to Lo (R0-R7), Lo (R0-R7) to Hi (R8-R15) and Hi (R8-R15) to Hi (R8-R15) are allowed, but Lo
(R0-R7) to Lo (R0-R7) is not allowed in this form. Flags not affected.
For example:
MOV Rd, Hs ; Move a value from a register in the range 8-15 to a register in the range 0-7.
MOV Hd, Rs ; Move a value from a register in the range 0-7 to a register in the range 8-15.
MOV Hd, Hs ; Move a value between two registers in the range 8-15.
3. MOV Rd, Rm ; Rd := Rm
This instruction moves (copies) the content of register Rm to destination register Rd. Flags
affected are N Z; the C and V flags are cleared. Only Lo (R0-R7) to Lo (R0-R7) is allowed.
4. CPY Rd, Rm ; Rd := Rm
This instruction moves (copies) the content of register Rm to destination register Rd. Any
register to any register. Flags not affected.

Load instructions:

1. LDR Rd, [Rn, #<immed>] ; Rd := [Rn + immed]
This instruction loads the destination register Rd with the word from the memory location
addressed by the sum of the content of Rn and the immediate value. The immediate value is in the range
0-124 and must be a multiple of 4.
2. LDRH Rd, [Rn, #<immed>] ; Rd := Zero Extend([Rn + immed][15:0])
This instruction loads bits 15:0 of the destination register Rd with the 16-bit (halfword) data
from the memory location addressed by the sum of the content of Rn and the immediate value, and clears
bits 16 to 31. The immediate value is in the range 0-62 and must be even.
3. LDRB Rd, [Rn, #<immed>] ; Rd := Zero Extend([Rn + immed][7:0])
This instruction loads bits 7:0 of the destination register Rd with the 8-bit (byte) data from the
memory location addressed by the sum of the content of Rn and the immediate value, and clears bits 8 to
31. The immediate value is in the range 0-31.
4. LDR Rd, [Rn, Rm] ; Rd := [Rn + Rm]
This instruction loads the destination register Rd with the 32-bit (word) data from the memory
location addressed by the sum of the contents of Rn and Rm.
5. LDRH Rd, [Rn, Rm] ; Rd := Zero Extend([Rn + Rm][15:0])
This instruction loads bits 15:0 of the destination register Rd with the 16-bit (halfword) data
from the memory location addressed by the sum of the contents of Rn and Rm, and clears bits 16 to 31.
6. LDRSH Rd, [Rn, Rm] ; Rd := Sign Extend([Rn + Rm][15:0])
This instruction loads bits 15:0 of the destination register Rd with the 16-bit (halfword) data
from the memory location addressed by the sum of the contents of Rn and Rm, and copies the sign bit
(bit 15) into bits 16 to 31.
7. LDRB Rd, [Rn, Rm] ; Rd := Zero Extend([Rn + Rm][7:0])
This instruction loads bits 7:0 of the destination register Rd with the 8-bit (byte) data from the
memory location addressed by the sum of the contents of Rn and Rm, and clears bits 8 to 31.
8. LDRSB Rd, [Rn, Rm] ; Rd := Sign Extend([Rn + Rm][7:0])
This instruction loads bits 7:0 of the destination register Rd with the 8-bit (byte) data from the
memory location addressed by the sum of the contents of Rn and Rm, and copies the sign bit (bit 7) into
bits 8 to 31.
9. LDR Rd, [PC, #<immed>] ; Rd := [(R15 AND 0xFFFFFFFC) + immed]
This instruction loads the destination register Rd with the word from the memory location
addressed by the sum of the word-aligned content of R15 (bits 1:0 treated as 0) and the immediate value.
The immediate value is in the range 0-1020 and must be a multiple of 4.
10. LDR Rd, [SP, #<immed>] ; Rd := [R13 + immed]
This instruction loads the destination register Rd with the word from the memory location
addressed by the sum of the content of R13 and the immediate value. The immediate value is in the range
0-1020 and must be a multiple of 4.
11. LDMIA Rn!, <reglist> ; Loads list of registers
This instruction loads the registers in the reglist with words from memory, starting at the
location addressed by the content of Rn and incrementing the address after each word. Rn is updated
(written back) to point past the last word loaded.
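
The offset ranges quoted above follow a simple pattern: the Rn-relative forms behave like a 5-bit count
of transfer-sized units, and the PC- and SP-relative forms like an 8-bit count of words. The C sketch
below checks an offset against those ranges and computes the PC-relative address with the bottom two
bits of R15 cleared. The function names are invented for this illustration and are not part of any
ARM tool.

```c
#include <stdbool.h>
#include <stdint.h>

/* Immediate offset check for LDR/LDRH/LDRB Rd, [Rn, #immed].
   size_bytes is 4 for LDR, 2 for LDRH and 1 for LDRB. */
bool imm5_offset_ok(uint32_t size_bytes, uint32_t offset)
{
    if (size_bytes != 1 && size_bytes != 2 && size_bytes != 4)
        return false;
    /* Must be a multiple of the transfer size and fit in 5 bits once scaled. */
    return offset % size_bytes == 0 && offset / size_bytes <= 31;
}

/* Immediate offset check for LDR Rd, [PC, #immed] and LDR Rd, [SP, #immed]. */
bool imm8_word_offset_ok(uint32_t offset)
{
    return offset % 4 == 0 && offset / 4 <= 255;   /* 0-1020, multiple of 4 */
}

/* Address used by LDR Rd, [PC, #immed]: (R15 AND 0xFFFFFFFC) + immed. */
uint32_t pc_relative_address(uint32_t pc, uint32_t offset)
{
    return (pc & 0xFFFFFFFCu) + offset;
}
```

With these helpers, an offset of 124 is accepted for a word load but 128 is not, and a PC value of
0x1002 with an offset of 8 addresses location 0x1008.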

Store instructions:

1. STR Rd, [Rn, #<immed>] ; [Rn + immed] := Rd
This instruction stores the word held in source register Rd into the memory location addressed
by the sum of the content of Rn and the immediate value. The immediate value is in the range 0-124 and
must be a multiple of 4.
2. STRH Rd, [Rn, #<immed>] ; [Rn + immed][15:0] := Rd[15:0]
This instruction stores the halfword held in the lower 16 bits of source register Rd (bits 15:0)
into the memory location addressed by the sum of the content of Rn and the immediate value. The
immediate value is in the range 0-62 and must be even.
3. STRB Rd, [Rn, #<immed>] ; [Rn + immed][7:0] := Rd[7:0]
This instruction stores the byte held in the lower 8 bits of source register Rd (bits 7:0) into the
memory location addressed by the sum of the content of Rn and the immediate value. Bits 8 to 31 of Rd
are ignored. The immediate value is in the range 0-31.
4. STR Rd, [Rn, Rm] ; [Rn + Rm] := Rd
This instruction stores the word held in source register Rd into the memory location
addressed by the sum of the contents of Rn and Rm.
5. STRH Rd, [Rn, Rm] ; [Rn + Rm][15:0] := Rd[15:0]
This instruction stores the halfword held in the lower 16 bits of source register Rd (bits 15:0)
into the memory location addressed by the sum of the contents of Rn and Rm. Bits 16 to 31 of Rd are
ignored.
6. STRB Rd, [Rn, Rm] ; [Rn + Rm][7:0] := Rd[7:0]
This instruction stores the byte held in the lower 8 bits of source register Rd (bits 7:0) into the
memory location addressed by the sum of the contents of Rn and Rm. Bits 8 to 31 of Rd are ignored.
7. STR Rd, [SP, #<immed>] ; [R13 + immed] := Rd
This instruction stores the word held in source register Rd into the memory location
addressed by the sum of the content of R13 and the immediate value. The immediate value is in the range
0-1020 and must be a multiple of 4.
8. STMIA Rn!, <reglist> ; Stores list of registers
This instruction stores the contents of the registers in the register list as words in memory,
starting at the location addressed by the content of Rn and incrementing the address after each word. Rn
is updated (written back) to point past the last word stored.

Push/Pop Instructions

1. PUSH <loreglist> ; Push registers onto stack
This instruction stores (pushes) the content of the registers specified in the list onto the stack
and updates the stack pointer. The list may contain Lo registers (R0-R7) and, optionally, LR.
Example.
PUSH {R0-R4, LR} ; Store R0, R1, R2, R3, R4 and R14 (LR) at the
stack memory pointed to by R13 (SP) and update R13 (SP).
2. POP <loreglist> ; Pop registers from stack
This instruction loads (pops) the registers specified in the list from the stack and updates the
stack pointer. The list may contain Lo registers (R0-R7) and, optionally, PC; LR cannot be popped
directly in Thumb state.
Example.
POP {R0-R4, PC} ; Load R0, R1, R2, R3, R4 and R15 (PC) from
the stack memory pointed to by R13 (SP) and update R13 (SP).
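
A small C model may help to picture what "update the stack pointer" means. It assumes the usual ARM
arrangement of a full descending stack, in which SP moves down before a push and the lowest-numbered
register ends up at the lowest address. The type sim_stack and the functions sim_push and sim_pop are
invented for this sketch, and the caller is assumed to keep within the simulated stack.

```c
#include <stdint.h>

#define STACK_WORDS 16   /* hypothetical size of the simulated stack */

typedef struct {
    uint32_t mem[STACK_WORDS]; /* stack memory, one element per word */
    unsigned sp;               /* index of the last pushed word; STACK_WORDS when empty */
} sim_stack;

/* Push `count` values; regs[0] stands for the lowest-numbered register. */
void sim_push(sim_stack *s, const uint32_t *regs, unsigned count)
{
    unsigned i;

    s->sp -= count;                       /* SP moves down first (descending stack) */
    for (i = 0; i < count; i++)
        s->mem[s->sp + i] = regs[i];      /* lowest register at the lowest address */
}

/* Pop `count` values back in the same order and move SP up again. */
void sim_pop(sim_stack *s, uint32_t *regs, unsigned count)
{
    unsigned i;

    for (i = 0; i < count; i++)
        regs[i] = s->mem[s->sp + i];
    s->sp += count;
}
```

Starting with sp equal to STACK_WORDS, a push of three values leaves sp at STACK_WORDS - 3 with the
first value at mem[sp]; the matching pop restores both the values and the stack pointer.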

Branching instructions.

1. B{cond} label ; R15 := label
This instruction loads R15 with the address of label, and so jumps to that location, if the
condition in the condition field is true. label must be within -252 to +258 bytes of the current
instruction. See Table Condition Field.
2. B label ; R15 := label
This instruction loads R15 with the address of label and jumps to that location
unconditionally. label must be within ±2 KB of the current instruction.
3. BL label ; R14 := address of next instruction, R15 := label
Branch with link. This instruction saves the address of the next instruction in R14 (LR), loads
R15 with the address of label and jumps to that location unconditionally. label must be within ±4 MB of
the current instruction.
4. BX Rm ; R15 := Rm AND 0xFFFFFFFE
Branch and exchange. This instruction loads R15 with the content of Rm (with bit 0 cleared)
and jumps to that location unconditionally. The processor changes to ARM state if Rm[0] = 0 and
remains in Thumb state otherwise.
5. BLX label ; R14 := address of next instruction, R15 := label
Branch with link and exchange. This instruction saves the address of the next instruction in
R14 (LR), loads R15 with the address of label, jumps to that location unconditionally and changes to
ARM state. label must be within ±4 MB of the current instruction.
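
The slightly odd reach of -252 to +258 bytes for a conditional branch can be checked with a little
arithmetic. The sketch below assumes the usual Thumb encoding, in which the branch offset is a signed
count of halfwords (8 bits for B{cond}, 11 bits for B) measured from the PC, and the PC reads as the
address of the branch instruction plus 4. The function names are invented for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if (target - PC) is even and lies in [lo, hi], where PC = instr_addr + 4. */
static bool offset_fits(uint32_t instr_addr, uint32_t target, int64_t lo, int64_t hi)
{
    int64_t offset = (int64_t)target - ((int64_t)instr_addr + 4);

    return offset % 2 == 0 && offset >= lo && offset <= hi;
}

/* B{cond} label: signed 8-bit halfword offset, that is -256 to +254 bytes from the PC. */
bool bcond_reaches(uint32_t instr_addr, uint32_t target)
{
    return offset_fits(instr_addr, target, -256, 254);
}

/* B label: signed 11-bit halfword offset, that is -2048 to +2046 bytes from the PC. */
bool b_reaches(uint32_t instr_addr, uint32_t target)
{
    return offset_fits(instr_addr, target, -2048, 2046);
}
```

For a conditional branch at address 0x1000, targets from 0x1000 - 252 up to 0x1000 + 258 are accepted,
which matches the range quoted for B{cond}.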
Extend instructions

1. SXTH Rd, Rm ; Rd [31:0] := Sign Extend (Rm [15:0])
Sign extend halfword to word. This instruction loads bits 0 to 15 of Rd with the content of bits
0 to 15 of Rm and loads bits 16 to 31 of Rd with the value of the sign bit (bit 15) of Rm.
2. SXTB Rd, Rm ; Rd [31:0] := Sign Extend (Rm [7:0])
Sign extend byte to word. This instruction loads bits 0 to 7 of Rd with the content of bits 0 to
7 of Rm and loads bits 8 to 31 of Rd with the value of the sign bit (bit 7) of Rm.
3. UXTH Rd, Rm ; Rd [31:0] := Zero Extend (Rm [15:0])
Zero extend halfword to word. This instruction loads bits 0 to 15 of Rd with the content of bits
0 to 15 of Rm and loads bits 16 to 31 of Rd with 0.
4. UXTB Rd, Rm ; Rd [31:0] := Zero Extend (Rm [7:0])
Zero extend byte to word. This instruction loads bits 0 to 7 of Rd with the content of bits 0
to 7 of Rm and loads bits 8 to 31 of Rd with 0.
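
The difference between sign extension and zero extension is easy to see in C. The four functions below
are a sketch of the results described above, written with unsigned arithmetic only so that they do not
depend on how a particular compiler converts signed values. The function names simply mirror the
mnemonics and are not library routines.

```c
#include <stdint.h>

/* UXTB: keep bits 7:0, clear bits 31:8. */
uint32_t uxtb(uint32_t rm)
{
    return rm & 0xFFu;
}

/* UXTH: keep bits 15:0, clear bits 31:16. */
uint32_t uxth(uint32_t rm)
{
    return rm & 0xFFFFu;
}

/* SXTB: keep bits 7:0 and copy bit 7 into bits 31:8. */
uint32_t sxtb(uint32_t rm)
{
    uint32_t v = rm & 0xFFu;

    return (v ^ 0x80u) - 0x80u;     /* wraps to 0xFFFFFFxx when bit 7 is set */
}

/* SXTH: keep bits 15:0 and copy bit 15 into bits 31:16. */
uint32_t sxth(uint32_t rm)
{
    uint32_t v = rm & 0xFFFFu;

    return (v ^ 0x8000u) - 0x8000u; /* wraps to 0xFFFFxxxx when bit 15 is set */
}
```

Given Rm = 0x12345680, uxtb returns 0x00000080 while sxtb returns 0xFFFFFF80, because bit 7 of the
byte is set.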
Chapter 5
ARM PROGRAMMING
This chapter discusses the functions performed by assemblers, beginning with features common to most
assemblers and proceeding through more elaborate capabilities such as macros and conditional assembly. You
may wish to skim this chapter for the present and return to it when you feel more comfortable with the material.
As we mentioned, today’s assemblers do much more than translate assembly language mnemonics into binary
codes. But we will describe how an assembler handles the translation of mnemonics before describing additional
assembler features. Finally we will explain how assemblers are used.

5.1 Fields:
Assembly language instructions (or “statements”) are divided into a number of “fields”. The operation
code field is the only field which can never be empty; it always contains either an instruction mnemonic or a
directive to the assembler, sometimes called a “pseudo-instruction,” “pseudo-operation,” or “pseudo-op.” The
operand or address field may contain an address or data, or it may be blank. The comment and label fields are
optional. A programmer will assign a label to a statement or add a comment as a personal convenience: namely, to
make the program easier to read and use. Of course, the assembler must have some way of telling where one field
ends and another begins. Assemblers often require that each field start in a specific column. This is a “fixed
format.” However, fixed formats are inconvenient when the input medium is paper tape; fixed formats are also a
nuisance to programmers. The alternative is a “free format” where the fields may appear anywhere on the line.

5.1.1 Delimiters:
If the assembler cannot use the position on the line to tell the fields apart, it must use something else.
Most assemblers use a special symbol or “delimiter” at the beginning or end of each field.

Label     Operation Code      Operand or         Comment Field
Field     or Mnemonic Field   Address Field

VALUE1    DCW                 0x201E             ; FIRST VALUE
VALUE2    DCW                 0x0774             ; SECOND VALUE
RESULT    DCW                 1                  ; 16-BIT STORAGE FOR ADDITION RESULT

START     MOV                 R0, VALUE1         ; GET FIRST VALUE
          ADD                 R0, R0, VALUE2     ; ADD SECOND VALUE TO FIRST VALUE
          STR                 RESULT, R0         ; STORE RESULT OF ADDITION
NEXT      ?                   ?                  ; NEXT INSTRUCTION

Whitespace   Between label and operation code, between operation code and address, and before an
             entry in the comment field
Comma        Between operands in the address field
Asterisk     Before an entire line of comment
Semicolon    Marks the start of a comment on a line that contains preceding code

Table 2.1: Standard ARM Assembler Delimiters

The most common delimiter is the space character. Commas, periods, semicolons, colons, slashes,
question marks, and other characters that would not otherwise be used in assembly language programs also may
serve as delimiters. The general form of layout for the ARM assembler is:

label (whitespace) instruction (whitespace) ; comment

You will have to exercise a little care with delimiters. Some assemblers are fussy about extra spaces or the
appearance of delimiters in comments or labels. A well-written assembler will handle these minor problems, but
many assemblers are not well-written. Our recommendation is simple: avoid potential problems if you can. The
following rules will help:
• Do not use extra spaces; in particular, do not put spaces after commas that separate
operands, even though the ARM assembler allows you to do this.
• Do not use delimiter characters in names or labels.
• Include standard delimiters even if your assembler does not require them. Then it will be
more likely that your programs are in correct form for another assembler.
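
To see how delimiters let an assembler tell the fields apart, consider the following C sketch. It splits
one source line of the form "label (whitespace) instruction (whitespace) ; comment" into its four
fields, treating a label as present only when the line starts in the first column. The names asm_fields,
split_fields and FIELD_MAX are invented for this illustration; a real assembler would also have to cope
with semicolons inside quoted strings, which this sketch ignores.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define FIELD_MAX 64   /* hypothetical limit on the length of any one field */

typedef struct {
    char label[FIELD_MAX];
    char opcode[FIELD_MAX];
    char operands[FIELD_MAX];
    char comment[FIELD_MAX];
} asm_fields;

/* Copy the text in [start, end) into dst with surrounding whitespace removed. */
static void copy_trimmed(char *dst, const char *start, const char *end)
{
    size_t n;

    while (start < end && isspace((unsigned char)*start))
        start++;
    while (end > start && isspace((unsigned char)end[-1]))
        end--;
    n = (size_t)(end - start);
    if (n >= FIELD_MAX)
        n = FIELD_MAX - 1;                 /* truncate over-long fields */
    memcpy(dst, start, n);
    dst[n] = '\0';
}

/* Split one source line into label, operation code, operand and comment fields. */
void split_fields(const char *line, asm_fields *out)
{
    const char *end = line + strlen(line);
    const char *semi = strchr(line, ';');  /* semicolon starts the comment */
    const char *code_end = semi ? semi : end;
    const char *p = line;
    const char *q;

    if (semi)
        copy_trimmed(out->comment, semi + 1, end);
    else
        out->comment[0] = '\0';

    /* Label: non-blank text starting in column 1 (empty if the line starts with a space). */
    q = p;
    while (q < code_end && !isspace((unsigned char)*q))
        q++;
    copy_trimmed(out->label, p, q);

    /* Operation code: the next run of non-blank characters. */
    p = q;
    while (p < code_end && isspace((unsigned char)*p))
        p++;
    q = p;
    while (q < code_end && !isspace((unsigned char)*q))
        q++;
    copy_trimmed(out->opcode, p, q);

    /* Operands: whatever remains before the comment. */
    copy_trimmed(out->operands, q, code_end);
}
```

Applied to the line "START  MOV R0, #5 ; get value", the function reports the label START, the
operation code MOV, the operands "R0, #5" and the comment "get value". A line that begins with
whitespace is reported with an empty label field.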

5.1.2 Labels:
The label field is the first field in an assembly language instruction; it may be blank. If a label is present,
the assembler defines the label as equivalent to the address into which the first byte of the object code generated
for that instruction will be loaded. You may subsequently use the label as an address or as data in another
instruction’s address field. The assembler will replace the label with the assigned value when creating an object
program.
The ARM assembler requires labels to start at the first character of a line. However, some other
assemblers also allow you to have the label start anywhere along a line, in which case you must use a colon (:) as
the delimiter to terminate the label field. Colon delimiters are not used by the ARM assembler. Labels are most
frequently used in Branch or SWI instructions. These instructions place a new value in the program counter and so
alter the normal sequential execution of instructions. B 0x150 means “place the value 0x150 in the program
counter.” The next instruction to be executed will be the one in memory location 0x150. The instruction B
START means “place the value assigned to the label START in the program counter.” The next instruction to be
executed will be the one at the address corresponding to the label START. Figure 2.1 contains an example.
Assembly language Program

START   MOV R0, VALUE1
        .
        . (Main Program)
        .
        BAL START

Figure 2.1: Assigning and Using a Label

When the machine language version of this program is executed, the instruction BAL START causes the
address of the instruction labeled START to be placed in the program counter. That instruction will then be
executed.
Why use a label? Here are some reasons:
• A label makes a program location easier to find and remember.
• The label can easily be moved, if required, to change or correct a program. The assembler
will automatically change all instructions that use the label when the program is reassembled.
• The assembler can relocate the whole program by adding a constant (a “relocation constant”) to
each address in which a label was used. Thus we can move the program to allow for the insertion of
other programs or simply to rearrange memory.
• The program is easier to use as a library program; that is, it is easier for someone else to take your
program and add it to some totally different program.
• You do not have to figure out memory addresses. Figuring out memory addresses is particularly
difficult with microprocessors which have instructions that vary in length.
You should assign a label to any instruction that you might want to refer to later.
The next question is how to choose a label. The assembler often places some restrictions on the number
of characters (usually 5 or 6), the leading character (often must be a letter), and the trailing characters (often must
be letters, numbers, or one of a few special characters). Beyond these restrictions, the choice is up to you.
Our own preference is to use labels that suggest their purpose, i.e., mnemonic labels. Typical examples
are ADDW in a routine that adds one word into a sum, SRCHETX in a routine that searches for the ASCII
character ETX, or NKEYS for a location in data memory that contains the number of key entries. Meaningful
labels are easier to remember and contribute to program documentation. Some programmers use a standard
format for labels, such as starting with L0000. These labels are self-sequencing (you can skip a few numbers to
permit insertions), but they do not help document the program.
Some label selection rules will keep you out of trouble. We recommend the following:
• Do not use labels that are the same as operation codes or other mnemonics. Most assemblers will not
allow this usage; others will, but it is confusing.
• Do not use labels that are longer than the assembler recognises. Assemblers have various rules, and
often ignore some of the characters at the end of a long label.
• Avoid special characters (non-alphabetic and non-numeric) and lower-case letters. Some assemblers
will not permit them; others allow only certain ones. The simplest practice is to stick to capital letters
and numbers.
• Start each label with a letter. Such labels are always acceptable.
• Do not use labels that could be confused with each other. Avoid the letters I, O, and Z and the numbers
0, 1, and 2. Also avoid things like XXXX and XXXXX. Assembly programming is difficult enough
without tempting fate or Murphy’s Law.
• When you are not sure if a label is legal, do not use it. You will not get any real benefit
from discovering exactly what the assembler will accept.

Note: These are recommendations, not rules. You do not have to follow them but don’t blame us if you
waste time on unnecessary problems.
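
Several of these recommendations are mechanical enough to check in a few lines of C. The sketch below
accepts a label only if it starts with a capital letter, contains nothing but capital letters and digits,
and is no longer than a limit supplied by the caller (since assemblers differ on the permitted length). The
function name is_safe_label is invented for this illustration, and the check does not cover clashes
with operation codes or look-alike labels.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if `label` follows the conservative rules: starts with a
   capital letter, continues with capital letters or digits only, and is
   at most `max_len` characters long. Assumes an ASCII character set. */
bool is_safe_label(const char *label, size_t max_len)
{
    size_t i;

    if (label[0] < 'A' || label[0] > 'Z')
        return false;                      /* also rejects the empty string */
    for (i = 1; label[i] != '\0'; i++) {
        bool upper = label[i] >= 'A' && label[i] <= 'Z';
        bool digit = label[i] >= '0' && label[i] <= '9';

        if (!upper && !digit)
            return false;                  /* delimiter, lower case or special character */
    }
    return i <= max_len;
}
```

With a limit of 6 characters, START and NKEYS pass, while start, 1ST and NEXT: are rejected; SRCHETX
passes only if the limit is raised to 7 or more.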

5.2 Operation Codes (Mnemonics):


One main task of the assembler is the translation of mnemonic operation codes into their binary
equivalents. The assembler performs this task using a fixed table much as you would if you were doing the
assembly by hand. The assembler must, however, do more than just translate the operation codes. It must also
somehow determine how many operands the instruction requires and what type they are. This may be rather
complex — some instructions (like a Stop) have no operands, others (like a Jump instruction) have one, while still
others (like a transfer between registers or a multiple-bit shift) require two. Some instructions may even allow
alternatives; for example, some computers have instructions (like Shift or Clear) which can either apply to a
register in the CPU or to a memory location. We will not discuss how the assembler makes these distinctions; we
will just note that it must do so.

5.3 Directives:
Some assembly language instructions are not directly translated into machine language instructions.
These instructions are directives to the assembler; they assign the program to certain areas in memory, define
symbols, designate areas of memory for data storage, place tables or other fixed data in memory, allow references
to other programs, and perform minor housekeeping functions.
To use these assembler directives or pseudo-operations a programmer places the directive’s mnemonic
in the operation code field, and, if the specified directive requires it, an address or data in the address field.
The most common directives are:
DEFINE CONSTANT (Data)
EQUATE (Define)
AREA
DEFINE STORAGE (Reserve)
ENTRY

Different assemblers use different names for these operations but their functions are the same.
Housekeeping directives include:

END LIST FORMAT TTL PAGE INCLUDE

We will discuss these pseudo-operations briefly, although their functions are usually obvious.

5.3.1 The DEFINE CONSTANT (Data) Directive:


The DEFINE CONSTANT directive allows the programmer to enter fixed data into program
memory. This data may include:
• Names • Conversion factors
• Messages • Key identifications
• Commands • Subroutine addresses
• Tax tables • Code conversion tables
• Thresholds • Identification patterns
• Test patterns • State transition tables
• Lookup tables • Synchronization patterns
• Standard forms • Coefficients for equations
• Masking patterns • Character generation patterns
• Weighting factors • Characteristic times or frequencies
The define constant directive treats the data as a permanent part of the program. The format of a define
constant directive is usually quite simple. An instruction like:

VTECH DCW 12
will place the number 12 in the next available memory location and assign that location the name VTECH. Every
DC directive usually has a label, unless it is one of a series. The data and label may take any form that the
assembler permits.
More elaborate define constant directives that handle a large amount of data at one time are provided, for
example:
VTECH DCB "ERROR"
SQRS DCW 1,4,9,16,25
A single directive may fill many bytes of program memory, limited perhaps by the length of a line or by
the restrictions of a particular assembler. Of course, you can always overcome any restrictions by following one
define constant directive with another:
MSG DCB " NOW IS THE"
DCB "TIME FOR ALL"
DCB "TO GO WITH VT"
DCB " TO GAIN THE"
DCB "KNOWLEDGE"
DCB "STATE ", 0 ; note the 0 terminating the string
Microprocessor assemblers typically have some variations of standard define constant directives. In the
ARM assembler, Define Byte or DCB handles 8-bit numbers and Define Word or DCW handles 16-bit (halfword)
numbers. Other special directives may handle character-coded data. The ARM assembler also defines DCD
(Define Constant Data), which handles 32-bit numbers or addresses.
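
The effect of a run of define constant directives can be pictured as bytes being appended to a memory
image while a counter keeps track of the next free location. The C sketch below follows ARM's own
assembler in taking DCB as 8 bits, DCW as 16 bits (as in the "16-BIT STORAGE" example earlier) and DCD as
32 bits, and it assumes little-endian byte order. The type image and the emit functions are invented for
this illustration, and the alignment padding that a real assembler may insert is ignored.

```c
#include <stddef.h>
#include <stdint.h>

#define IMAGE_SIZE 64   /* hypothetical size of the output image */

typedef struct {
    uint8_t bytes[IMAGE_SIZE];
    size_t  loc;            /* next free offset in the image */
} image;

/* Append `size` bytes of `value`, least significant byte first.
   Returns the offset at which the item was placed (the value a label on
   the directive would receive), or (size_t)-1 if the image is full. */
size_t emit(image *img, uint32_t value, size_t size)
{
    size_t at = img->loc;
    size_t i;

    if (size > IMAGE_SIZE - at)
        return (size_t)-1;
    for (i = 0; i < size; i++)
        img->bytes[at + i] = (uint8_t)(value >> (8 * i));
    img->loc = at + size;
    return at;
}

size_t emit_dcb(image *img, uint8_t value)  { return emit(img, value, 1); }
size_t emit_dcw(image *img, uint16_t value) { return emit(img, value, 2); }
size_t emit_dcd(image *img, uint32_t value) { return emit(img, value, 4); }
```

Starting from an empty image, emitting the halfwords 0x201E and 0x0774 places them at offsets 0 and 2,
with the bytes 0x1E, 0x20 stored first. These offsets are the values the labels VALUE1 and VALUE2 would
take in the earlier example.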

5.3.2 The EQUATE Directive:


The EQUATE directive allows the programmer to equate names with addresses or data. This
pseudo-operation is almost always given the mnemonic EQU. The names may refer to device addresses, numeric
data, starting addresses, fixed addresses, etc.
The EQUATE directive assigns the numeric value in its operand field to the label in its label field.
Here are two examples:
HERE EQU 5
LAST EQU 5000
Most assemblers will allow you to define one label in terms of another, for example:
LAST EQU FINAL
ST1 EQU START+1
The label in the operand field must, of course, have been previously defined. Often, the operand field
may contain more complex expressions, as we shall see later. Double name assignments (two names for the same
data or address) may be useful in patching together programs that use different names for the same variable (or
different spellings of what was supposed to be the same name).
Note that an EQU directive does not cause the assembler to place anything in memory. The assembler
simply enters an additional name into a table (called a “symbol table”) which the assembler maintains.
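
The symbol table mentioned above can be sketched in C as a small array of name and value pairs. The
code below is only an illustration of the idea: the names symbol_table, sym_define, sym_lookup and
sym_define_from are invented here, and a real assembler would use a larger and faster structure. Note how
defining one name in terms of another fails unless the other name has already been defined.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SYMBOLS  32   /* hypothetical table capacity */
#define SYM_NAME_LEN 16   /* hypothetical maximum name length, including the terminator */

typedef struct {
    char name[SYM_NAME_LEN];
    uint32_t value;
} symbol;

typedef struct {
    symbol entries[MAX_SYMBOLS];
    size_t count;
} symbol_table;

/* Find `name`; on success store its value in *value and return true. */
bool sym_lookup(const symbol_table *t, const char *name, uint32_t *value)
{
    size_t i;

    for (i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].name, name) == 0) {
            *value = t->entries[i].value;
            return true;
        }
    }
    return false;
}

/* Model of "name EQU value": enters the name, puts nothing in memory. */
bool sym_define(symbol_table *t, const char *name, uint32_t value)
{
    uint32_t existing;

    if (t->count == MAX_SYMBOLS || strlen(name) >= SYM_NAME_LEN)
        return false;
    if (sym_lookup(t, name, &existing))
        return false;                       /* name already defined */
    strcpy(t->entries[t->count].name, name);
    t->entries[t->count].value = value;
    t->count++;
    return true;
}

/* Model of "name EQU other+offset": `other` must already be defined. */
bool sym_define_from(symbol_table *t, const char *name,
                     const char *other, uint32_t offset)
{
    uint32_t base;

    if (!sym_lookup(t, other, &base))
        return false;
    return sym_define(t, name, base + offset);
}
```

If START has been entered with the value 0x1000, then the equivalent of ST1 EQU START+1 enters ST1 with
the value 0x1001, whereas LAST EQU FINAL is rejected as long as FINAL is undefined.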

When do you use a name? The answer is: whenever you have a parameter that you might want to change
or that has some meaning besides its ordinary numeric value. We typically assign names to time constants, device
addresses, masking patterns, conversion factors, and the like. A name like DELAY, HERE, KBD, KROW, or
OPEN not only makes the parameter easier to change, but it also adds to program documentation. We also assign
names to memory locations that have special purposes; they may hold data, mark the start of the program, or be
available for intermediate storage.
What name do you use? The best rules are much the same as in the case of labels, except that here
meaningful names really count. Why not call the teletypewriter TTY instead of X15, a bit time delay BTIME or
BTDLY rather than WW, the number of the “GO” key on a keyboard GOKEY rather than HORSE? This advice
seems straightforward, but a surprising number of programmers do not follow it.
Where do you place the EQUATE directives? The best place is at the start of the program, under
appropriate comment headings such as i/o addresses, temporary storage, time constants, or program locations.
This makes the definitions easy to find if you want to change them. Furthermore, another user will be able to look
up all the definitions in one centralized place. Clearly this practice improves documentation and makes the
program easier to use.
Definitions used only in a specific subroutine should appear at the start of the subroutine.
These are recommendations, not rules. You do not have to follow them but don’t blame us if you waste
time on unnecessary problems.

The AREA Directive:

The AREA directive allows the programmer to specify the memory locations where programs,
subroutines, or data will reside. Programs and data may be located in different areas of memory depending on the
memory configuration. Startup routines, interrupt service routines, and other required programs may be scattered
around memory at fixed or convenient addresses.
The assembler maintains a location counter (comparable to the computer’s program counter) which
contains the location in memory of the instruction or data item being processed. An area directive causes the
assembler to place a new value in the location counter, much as a Jump instruction causes the CPU to place a new
value in the program counter. The output from the assembler must not only contain instructions and data, but must
also indicate to the loader program where in memory it should place the instructions and data.
Microprocessor programs often contain several AREA statements for the following purposes:
• Reset (startup) address • Stack
• Interrupt service addresses • Main program
• Trap (software interrupt) addresses • Subroutines
• RAM storage • Input/output
Still other AREA statements may allow room for later insertions, place tables or data in memory, or
assign vacant memory space for data buffers. Program and data memory in microcomputers may occupy widely
separate addresses to simplify the hardware. Typical AREA statements are:
AREA RESET
AREA $1000
AREA INT3
Some assemblers assume a default address if the programmer does not put in an AREA statement. The ARM
assembler does not: the AREA statement at the start of an ARM program is required, and its absence will cause
the assembly to fail.

5.3.3 The ENTRY Directive:

The ENTRY directive indicates the point in the code where program execution should begin. There
should be only ONE entry point per complete program. Note that in developing the software for an embedded
system, execution will begin at the reset vector, so the code entry point will be determined by what code is linked
at that address and the ENTRY directive is not used.

5.3.4 Housekeeping Directives:


There are various assembler directives that affect the operation of the assembler and its program listing
rather than the object program itself. Common directives include:
END marks the end of the assembly language source program. This must appear in the file or a “missing END
directive” error will occur.
INCLUDE will include the contents of a named file into the current file. When the included file has been
processed the assembler will continue with the next line in the original file. For example, the following line
INCLUDE MATH.S
will include the content of the file MATH.S at that point of the file.
You should never use a label with an INCLUDE directive. Any labels defined in the included file will be
defined in the current file, hence an error will be reported if the same label appears in both the source and
include file. An include file may itself include other files, which in turn could include other files, and so
on; however, the level of includes the assembler will accept is limited. It is not recommended that you go
beyond three levels for even the most complex of software.

5.3.5 When to Use Labels:


Users often wonder if or when they can assign a label to an assembler directive. These are our
recommendations:
1. All EQU directives must have labels; they are useless otherwise, since the purpose of an EQU is to
define its label.
2. Define Constant and Define Storage directives usually have labels. The label identifies the first memory
location used or assigned.
3. Other directives should not have labels.

5.4 Operands and Addresses:
The assembler allows the programmer a lot of freedom in describing the contents of the operand or
address field. But remember that the assembler has built-in names for registers and instructions and may have
other built-in names. We will now describe some common options for the operand field.
5.4.1 Decimal Numbers:
The assembler assumes all numbers to be decimal unless they are marked otherwise. So: ADD 100
means “add the contents of memory location 100 (decimal) to the contents of the Accumulator.”

5.4.2 Other Number Systems:


The assembler will also accept binary, octal and hexadecimal entries. But you must identify these number
systems in some way: for example, by preceding the number with an identifying character.

2_yyyy    Binary        Base 2
8_yyyy    Octal         Base 8
yyyy      Decimal       Base 10
0xyyyy    Hexadecimal   Base 16

It is good practice to enter numbers in the base in which their meaning is the clearest: that is, decimal
constants in decimal; addresses and BCD numbers in hexadecimal; masking patterns or bit outputs in
hexadecimal.
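
The prefixes in the table above are simple to recognise. As an illustration, the following C sketch
converts a numeric literal written in any of the four forms into its value. The function names
parse_number and digit_value are invented for this example; only the 2_, 8_ and 0x prefixes shown in the
table are handled, and values that do not fit in 32 bits are not detected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value of one digit character, or -1 if it is not a hexadecimal digit. */
static int digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parse 2_yyyy (binary), 8_yyyy (octal), yyyy (decimal) or 0xyyyy (hexadecimal).
   Returns false if the text is not a valid number in the indicated base. */
bool parse_number(const char *text, uint32_t *out)
{
    uint32_t base = 10;
    uint32_t value = 0;
    const char *p = text;

    if (p[0] == '0' && (p[1] == 'x' || p[1] == 'X')) {
        base = 16;
        p += 2;
    } else if ((p[0] == '2' || p[0] == '8') && p[1] == '_') {
        base = (uint32_t)(p[0] - '0');
        p += 2;
    }
    if (*p == '\0')
        return false;                       /* prefix with no digits, or empty text */
    for (; *p != '\0'; p++) {
        int d = digit_value(*p);

        if (d < 0 || (uint32_t)d >= base)
            return false;                   /* character is not a digit in this base */
        value = value * base + (uint32_t)d;
    }
    *out = value;
    return true;
}
```

For example, 2_1010, 8_12, 10 and 0xA all yield the value ten, while 2_102 is rejected because 2 is
not a binary digit.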

5.4.3 Names:
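Names can appear in the operand field; they will be treated as the data that they represent. Remember,
however, that there is a difference between data and addresses. In an ARM assembly language program the
sequence:
FIVE EQU 5
ADD R2, R2, #FIVE
will add the number 5 to the contents of register R2, because the # prefix marks FIVE as immediate data. It does
not add the contents of memory location FIVE: ARM data-processing instructions never read memory, so a value
held at an address must first be brought into a register with a load instruction such as LDR.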

5.4.4 Character Codes:


The assembler allows text to be entered as ASCII strings. Such strings must be surrounded with double
quotation marks, unless a single ASCII character is quoted, when single quotes may be used, exactly as in 'C'. We
recommend that you use character strings for all text. It improves the clarity and readability of the program.

5.4.5 Arithmetic and Logical Expressions:


Assemblers permit combinations of the data forms described above, connected by arithmetic, logical, or
special operators. These combinations are called expressions. Almost all assemblers allow simple arithmetic
expressions such as START+1. Some assemblers also permit multiplication, division, logical functions, shifts,
etc. Note that the assembler evaluates expressions at assembly time; if a symbol appears in an expression, the
address is used (i.e., the location counter or EQUATE value).
Assemblers vary in what expressions they accept and how they interpret them. Complex expressions
make a program difficult to read and understand.

5.4.6 General Recommendations:

We have made some recommendations during this section but will repeat them and add others here. In
general, the user should strive for clarity and simplicity. There is no payoff for being an expert in the intricacies of
an assembler or in having the most complex expression on the block.
We suggest the following approach:
• Use the clearest number system or character code for data.
• Masks and BCD numbers in decimal, ASCII characters in octal, or ordinary numerical
constants in hexadecimal serve no purpose and therefore should not be used.
• Remember to distinguish data from addresses.
• Don’t use offsets from the location counter.
• Keep expressions simple and obvious. Don’t rely on obscure features of the assembler.

5.5 Comments:
All assemblers allow you to place comments in a source program. Comments have no effect on the
object code, but they help you to read, understand, and document the program. Good commenting is an essential
part of writing computer programs; programs without comments are very difficult to understand. We will discuss
commenting along with documentation in a later chapter, but here are some guidelines:
• Use comments to tell what application task the program is performing, not how the
microcomputer executes the instructions.
• Comments should say things like “is temperature above limit?”, “linefeed to TTY,” or “examine load
switch.”
• Comments should not say things like “add 1 to Accumulator,” “jump to Start,” or “look at carry.” You
should describe how the program is affecting the system; internal effects on the CPU should be
obvious from the code.
• Keep comments brief and to the point. Details should be available elsewhere in the documentation.
• Comment all key points.
• Do not comment standard instructions or sequences that change counters or pointers; pay special
attention to instructions that may not have an obvious meaning.
• Do not use obscure abbreviations.
• Make the comments neat and readable.
• Comment all definitions, describing their purposes. Also mark all tables and data storage areas.
• Comment sections of the program as well as individual instructions.
• Be consistent in your terminology. You can (and should) be repetitive; you need not consult a thesaurus.
• Leave yourself notes at points that you find confusing: for example, “remember carry was set by last
instruction.” If such points get cleared up later in program development, you may drop these
comments in the final documentation.
A well-commented program is easy to use. You will recover the time spent in commenting many times over.
We will try to show good commenting style in the programming examples, although we often over-comment for
instructional purposes.

5.6 Types of Assemblers:


Although all assemblers perform the same tasks, their implementations vary greatly. We will not try to
describe all the existing types of assemblers; we will merely define the terms and indicate some of the choices.
A cross-assembler is an assembler that runs on a computer other than the one for which it assembles
object programs. The computer on which the cross-assembler runs is typically a large computer with extensive
software support and fast peripherals. The computer for which the cross-assembler assembles programs is
typically a micro like the 6809 or MC68000.
When a new microcomputer is introduced, a cross-assembler is often provided to run on existing
development systems. For example, ARM provides a cross-assembler (armasm) that runs on a PC
development system, together with the ARMulator instruction set simulator for running the assembled
programs on the PC.
A self-assembler or resident assembler is an assembler that runs on the computer for which it assembles
programs. The self-assembler will require some memory and peripherals, and it may run quite slowly
compared to a cross-assembler.
A macro assembler is an assembler that allows you to define sequences of instructions as macros.
A micro assembler is an assembler used to write the microprograms which define the instruction set of
a computer. Microprogramming has nothing specifically to do with programming microcomputers; it has to
do with the internal operation of the computer.
A meta-assembler is an assembler that can handle many different instruction sets. The user must define
the particular instruction set being used.
A one-pass assembler is an assembler that goes through the assembly language program only once. Such
an assembler must have some way of resolving forward references, for example, Jump instructions which use
labels that have not yet been defined.
A two-pass assembler is an assembler that goes through the assembly language source program twice.
The first time the assembler simply collects and defines all the symbols; the second time it replaces the
references with the actual definitions. A two-pass assembler has no problems with forward references but may
be quite slow if no backup storage (like a floppy disk) is available; then the assembler must physically read the
program twice from a slow input medium (like a teletypewriter paper tape reader). Most microprocessor-based
assemblers require two passes.
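The two passes described above can be sketched in C. This is only a minimal illustration under stated assumptions, not how any particular assembler is written: the toy source format (`SourceLine`), the fixed 4-byte instruction size, and the names `collect_symbols`, `resolve_references` and `NO_TARGET` are all hypothetical.

```c
#include <string.h>

#define NO_TARGET 0xFFFFFFFFu   /* marks a line that has no branch target */

/* Hypothetical toy source line: an optional label defined on the line and
   an optional label that the line branches to. */
typedef struct {
    const char *label;    /* label defined here, or NULL */
    const char *target;   /* label referenced here, or NULL */
} SourceLine;

typedef struct {
    const char *name;
    unsigned address;
} Symbol;

/* Pass 1: collect every label and the address it stands for.
   Each line is assumed to occupy 4 bytes. Returns the symbol count. */
int collect_symbols(const SourceLine *src, int n, Symbol *table)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (src[i].label != NULL) {
            table[count].name = src[i].label;
            table[count].address = (unsigned)i * 4u;
            count++;
        }
    }
    return count;
}

/* Pass 2: replace each reference with the address found in pass 1.
   Forward references are no problem because the table is already complete.
   Returns 0 on success, or -1 for an undefined name. */
int resolve_references(const SourceLine *src, int n,
                       const Symbol *table, int count, unsigned *resolved)
{
    for (int i = 0; i < n; i++) {
        resolved[i] = NO_TARGET;
        if (src[i].target == NULL)
            continue;
        int found = 0;
        for (int s = 0; s < count; s++) {
            if (strcmp(table[s].name, src[i].target) == 0) {
                resolved[i] = table[s].address;
                found = 1;
                break;
            }
        }
        if (!found)
            return -1;   /* "undefined name" error */
    }
    return 0;
}
```

A one-pass assembler would have to patch forward references after the fact; here the second pass simply looks them up.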

5.7 Errors:
Assemblers normally provide error messages, often consisting of an error code number. Some typical
errors are:
Undefined name              often a misspelling or an omitted definition
Illegal character           such as a 2 in a binary number
Illegal format              wrong delimiter or incorrect operands
Invalid expression          for example, two operators in a row
Illegal value               usually too large
Missing operand
Double definition           two different values assigned to one name
Illegal label               such as a label on a pseudo-operation that cannot have one
Missing label
Undefined operation code
In interpreting assembler errors, you must remember that the assembler may get on the wrong track if it
finds a stray letter, an extra space, or incorrect punctuation. The assembler will then proceed to misinterpret
the succeeding instructions and produce meaningless error messages. Always look at the first error very
carefully; subsequent ones may depend on it. Caution and consistent adherence to standard formats will
eliminate many annoying mistakes.

5.8 Loaders:
The loader is the program which actually takes the output (object code) from the assembler and places it
in memory. Loaders range from the very simple to the very complex. We will describe a few different types.
A bootstrap loader is a program that uses its own first few instructions to load the rest of itself or another
loader program into memory. The bootstrap loader may be in ROM, or you may have to enter it into the
computer memory using front panel switches. The assembler may place a bootstrap loader at the start of the
object program that it produces.
A relocating loader can load programs anywhere in memory. It typically loads each program into the
memory space immediately following that used by the previous program. The programs, however, must
themselves be capable of being moved around in this way; that is, they must be relocatable. An absolute
loader, in contrast, will always place the programs in the same area of memory.
A linking loader loads programs and subroutines that have been assembled separately; it resolves
cross-references — that is, instructions in one program that refer to a label in another program. Object programs
loaded by a linking loader must be created by an assembler that allows external references. An alternative
approach is to separate the linking and loading functions and have the linking performed by a program called a
link editor and the loading done by a loader.

5.9 Subroutines
None of the examples that we have shown thus far is a typical program that would stand by itself. Most
real programs perform a series of tasks, many of which may be used a number of times or be common to other
programs. The standard method of producing programs which can be used in this manner is to write subroutines
that perform particular tasks. The resulting sequences of instructions can be written once, tested once, and then
used repeatedly.
There are special instructions for transferring control to subroutines and restoring control to the main
program. We often refer to the special instruction that transfers control to a subroutine as Call, Jump, or Branch
to a Subroutine. The special instruction that restores control to the main program is usually called Return.
In the ARM the Branch-and-Link instruction (BL) is used to Branch to a Subroutine. This saves the
return address (the address of the instruction following the BL) in the Link Register (LR or R14) before placing
the starting address of the subroutine in the program counter (PC or R15). The ARM does not have a standard
Return from Subroutine instruction like other processors; rather, the programmer should copy the value in the
Link Register into the Program Counter in order to return to the instruction after the Branch-and-Link
instruction. Thus, to return from a subroutine you should use the instruction:

MOV PC, LR

Should the subroutine wish to call another subroutine, it will have to save the value of the Link Register (for
example, on the stack) before calling the nested subroutine, because the nested BL overwrites it.

5.9.1 Types of Subroutines


Sometimes a subroutine must have special characteristics.
Relocatable
The code can be placed anywhere in memory. You can use such a subroutine easily, regardless of other
programs or the arrangement of the memory. A relocating loader is necessary to place the program in memory
properly; the loader will start the program after other programs and will add the starting address or relocation
constant to all addresses in the program.
Position Independent
The code does not require a relocating loader: all program addresses are expressed relative to the
program counter’s current value. Data addresses are held in registers at all times. We will discuss the writing
of position independent code later in this chapter.
Reentrant
The subroutine can be interrupted and called by the interrupting program, giving the correct results for
both the interrupting and interrupted programs. Reentrant subroutines are required for event-based
systems such as a multitasking operating system (Windows or Unix) and embedded real-time environments. It is
not difficult to make a subroutine reentrant. The only requirement is that the subroutine uses just registers and the
stack for its data storage, and that the subroutine is self-contained in that it does not use any value defined
outside of the routine (global values).
Recursive
The subroutine can call itself. Such a subroutine clearly must also be reentrant.
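The difference between these properties can be illustrated in C. The sketch below is only an illustration; the function names `sum_shared`, `sum_reentrant` and `factorial` are hypothetical. The first routine keeps its working total in a static variable (a value defined outside the call), so it is not reentrant; the second uses only parameters and locals, which a compiler keeps in registers or on the stack.

```c
#include <stddef.h>

/* NOT reentrant: the running total is a static variable shared by every
   caller, so it also carries over from one call to the next. */
unsigned sum_shared(const unsigned *data, size_t n)
{
    static unsigned total;          /* storage outside the stack frame */
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Reentrant: all working data lives in parameters and locals. */
unsigned sum_reentrant(const unsigned *data, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Recursive: calls itself, so it must also be reentrant. */
unsigned factorial(unsigned n)
{
    return (n <= 1u) ? 1u : n * factorial(n - 1u);
}
```

Calling `sum_shared` twice on the same data gives two different answers, which is the kind of interference an interrupting caller would also suffer.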

12.2 Subroutine Documentation


Most programs consist of a main program and several subroutines. This is useful as you can use known
prewritten routines when available and you can debug and test the other subroutines properly and remember their
exact effects on registers and memory locations.
You should provide sufficient documentation such that users need not examine the subroutine’s internal
structure. Among necessary specifications are:
• A description of the purpose of the subroutine
• A list of input and output parameters
• Registers and memory locations used
• A sample case, perhaps including a sample calling sequence
The subroutine will be easy to use if you follow these guidelines.

12.3 Parameter Passing Techniques


In order to be really useful, a subroutine must be general. For example, a subroutine that can perform
only a specialized task, such as looking for a particular letter in an input string of fixed length, will not be very
useful. If, on the other hand, the subroutine can look for any letter, in strings of any length, it will be far more
helpful.
In order to provide subroutines with this flexibility, it is necessary to provide them with the ability to
receive various kinds of information. The data or addresses that we provide to the subroutine are called
parameters. An important part of writing subroutines is providing for transferring the parameters to the
subroutine. This process is called Parameter Passing.
There are three general approaches to passing parameters:
1. Place the parameters in registers.
2. Place the parameters in a block of memory.
3. Transfer the parameters and results on the hardware stack.
The registers often provide a fast, convenient way of passing parameters and returning results. The
limitations of this method are that it cannot be expanded beyond the number of available registers; it often results
in unforeseen side effects; and it lacks generality. The trade-off here is between fast execution time and a more
general approach. Such a trade-off is common in computer applications at all levels. General approaches are easy
to learn and consistent; they can be automated through the use of macros. On the other hand, approaches that take
advantage of the specific features of a particular task require less time and memory. The choice of one approach
over the other depends on your application, but you should take the general approach (saving programming time
and simplifying documentation and maintenance) unless time or memory constraints force you to do otherwise.

12.3.1 Passing Parameters In Registers


The first and simplest method of passing parameters to a subroutine is via the registers. Before calling a
subroutine, the calling program can load memory addresses, counters, and other data into registers. For example,
suppose a subroutine operates on two data buffers of equal length. The subroutine might specify that the length of
the two data buffers be in the register R0 while the starting addresses of the two data buffers are in the registers
R1 and R2. The calling program would then call the subroutine as follows:

        MOV   R0, #BufferLen    ; Length of Buffer in R0
        LDR   R1, =BufferA      ; Buffer A beginning address in R1
        LDR   R2, =BufferB      ; Buffer B beginning address in R2
        BL    Subr              ; Call subroutine
Using this method of parameter passing, the subroutine can simply assume that the parameters are there.
Results can also be returned in registers, or the addresses of locations for results can be passed as parameters via
the registers. Of course, this technique is limited by the number of registers available.
Processor features such as register indirect addressing, indexed addressing, and the ability to use any
register as a stack pointer allow far more powerful and general ways of passing parameters.

12.3.2 Passing Parameters In A Parameter Block


Parameters that are to be passed to a subroutine can also be placed into memory in a parameter block.
The location of this parameter block can be passed to the subroutine via a register.
LDR R0, =Params ; R0 Points to Parameter Block
BL Subr ; Call the subroutine
If you place the parameter block immediately after the subroutine call, the address of the parameter block
is automatically placed into the Link Register by the Branch and Link instruction. The subroutine must modify
the return address in the Link Register in addition to fetching the parameters. Using this technique, our example
would be modified as follows:
BL Subr
DCD BufferLen ; Buffer Length
DCD BufferA ; Buffer A starting address
DCD BufferB ; Buffer B starting address
The subroutine saves the prior contents of the CPU registers (not shown), then loads the parameters and
adjusts the return address as follows:
Subr    LDR   R0, [LR], #4      ; Read BufferLen
        LDR   R1, [LR], #4      ; Read address of Buffer A
        LDR   R2, [LR], #4      ; Read address of Buffer B
                                ; LR points to next instruction
The addressing mode [LR], #4 will read the value at the address pointed to by the Link Register and then
move the register on by four bytes. Thus at the end of this sequence the value of LR has been updated to point to
the next instruction after the parameter block.
This parameter passing technique has the advantage of being easy to read. It has, however, the
disadvantage of requiring parameters to be fixed when the program is written. Passing the address of the
parameter block via a register allows the parameters to be changed as the program is running.
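In C, a parameter block passed via a register corresponds to a structure passed by pointer. The following is a minimal sketch under assumptions: the structure `CopyParams`, the function `copy_buffers`, and the choice of "copy buffer A to buffer B" as the subroutine's task are all hypothetical, since the text does not say what Subr does with its buffers.

```c
#include <stddef.h>

/* Hypothetical parameter block: the same three words as the DCD block
   (buffer length, address of buffer A, address of buffer B). */
typedef struct {
    size_t length;
    const unsigned char *buffer_a;
    unsigned char *buffer_b;
} CopyParams;

/* The subroutine receives only the address of the block and fetches its
   parameters from memory. Returns the number of bytes copied. */
size_t copy_buffers(const CopyParams *p)
{
    for (size_t i = 0; i < p->length; i++)
        p->buffer_b[i] = p->buffer_a[i];
    return p->length;
}
```

Because the block lives in ordinary memory, the caller can change `length` or either address between calls, which the in-line block after the BL instruction does not allow.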

12.3.3 Passing Parameters On The Stack


Another common method of passing parameters to a subroutine is to push the parameters onto the stack.
Using this parameter passing technique, the subroutine call illustrated above would occur as follows:
        MOV   R0, #BufferLen    ; Read Buffer Length
        STR   R0, [SP, #-4]!    ; Save on the stack
        LDR   R0, =BufferA      ; Read Address of Buffer A
        STR   R0, [SP, #-4]!    ; Save on the stack
        LDR   R0, =BufferB      ; Read Address of Buffer B
        STR   R0, [SP, #-4]!    ; Save on the stack
        BL    Subr
The subroutine must begin by saving its working registers and loading the parameters into CPU registers as
follows:
Subr    STMFD SP!, {R0-R2, R12, LR}   ; Save working registers on the stack (5 words)
        ADD   R12, SP, #20            ; R12 points to the parameters pushed by the caller
        LDR   R2, [R12, #0]           ; Buffer B starting address (pushed last)
        LDR   R1, [R12, #4]           ; Buffer A starting address
        LDR   R0, [R12, #8]           ; Buffer Length in R0 (pushed first)
        ...                           ; Main function of subroutine
        LDMFD SP!, {R0-R2, R12, LR}   ; Recover working registers
        MOV   PC, LR                  ; Return to caller

In this approach, all parameters are passed and results are returned on the stack.
The stack grows downward (toward lower addresses). This occurs because elements are pushed onto the stack
using the pre-decrement address mode. The use of the pre-decrement mode causes the stack pointer to always
contain the address of the last occupied location, rather than the next empty one as on some other
microprocessors. This implies that you must initialise the stack pointer to a value higher than the largest address in
the stack area.
When passing parameters on the stack, the programmer must implement this approach as follows:
1. Decrement the system stack pointer to make room for parameters on the system stack, and store
them using offsets from the stack pointer, or simply push the parameters on the stack.
2. Access the parameters by means of offsets from the system stack pointer.
3. Store the results on the stack by means of offsets from the system stack pointer.
4. Clean up the stack before or after returning from the subroutine, so that the parameters are removed
and the results are handled appropriately.
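The four steps above can be modelled in C with a small array standing in for the stack area. This is a sketch only: the `Stack` type, the helper names and the subroutine `subr` (which merely adds its three parameters) are hypothetical, and no overflow checking is done.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Model of a downward-growing stack. sp is the index of the last occupied
   word, so an empty stack has sp one past the highest element. */
typedef struct {
    uint32_t memory[STACK_WORDS];
    int sp;
} Stack;

void stack_init(Stack *s) { s->sp = STACK_WORDS; }

/* Step 1: like STR Rn, [SP, #-4]! (decrement first, then store). */
void push(Stack *s, uint32_t value)
{
    s->sp -= 1;
    s->memory[s->sp] = value;
}

/* Step 2: read a parameter at a word offset from the stack pointer. */
uint32_t peek(const Stack *s, int word_offset)
{
    return s->memory[s->sp + word_offset];
}

/* Step 4: caller clean-up, like ADD SP, SP, #4*count. */
void drop(Stack *s, int count) { s->sp += count; }

/* Hypothetical subroutine: parameters were pushed as length, A, B, so B is
   at offset 0 and length at offset 2. Step 3: the result overwrites the
   first-pushed slot. */
void subr(Stack *s)
{
    uint32_t b = peek(s, 0);
    uint32_t a = peek(s, 1);
    uint32_t length = peek(s, 2);
    s->memory[s->sp + 2] = length + a + b;
}
```

Note that the parameter pushed last sits at the lowest offset, which is why the offsets run in the opposite order to the pushes.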

12.4 Types Of Parameters


Regardless of our approach to passing parameters, we can specify the parameters in a variety of ways.
For example, we can:
Pass-by-value
Where the actual values are placed in the parameter list. The name comes from the fact that it is only the
value of the parameter that is passed into the subroutine rather than the parameter itself. This is the method used
by most high level programming languages.
Pass-by-reference
The addresses of the parameters are placed in the parameter list. The subroutine can access the original
value directly rather than a copy of the parameter. This is much more dangerous, as the subroutine can change a
value you don’t want it to.
Pass-by-name
Rather than passing either the value or a reference to the value a string containing the name of the
parameter is passed. This is used by very high level languages or scripting languages. This is very flexible but
rather time consuming as we need to look up the value associated with the variable name every time we wish to
access the variable.
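The first two styles can be shown directly in C, where pass-by-reference is written with a pointer. The function names below are hypothetical and the example is only a sketch.

```c
/* Pass-by-value: the function works on a copy, so the caller's variable
   is left unchanged. */
int increment_copy(int value)
{
    value += 1;
    return value;
}

/* Pass-by-reference: the function receives the address of the variable
   and changes the original. */
void increment_in_place(int *value)
{
    *value += 1;
}
```

After `increment_copy(x)` the caller's `x` still holds its old value; after `increment_in_place(&x)` it does not, which is the danger mentioned above.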

APCS (ARM PROCEDURE CALL STANDARD)


In some situations, it is necessary to combine C language code and assembly code in the same program.
Some routines that are especially critical to system performance must, for example, be hand-coded in order to run
at optimum speed. Using the appropriate tools, you can generate object files from both C and assembly language
source code. To do this, you use the ARM C Compiler and the ARM Assembler, respectively. You then link the
generated object files with one or more libraries to produce an executable file, as shown in Figure 12-1.

[Figure: assembly source code is processed by ARM ASM and C source code by ARM CC; ARM Link combines the
resulting code with the C library to produce the ARM image.]

Figure 12-1. Interleaving C and Assembly Language Source Code

Regardless of which language a routine is written in, routines that make cross-calls to other modules
must observe standard conventions for passing arguments and results. For ARM processors, this convention is
called the “ARM Procedure Call Standard, or APCS”. In this chapter, we briefly introduce APCS and discuss its
role for passing and returning values in ARM assembly language and C routines.
The ARM Procedure Call Standard, or APCS, is a set of rules which governs calls between functions in
separately compiled or assembled code fragments. The APCS defines:
• Constraints on the use of registers
• Stack conventions
• The format of a stack back-trace data structure
• Argument passing and result returns
• Support for the ARM shared library mechanism
The APCS standard consists of a family of variants. Each variant is exclusive, so that code which
conforms to one variant cannot be used with code that conforms to another variant. Your choice of variant
depends on the following criteria:
• Whether stack limit checking is explicit (performed by code) or implicit (performed by memory
management hardware).
• Whether floating-point values are passed in floating-point registers.
• Whether code is reentrant or non-reentrant.

ARM APCS specification for register usage.


The following table summarizes the names and uses of ARM and floating-point registers under APCS.

Register   APCS     APCS
Number     Name     Function

R0         a1       argument 1 / integer result / scratch register
R1         a2       argument 2 / scratch register
R2         a3       argument 3 / scratch register
R3         a4       argument 4 / scratch register
R4         v1       register variable
R5         v2       register variable
R6         v3       register variable
R7         v4       register variable
R8         v5       register variable
R9         sb/v6    stack base / register variable
R10        sl/v7    stack limit / stack chunk handle / register variable
R11        fp       frame pointer
R12        ip       scratch register / new-sb in interlink unit calls
R13        sp       lower end of current stack frame
R14        lr       link address / scratch register
R15        pc       program counter

Table 12-1. Register Names and Uses Under APCS

To summarize:

a1-a4 are used to pass arguments to functions. a1 is also used to return integer results. These registers can be
corrupted by a called function.

v1-v5 are used as register variables. They must be preserved by called functions.

sb, sl, fp, ip, sp, lr and pc sometimes have a dedicated role in an APCS variant. In other words, some of these
registers can be used for other purposes while still strictly conforming to the APCS standard. In some APCS
variants, sb and sl are available as the additional variable registers v6 and v7, respectively.

PASSING AND RETURNING ARGUMENTS


As noted above, passing and returning arguments use the argument registers a1-a4. (These are the APCS
names for registers r0-r3, respectively.) Using these registers, you can transfer up to four arguments directly
(that is, without using the stack area). The calling and called routines can be a mixture of C and assembly
language code. Of course, if the number of passed arguments exceeds four, you must use the stack area for the
remaining arguments. As shown in Figure 2-2, the called function returns the result to the C function through
register a1.

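[Figure: a C function and an assembler function exchange arguments through registers a1, a2, a3 and a4 and,
beyond four arguments, through the stack area.]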

Figure 2-2. Passing and Returning Arguments

EXAMPLE CODE
The following two program samples include C and assembly code versions:

EXAMPLE A

Given: the number of passed parameters is <= 4.

C code: Call the assembly function to add four integer arguments and return the result value.
extern int Add_ASM(int a, int b, int c, int d);
int ReturnVal;
ReturnVal = Add_ASM(0, 5, 10, 15);

Assembly code: Add four integer arguments and return the result value.
EXPORT Add_ASM ; Declare Add_ASM as extern function
Add_ASM
ADD a1, a1, a2 ; a1 = 0, a2 = 5, a3 = 10, a4 = 15
ADD a1, a1, a3
ADD a1, a1, a4 ; a1 = return value (= 30)
MOV pc, lr

EXAMPLE B

Given: the number of passed parameters is > 4.

C code: Call the assembly function to add six integer arguments and return the result value.
extern int Add_ASM(int a, int b, int c, int d, int e, int f);
int ReturnVal;
ReturnVal = Add_ASM(0, 5, 10, 15, 20, 25);

Assembly code: Add six integer arguments and return the result value.

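        EXPORT Add_ASM          ; Export Add_ASM so that the C code can call it
Add_ASM
        ADD a1, a1, a2          ; a1 = 0, a2 = 5, a3 = 10, a4 = 15
        ADD a1, a1, a3
        ADD a1, a1, a4
        LDR ip, [sp, #0]        ; ip = 20 (fifth argument, first word on the stack)
        ADD a1, a1, ip          ; ip is a scratch register, so v1 need not be saved
        LDR ip, [sp, #4]        ; ip = 25 (sixth argument)
        ADD a1, a1, ip          ; a1 = return value (= 75)
        MOV pc, lr              ; Return to caller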

EXCEPTION HANDLING
An exception occurs when the normal flow of execution through a user program is diverted to allow the
processor to handle events generated by internal or external sources. Two examples of such events are:
— Externally-generated interrupts
— An attempt by the processor to execute an undefined instruction

When handling exceptions, the previous processor status must be preserved so that the execution of the
original user program can resume immediately after the appropriate exception routine is completed. The ARM
processor recognizes seven types of exceptions:

Reset                   Occurs when the CPU reset pin is asserted. This exception is normally
                        expected only to signal power-up, or to reset the CPU as if it had just
                        powered up. It is therefore useful for initiating a “soft” reset.
Undefined Instruction   Occurs if neither the CPU nor any attached coprocessor recognizes the
                        instruction currently being executed.
Software Interrupt      A user-defined synchronous interrupt instruction which allows a program
                        running in User mode to request privileged operations that run in
                        Supervisor mode.
Pre-fetch Abort         Occurs when the CPU attempts to execute an instruction which has been
                        pre-fetched from an illegal address. In this case, an illegal address is an
                        address that the memory management subsystem has determined is
                        inaccessible to the CPU in its current mode.
Data Abort              Occurs when a data transfer instruction attempts to load or store data at an
                        illegal address.
IRQ                     Occurs when the CPU’s external interrupt request pin is asserted (Low)
                        and the I bit in the CPSR is clear.
FIQ                     Occurs when the CPU’s external fast interrupt request pin is asserted
                        (Low) and the F bit in the CPSR is clear.

Address      Exception               Mode on entry   I state on entry   F state on entry
0x00000000   Reset                   Supervisor      Set                Set
0x00000004   Undefined instruction   Undefined       Set                Unchanged
0x00000008   Software interrupt      Supervisor      Set                Unchanged
0x0000000C   Prefetch Abort          Abort           Set                Unchanged
0x00000010   Data Abort              Abort           Set                Unchanged
0x00000014   Reserved                Reserved        -                  -
0x00000018   IRQ                     IRQ             Set                Unchanged
0x0000001C   FIQ                     FIQ             Set                Set

Table 3-1. Exception Processing Modes and Vectors
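Because each vector occupies one 4-byte word, the table above can be expressed as a small calculation. The C sketch below is an illustration only: the enumeration and the function names are hypothetical, while the bit positions used for the I and F masks (bits 7 and 6 of the CPSR) are those of the ARM architecture.

```c
#include <stdint.h>

#define CPSR_I_BIT (1u << 7)   /* IRQ mask bit in the CPSR */
#define CPSR_F_BIT (1u << 6)   /* FIQ mask bit in the CPSR */

/* Exceptions in vector-table order (hypothetical names). */
typedef enum {
    EXC_RESET = 0,
    EXC_UNDEFINED,
    EXC_SWI,
    EXC_PREFETCH_ABORT,
    EXC_DATA_ABORT,
    EXC_RESERVED,
    EXC_IRQ,
    EXC_FIQ
} Exception;

/* Address of the vector for an exception, given the vector base. */
uint32_t vector_address(uint32_t base, Exception e)
{
    return base + 4u * (uint32_t)e;
}

/* Mask bits that the core sets on entry: I is always set; F is set only
   for Reset and FIQ and is left unchanged otherwise. */
uint32_t masks_set_on_entry(Exception e)
{
    uint32_t masks = CPSR_I_BIT;
    if (e == EXC_RESET || e == EXC_FIQ)
        masks |= CPSR_F_BIT;
    return masks;
}
```

For example, `vector_address(0x00000000u, EXC_IRQ)` gives 0x18, matching the IRQ row of the table.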

The following table provides an overview of ARM exceptions and how they are processed.

Exception                                   Priority  Status          Mode        FIQ        IRQ       Vector   Preferred return instruction
Reset                                       1         Not available   Supervisor  Disabled   Disabled  Base+0   Not available
Data Access Memory Abort (Data Abort)       2         SPSR_abt=CPSR   Abort       Unchanged  Disabled  Base+16  SUBS PC,R14_abt,#8
Fast Interrupt (FIQ)                        3         SPSR_fiq=CPSR   FIQ         Disabled   Disabled  Base+28  SUBS PC,R14_fiq,#4
Normal Interrupt (IRQ)                      4         SPSR_irq=CPSR   IRQ         Unchanged  Disabled  Base+24  SUBS PC,R14_irq,#4
Instruction Fetch Memory Abort              5         SPSR_abt=CPSR   Abort       Unchanged  Disabled  Base+12  SUBS PC,R14_abt,#4
  (Prefetch Abort)
Software Interrupt (SWI)                    6         SPSR_svc=CPSR   Supervisor  Unchanged  Disabled  Base+8   MOVS PC,R14_svc
Undefined Instruction                       6         SPSR_und=CPSR   Undefined   Unchanged  Disabled  Base+4   MOVS PC,R14_und

Table 3-1. Exception Priority, Modes and Vectors

Note

1: Priority 1 is highest, 6 is lowest.

2: The normal vector base address is 0x00000000. Some implementations allow the vector base address to be
moved to 0xFFFF0000.

3: When the instruction at the breakpoint causes a prefetch abort, then the abort request is handled first. When the
abort handler fixes the abort condition and returns to the aborted instruction, then the debug request is handled.

4: PC is the address of the instruction that caused the data abort.

5: PC is the address of the instruction that did not get executed after the interrupt occurred.

6: PC is the address of the SWI, BKPT or undefined instruction or the instruction that had the prefetch abort.

7: Intentionally the FIQ vector is placed at the end of the vector table. No additional branch is required. The
handler can directly start at this location.

8: This re-executes the aborted instruction. If this is not intended, use SUBS PC,R14_abt,#4 instead.

Link Register Offset


The link register is used to return the PC (after handling the exception) to the appropriate place in the
interrupted task. It is set from the current PC value and depends on the type of exception that occurred. In some
cases the handler should return to the next instruction after the exception has been handled; in other cases it
should return to an earlier instruction so that the instruction is repeated. For example, in the case of an IRQ
exception, the link register initially points to the last executed instruction + 8, so after the exception is handled
we should return to the old PC value + 4 (the next instruction), which equals the old LR value - 4. Another
example is the data abort exception: when the exception has been handled, the PC should point to the same
instruction again so that the access to the same memory location is retried.

Exception                Address   Use

Reset                    -         LR is not defined for Reset
Undefined Instruction    LR        Points to the next instruction after the undefined instruction
Software Interrupt       LR        Points to the next instruction after the SWI instruction
Prefetch Abort           LR - 4    Points to the instruction that caused the prefetch abort exception
Data Abort               LR - 8    Points to the instruction that caused the data abort exception
Reserved                 -         -
Interrupt Request        LR - 4    Return address from IRQ handler
Fast Interrupt Request   LR - 4    Return address from FIQ handler

Table 3-1. Link Register Offset
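The offsets in the table can be written as a small C function that computes the return address from the value found in the link register. This is a sketch for illustration; the enumeration and the name `return_address` are hypothetical, and Reset is omitted because its LR is not defined.

```c
#include <stdint.h>

/* Exceptions that have a defined link register value (hypothetical names). */
typedef enum {
    RET_UNDEFINED,
    RET_SWI,
    RET_PREFETCH_ABORT,
    RET_DATA_ABORT,
    RET_IRQ,
    RET_FIQ
} ReturnKind;

/* Address to load into the PC when leaving the handler. */
uint32_t return_address(ReturnKind kind, uint32_t lr)
{
    switch (kind) {
    case RET_UNDEFINED:
    case RET_SWI:
        return lr;          /* continue after the trapping instruction */
    case RET_PREFETCH_ABORT:
    case RET_IRQ:
    case RET_FIQ:
        return lr - 4u;
    case RET_DATA_ABORT:
        return lr - 8u;     /* retry the aborted instruction */
    }
    return lr;              /* not reached for valid kinds */
}
```

As a worked case: if the last executed instruction was at 0x8000 when an IRQ arrived, LR holds 0x8008 and the handler returns to 0x8004, the next instruction.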

Interrupts
There are two types of interrupts available on the ARM processor. The first type is the interrupt caused by
external events from hardware peripherals and the second type is the SWI instruction.
The ARM core has only one IRQ pin and one FIQ pin, which is why an external interrupt controller is
normally used. The controller allows the system to have more than one interrupt source and prioritizes the
sources; it then raises the core interrupt request, and the handler identifies which of the external interrupts was
raised and handles it.
3.1 How are interrupts assigned?
It is up to the system designer to decide which hardware peripheral can produce which interrupt
request. By using an interrupt controller we can connect multiple external interrupts to one of the ARM interrupt
requests and distinguish between them.
There is a standard design for assigning interrupts adopted by system designers:
• SWIs are normally used to call privileged operating system routines.
• IRQs are normally assigned to general purpose interrupts like periodic timers.
• FIQ is reserved for one single interrupt source that requires fast response time, like DMA or any time
critical task that requires fast response.

3.2 Interrupt Latency


Interrupt latency is the interval of time from an external interrupt signal being raised to the first fetch of
an instruction of the ISR for the raised interrupt signal.
System architects must balance two things: the first is to handle multiple interrupts simultaneously, and
the second is to minimize the interrupt latency.
Software handlers minimize the interrupt latency by two main methods. The first is to allow nested
interrupt handling, so that the system can respond to new interrupts while handling an older interrupt. This is
achieved by enabling interrupts immediately after the interrupt source has been serviced but before finishing the
interrupt handling. The second is to give priorities to the different interrupt sources; this is achieved by
programming the interrupt controller to ignore interrupts of the same or lower priority than the interrupt being
handled, if there is one.
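The priority rule in the last sentence can be sketched in C. Real interrupt controllers implement this in hardware, so the following is only a software model under assumptions: four sources, a larger number meaning a higher priority, and the names `select_interrupt`, `NUM_SOURCES` and `NO_SOURCE` are all hypothetical.

```c
#define NUM_SOURCES 4
#define NO_SOURCE  (-1)

/* Choose the pending source to service next.
   pending          bit n set means source n has raised a request
   priority         priority of each source (larger = more urgent)
   current_priority priority of the interrupt being handled, or -1 if none
   Requests of the same or lower priority than the current one are ignored. */
int select_interrupt(unsigned pending, const int priority[NUM_SOURCES],
                     int current_priority)
{
    int best = NO_SOURCE;
    int best_priority = current_priority;
    for (int src = 0; src < NUM_SOURCES; src++) {
        if ((pending & (1u << src)) != 0u && priority[src] > best_priority) {
            best = src;
            best_priority = priority[src];
        }
    }
    return best;
}
```

With nothing being handled the most urgent pending source wins; while a source of priority 2 is being handled, only sources with priority 3 or above are accepted.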
3.3 IRQ and FIQ exceptions
Each of these exceptions can occur only when its interrupt mask bit in the CPSR is clear. The ARM
processor will finish executing the current instruction in the pipeline before handling the interrupt. The processor
hardware then goes through the following standard procedure:
• The processor changes to a specific mode depending on the received interrupt.
• The previous mode CPSR is saved in SPSR of the new mode.
• The PC is saved in the LR of the new mode.
• Interrupts are disabled, either IRQ or both IRQ and FIQ.
• The processor branches to a specific entry in the vector table.
Enabling or disabling FIQ and IRQ exceptions is done in three steps: first load the contents of the CPSR,
then set or clear the required mask bit, and finally copy the updated contents back to the CPSR.
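The middle step of this read-modify-write sequence is plain bit manipulation, sketched here in C on an ordinary variable standing in for the CPSR. On the processor itself the first and last steps are done with the MRS and MSR instructions in a privileged mode; the function names below are hypothetical.

```c
#include <stdint.h>

#define CPSR_I_BIT (1u << 7)   /* set = IRQ disabled */
#define CPSR_F_BIT (1u << 6)   /* set = FIQ disabled */

/* Disable the interrupts selected by mask: set their mask bits. */
uint32_t disable_interrupts(uint32_t cpsr, uint32_t mask)
{
    return cpsr | mask;
}

/* Enable the interrupts selected by mask: clear their mask bits. */
uint32_t enable_interrupts(uint32_t cpsr, uint32_t mask)
{
    return cpsr & ~mask;
}
```

For instance, starting from a CPSR value of 0x13 (Supervisor mode, both interrupts enabled), disabling IRQ gives 0x93 and disabling both gives 0xD3; the mode bits are untouched.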

3.4 Interrupt stack


Exception handling uses stacks extensively because each exception has a specific mode of operation, so
switching between modes occurs and the data of the previous mode must be saved before switching so that the
core can switch back to its old state successfully. Each mode has a dedicated register containing a stack pointer.
The design of these stacks depends on factors such as the operating system's requirements for stack design and
the target hardware's physical limits on size and position in memory. Most ARM-based systems have the stack
designed such that the top of it is located at a high memory address. A good stack design tries to avoid stack
overflow because this causes instability in embedded systems.
In the following figure we have two memory layouts which show how the stack is placed in memory:

First layout        Second layout

User Stack          Interrupt Stack
Heap                User Stack
Code                Heap
Interrupt Stack     Code
Vector Table        Vector Table

Figure 3 Typical Memory Layouts

The first is the traditional stack layout. The second layout has the advantage that when overflow occurs, the vector
table remains untouched so the system has the chance to correct itself.

4 Interrupt handling schemes


Here we introduce some interrupt handling schemes, with notes on the advantages and disadvantages of
each scheme.
4.1 Non-nested interrupt handling
This is the simplest interrupt handler. Interrupts are disabled until control is returned to the
interrupted task, so only one interrupt can be served at a time. That is why this scheme is not suitable for
complex embedded systems, which most probably have more than one interrupt source and require concurrent
handling. Figure 4 shows the steps taken to handle an interrupt:
When an IRQ exception is raised, the ARM processor disables further IRQ exceptions. The mode is
changed to the new mode depending on the raised exception, and the CPSR is copied to the SPSR of the new
mode. Then the PC is set to the correct entry in the vector table, and the instruction there directs the PC to the
appropriate handler. Then the context of the current task is saved, that is, a subset of the current mode's
non-banked registers. Then the interrupt handler executes some code to identify the interrupt source and decide
which ISR will be called, and the appropriate ISR is called. Finally the context of the interrupted task is restored,
interrupts are enabled again and control is returned to the interrupted task.

[Figure: flowchart. Interrupt -> Disable Interrupts -> Save Context -> Interrupt Handler -> ISR ->
Restore Context -> Enable Interrupts -> Resume to Task]

Figure 4 Simple non nested interrupt handlers

4.1.1 Non-nested interrupt handling summary:

• Handle and service individual interrupts sequentially.


• High interrupt latency.
• Relatively easy to implement and debug.
• Not suitable for complex embedded systems.

4.2 Nested interrupt handling

In this scheme it is possible to handle more than one interrupt at a time. This is achieved by
re-enabling interrupts before the handler has fully served the current interrupt. This feature increases the
complexity of the system but improves the latency. The scheme should be designed carefully to protect
context saving and restoration from being interrupted. The designer should balance efficiency and safety
by using a defensive coding style that assumes problems will occur. The goal of nested handling is to respond to
interrupts quickly and to execute periodic tasks without any delays. Re-enabling interrupts requires switching out
of IRQ mode to SVC or System mode, to protect the link register from being corrupted. Also, performing a context
switch requires emptying the IRQ stack, because the handler will not perform the switch if there is data on the
IRQ stack, so all registers saved on the IRQ stack have to be transferred to the task stack. The part of the task
stack used in this process is called the stack frame.
The main disadvantage of this interrupt handling scheme is that it does not distinguish between
interrupts by priority, so a lower priority interrupt can block higher priority interrupts.

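[Figure: flowchart. An interrupt disables further interrupts, saves the context and enters the interrupt
handler. If serving is complete, the handler restores the context and resumes the task. If it is not yet
complete, the handler prepares the stack, switches mode, constructs a frame and enables interrupts, then
completes serving the interrupt (further interrupts may arrive during this step) before restoring the context
and resuming the task.]

Figure 5 Nested Interrupt Handling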

4.2.1 Nested interrupt handling summary:

• Handle multiple interrupts without a priority assignment.


• Medium or high interrupt latency.
• Enable interrupts before the servicing of an individual interrupt is complete.
• No prioritization, so low priority interrupts can block higher priority interrupts.

4.3 Prioritized simple interrupt handling

In this scheme the handler associates a priority level with a particular interrupt source. A higher
priority interrupt takes precedence over a lower priority interrupt. Prioritization can be done in
software or in hardware. With hardware prioritization the handler is simpler to design, because the
interrupt controller supplies the interrupt signal of the highest priority interrupt requiring service. On the
other hand, the system needs more initialization code at start-up, since priority level tables have to be
constructed before the system can be switched on.
Figure 6 Priority Interrupt Handler

When an interrupt signal is raised, a fixed number of comparisons with the available set of priority levels
is made, so the interrupt latency is deterministic. At the same time this could be considered a disadvantage,
because both high and low priority interrupts take the same amount of time.

4.3.1 Prioritized simple interrupt handling summary:


• Handle multiple interrupts with a priority assignment mechanism.
• Low interrupt latency.
• Deterministic interrupt latency.
• Time taken to get to a low priority ISR is the same as for high priority ISR.
4.4 Other schemes:
There are some other schemes for handling interrupts. Designers have to choose a suitable one
depending on the system being designed.
4.4.1 Re-entrant interrupt handler:
The basic difference between this scheme and nested interrupt handling is that interrupts are
re-enabled early in the re-entrant interrupt handler, which can reduce interrupt latency. The interrupt source
is disabled before interrupts are re-enabled, to protect the system from an infinite interrupt sequence. This is
done by using a mask in the interrupt controller. The mask also makes it possible to prioritize interrupts, but
this handler is more complex.

4.4.2 Prioritized standard interrupt handler:


This is an alternative to the prioritized simple interrupt handler. Its advantage is that higher
priority interrupts see a lower interrupt latency than lower priority ones. Its disadvantage is that the
interrupt latency is nondeterministic.
4.4.3 Prioritized grouped interrupt handler:
This handler is designed to handle a large number of interrupts by grouping interrupts together and
forming subsets, each of which can have a priority level. Grouping reduces the complexity of the handler,
since it does not scan through every interrupt to determine the priority. If the prioritized grouped interrupt
handler is well designed, it will improve the overall system response times dramatically. On the other hand, if
it is badly designed, so that interrupts are not well grouped, some important interrupts will be treated as low
priority interrupts and vice versa. The most complex and possibly critical part of such a scheme is the decision
on which interrupts should be grouped together in one

Inline assembly.

We can instruct the compiler to insert the code of a function into the code of its callers, at the point where
the call would otherwise be made. Such functions are inline functions. In this respect they are similar to macros.

Benefit of inline:

Inlining reduces the function-call overhead. If any of the actual argument values are
constant, their known values may permit simplifications at compile time, so that not all of the inline function's
code needs to be included. The effect on code size is less predictable and depends on the particular case. To
declare an inline function, use the keyword inline in its declaration.

Now we are in a position to guess what inline assembly is. It’s just some assembly routines written as
inline functions. They are handy, speedy and very much useful in system programming. Our main focus is to
study the basic format and usage of (GCC) inline assembly functions. To declare inline assembly functions, we
use the keyword asm.

Inline assembly is important primarily because it can operate on C variables and make its results
visible in them. Because of this capability, "asm" works as an interface between the assembly instructions and
the "C" program that contains them.

Inline assembler syntax:


The ARM compiler supports an extended inline assembler syntax, introduced by the asm keyword (C++), or
the __asm keyword (C and C++). The syntax for these keywords is described in the following sections:

• Inline assembly with the __asm keyword
• Inline assembly with the asm keyword
• Rules for using __asm and asm.


You can use an asm or __asm statement anywhere a statement is expected.

Inline assembly with the __asm keyword:


The inline assembler is invoked with the assembler specifier, and is followed by a list of assembler
instructions inside braces or parentheses. You can specify inline assembler code using the following formats:

• On a single line, for example:
    __asm("instruction[;instruction]"); // Must be a single string
    __asm{instruction[;instruction]}
Note: You cannot include comments.

• On multiple lines, for example:
    __asm
    {
        ...
        instruction
        ...
    }
You can use C or C++ comments anywhere in an inline assembly language block.

Rules for using __asm and asm:


Follow these rules when using the __asm and asm keywords:

• If you include multiple instructions on the same line, you must separate them with a semicolon (;). If you
use double quotes, you must enclose all the instructions within a single set of double quotes (").
• If an instruction requires more than one line, you must specify the line continuation with the backslash
character (\).
• For the multiple line format, you can use C or C++ comments anywhere in the inline assembly language
block. However, you cannot embed comments in a line that contains multiple instructions.
• The comma (,) is used as a separator in assembly language, so C expressions with the comma operator
must be enclosed in parentheses to distinguish them:
    __asm
    {
        ADD x, y, (f(), z)
    }
• An asm statement must be inside a C++ function. An asm statement can be used anywhere a C++
statement is expected.
• Register names in the inline assembler are treated as C or C++ variables. They do not necessarily relate
to the physical register of the same name. If you do not declare the register as a C or C++ variable, the
compiler generates a warning.
• Do not save and restore registers in inline assembler. The compiler does this for you. Also, the inline
assembler does not provide direct access to the physical registers. If registers other than CPSR and SPSR
are read without being written to, an error message is issued.

For example:
int f(int x)
{
    __asm
    {
        STMFD sp!, {r0} // save r0 - illegal: read before write
        ADD r0, x, 1
        EOR x, r0, x
        LDMFD sp!, {r0} // restore r0 - not needed.
    }
    return x;
}
The function must be written as:

int f(int x)
{
    int r0;
    __asm
    {
        ADD r0, x, 1
        EOR x, r0, x
    }
    return x;
}
Restrictions on inline assembly operations:

There are a number of restrictions on the operations that can be performed in inline assembly code.
These restrictions provide a measure of safety, and ensure that the assumptions in compiled C and C++ code are
not violated in the assembled assembly code.
The inline assembler has the following restrictions:
• The inline assembler is a high-level assembler, and the code it generates might not always be exactly what you
write. Do not use it to generate more efficient code than the compiler generates. Use embedded assembler or
the ARM assembler armasm for this purpose.
• Some low-level features that are available in the ARM assembler armasm, such as branching and writing to
PC, are not supported.
• Label expressions are not supported.
• You cannot get the address of the current instruction using dot notation (.) or {PC}.
• The & operator cannot be used to denote hexadecimal constants. Use the 0x prefix instead. For example:
__asm { AND x, y, 0xF00 }

• The notation to specify the actual rotation of an 8-bit constant is not available in inline assembly language.
This means that where an 8-bit shifted constant is used, the C flag must be regarded as corrupted if the NZCV
flags are updated.
• You must not modify the stack. This is not necessary because the compiler automatically stacks and restores
any working registers as required. The compiler does not permit you to explicitly stack and restore work
registers.

Embedded assembler:
The ARM compiler enables you to include assembly code out-of-line, in one or more C or C++ function
definitions. Embedded assembler provides unrestricted, low-level access to the target processor, enables you to
use the C and C++ preprocessor directives, and gives easy access to structure member offsets.

Embedded assembler syntax:


An embedded assembly function definition is marked by the __asm (C and C++) or asm (C++) function
qualifiers, and can be used on:
• member functions
• non-member functions
• template functions
• template class member functions.
Functions declared with __asm or asm can have arguments, and return a type. They are
called from C and C++ in the same way as normal C and C++ functions. The syntax of
an embedded assembly function is:
__asm return-type function-name(parameter-list)
{
// ARM/Thumb/Thumb-2 assembler code
instruction{;comment is optional}
...
instruction
}
The initial state of the embedded assembler (ARM or Thumb) is determined by the initial state of the
compiler, as specified on the command line. This means that:
• if the compiler starts in ARM state, the embedded assembler uses --arm
• if the compiler starts in Thumb state, the embedded assembler uses --thumb.
The embedded assembler state at the start of each function is as set by the invocation of the compiler, as
modified by #pragma arm and #pragma thumb pragmas.
You can change the state of the embedded assembler within a function by using explicit ARM, THUMB,
or CODE16 directives in the embedded assembler function. Such a directive within an __asm function does not
affect the ARM or Thumb state of subsequent __asm functions.
If you are compiling for a Thumb-2 capable processor, you can use Thumb-2 instructions when in
Thumb state.
Argument names are permitted in the parameter list, but they cannot be used in the body of the embedded
assembly function. For example, the following function uses the integer i in the body of the function, but this
is not valid in assembly:
__asm int f(int i)
{
ADD i, i, #1 // error
}
You can use, for example, r0 instead of i.

Embedded assembler example:

Example 7-1 shows a string copy routine as an embedded assembler routine.

#include <stdio.h>
__asm void my_strcpy(const char *src, char *dst)
{
loop
    LDRB r2, [r0], #1 // load a byte from src (r0) and advance
    STRB r2, [r1], #1 // store it to dst (r1) and advance
    CMP r2, #0 // was it the terminating null?
    BNE loop // no: copy the next byte
    BX lr // yes: return to the caller
}
int main(void)
{
    const char *a = "Hello world!";
    char b[20];
    my_strcpy (a, b);
    printf("Original string: '%s'\n", a);
    printf("Copied string: '%s'\n", b);
    return 0;
}
Example 7-1 String copy with embedded assembler

Restrictions on embedded assembly:


The following restrictions apply to embedded assembly functions:
• After preprocessing, __asm functions can only contain assembly code, with the
exception of the following identifiers:
__cpp(expr)
__offsetof_base(D, B)
__mcall_is_virtual(D, f)
__mcall_is_in_vbase (D, f)
__mcall_offsetof_base (D, f)
__mcall_this_offset(D, f)
__vcall_offsetof_vfunc (D, f)
• No return instructions are generated by the compiler for an __asm function. If you want to return from an __asm
function, then you must include the return instructions, in assembly code, in the body of the function.

Note: This makes it possible to fall through to the next function, because the embedded assembler guarantees to
emit the __asm functions in the order you have defined them. However, inlined and template functions behave
differently.

• __asm functions do not change the AAPCS rules that apply. This means that all calls between an __asm function
and a normal C or C++ function must adhere to the AAPCS, even though there are no restrictions on the assembly
code that an __asm function can use (for example, change state).

Differences between expressions in embedded assembly and C or C++

Following are the differences between embedded assembly and C or C++:


• Assembler expressions are always unsigned. The same expression might have different values between
assembler and C or C++. For example:
MOV r0, #(-33554432 / 2) // result is 0x7f000000
MOV r0, #__cpp(-33554432 / 2) // result is 0xff000000
• Assembler numbers with leading zeros are still decimal. For example:
MOV r0, #0700 // decimal 700
MOV r0, #__cpp (0700) // octal 0700 == decimal 448
• Assembler operator precedence differs from C and C++. For example:
MOV r0, # (0x23 :AND: 0xf + 1) // ((0x23 & 0xf) + 1) => 4
MOV r0, #__cpp(0x23 & 0xf + 1) // (0x23 & (0xf + 1)) => 0
• Assembler strings are not null-terminated:
DCB "Hello world!" // 12 bytes (no trailing null)
DCB __cpp ("Hello world!") // 13 bytes (trailing null)

Calling between C, C++, and ARM assembly language:


This section gives general rules for calling between languages. It provides examples that
may help you to call C and assembly language code from C++, and to call C++ code from C and assembly
language. It also describes calling conventions and data types.
You can mix calls between C and C++ and assembly language routines provided you follow the
appropriate procedure call standard. For more information on the APCS and TPCS,
Note:
The information in this section is implementation dependent and may change in future toolkit releases.
8.4.1 General rules for calling between languages:
The following general rules apply to calling between C, C++, and assembly language. You should not
rely on the following C++ implementation details. These implementation details are subject to change in future
releases of ARM C++:
• the way names are mangled
• the way the implicit this parameter is passed
• the way virtual functions are called
• the representation of references
• the layout of C++ class types that have base classes or virtual member functions
• the passing of class objects that are not plain old data (POD) structures.
The following general rules apply to mixed language programming:

• Use C calling conventions.


• In C++, non-member functions may be declared as extern "C" to specify that they have C linkage.
In this release of the ARM Software Development Toolkit, having C linkage means that the symbol
defining the function is not mangled. C linkage can be used to implement a function in one language
and call it from another. Note that functions that are declared extern "C" cannot be overloaded.
• Assembly language modules must conform to the appropriate ARM or Thumb Procedure Calls
Standard.

The following rules apply to calling C++ functions from C and assembly language:

• To call a global (non-member) C++ function, declare it extern "C" to give it C linkage.
• Member functions (both static and non-static) always have mangled names. You can determine the
mangled symbol by using decaof -sm on the object file that defines the function.
• C++ inline functions cannot be called from C unless you ensure that the C++ compiler generates an
out-of-line copy of the function. For example, taking the address of the function results in an
out-of-line copy.
• Non-static member functions receive the implicit this parameter as a first argument in r0, or as a
second argument in r1 if the function returns a non int-like structure. Static member functions do not
receive an implicit this parameter.

Examples

The following code examples demonstrate how to:

• call assembly language from C


• call C from assembly language
• call C and assembly language functions from C++
• call C++ functions from C and assembly language
• call a non-static, non-virtual C++ member function from C or assembly language
• pass references between C and C++ functions.
The examples assume an APCS variant with no software stack checking and no frame pointer.
Example 8-9 shows a C program that uses a call to an assembly language subroutine to
copy one string over the top of another string.

Example:
Calling assembly language from C

#include <stdio.h>
extern void strcopy (char *d, const char *s);
int main (void)
{
    const char *srcstr = "First string - source ";
    /* Writable array: the destination must not be a string literal. */
    char dststr[] = "Second string - destination ";
    printf ("Before copying:\n");
    printf (" %s\n %s\n", srcstr, dststr);
    strcopy (dststr, srcstr);
    printf ("After copying:\n");
    printf (" %s\n %s\n", srcstr, dststr);
    return (0);
}

The ARM assembly language module that implements the string copy subroutine:

        AREA SCopy, CODE, READONLY
        EXPORT strcopy
strcopy
        ; r0 points to destination string.
        ; r1 points to source string.
        LDRB r2, [r1], #1 ; Load byte and update address.
        STRB r2, [r0], #1 ; Store byte and update address.
        CMP r2, #0 ; Check for zero terminator.
        BNE strcopy ; Keep going if not.
        MOV pc,lr ; Return.
        END

Example 8-10

Calling C from assembly language

Define the function in C to be called from the assembly language routine:

int g(int a, int b, int c, int d, int e) { return a + b + c + d +e; }

In assembly language:

; int f(int i) { return -g(i, 2*i, 3*i, 4*i, 5*i); }


EXPORT f
AREA f, CODE, READONLY
IMPORT g
STR lr, [sp, #-4]! ; preserve lr
ADD r1, r0, r0 ; compute 2*i (2nd param)
ADD r2, r1, r0 ; compute 3*i (3rd param)
ADD r3, r1, r2 ; compute 5*i
STR r3, [sp, #-4]! ; 5th param on stack
ADD r3, r1, r1 ; compute 4*i (4th param)
BL g ; branch to C function
ADD sp, sp, #4 ; remove 5th param
RSB r0, r0, #0 ; negate result
LDR pc, [sp], #4 ; return
END
Chapter 6
Programming ARM in C
Introduction:

This chapter provides the basic rules to follow when programming the ARM7 in C. We discuss data types,
symbols, declaring variables, defining labels, unary operators, binary operators, addition, subtraction, shift
operators, relational operators, Boolean operators and logical operators, which are widely used in programming.

Rules for naming symbols:

Follow these general rules when naming symbols:
• Symbol names must be unique within their scope.
• Uppercase letters, lowercase letters, numeric characters, or the underscore character are legal symbol
names. Symbol names are case-sensitive, and all characters in the symbol name are significant.
• Numeric characters should not be used for the first character of symbol names, except in local labels.
• Symbols must not use the same name as built-in variable names or predefined symbol names.
• If you use the same name as an instruction mnemonic or directive, use double bars to delimit the
symbol name. For example: ||ASSERT|| The bars are not part of the symbol.
• You must not use the symbols |$a|, |$t|, |$t.x|, or |$d| as program labels. These are mapping symbols used
to mark the beginning of ARM, Thumb, ThumbEE, and data within the object file.
• Symbols beginning with the characters $v are mapping symbols that are related to VFP and might be
output when building for a target with VFP. You are recommended to avoid using symbols beginning
with $v in your source code.
If you have to use a wider range of characters in symbols, for example, when working with compilers,
use single bars to delimit the symbol name. For example: |.text| The bars are not part of the symbol. You
cannot use bars, semicolons, or newlines within the bars.
Variables
The value of a variable can be changed as assembly proceeds. Variables are local to the assembler. This means that
in the generated code or data, every instance of the variable has a fixed value.
Variables are of three types:
• Numerical
• Logical
• String.
The type of a variable cannot be changed.
The range of possible values of a numeric variable is the same as the range of possible values of a numeric
constant or numeric expression.
The possible values of a logical variable are {TRUE} or {FALSE}.
The range of possible values of a string variable is the same as the range of values of a string expression.
Use the GBLA, GBLL, GBLS, LCLA, LCLL, and LCLS directives to declare symbols representing variables, and
assign values to them using the SETA, SETL, and SETS directives.
8.2.1 Example
x SETA 100 ; Value of 'x' is 100
L1 MOV R1, #(x*2) ; In the object file, this is MOV R1, #200
x SETA 500 ; Value of 'x' is 500 only after this point.
; The previous instruction will always be MOV R1, #200

BNE L1 ; When the processor branches to L1, it executes
; MOV R1, #200
Numeric constants
These constants are 32-bit integers. They can be used as unsigned numbers in the range 0 to 2^32 - 1, or signed
numbers in the range -2^31 to 2^31 - 1. However, the assembler does not differentiate between -n and 2^32 - n.
Relational operators such as >= use the unsigned interpretation. This means that 0 > -1 is {FALSE}. The EQU
directive can be used to define constants. Once the value of a numeric constant is defined, it cannot be changed.
Expressions can be defined by combining numeric constants and binary operators.
Assembly time substitution of variables
A string variable can be used for a whole line of assembly language, or any part of a line. Use the variable
with a $ prefix in the places where the value is to be substituted for the variable. The dollar character
instructs the assembler to substitute the string into the source code line before checking the syntax of the
line. The assembler faults if the substituted line is larger than the source line limit.

Numeric and logical variables can also be substituted. The current value of the variable is converted to a
hexadecimal string (or T or F for logical variables) before substitution.

Use a dot to mark the end of the variable name if the following character would be permissible in a symbol name.
You must set the contents of the variable before you can use it.

If you require a $ that you do not want to be substituted, use $$. This is converted to a single $.

You can include a variable with a $ prefix in a string. Substitution occurs in the same way as anywhere else.
Substitution does not occur within vertical bars, except that vertical bars within double quotes do not affect
substitution.
8.4.1 Example
;Substitution
GBLS v1
GBLS v2
GBLS here
GBLA count
;
count SETA 10
v1 SETS "a$$b$count" ; v1 now has value a$b0000000A
v2 SETS "abc"
here SETS "|xy$v2.z|" ; here now has value |xyabcz|
|C$$code| MOV r4,#16 ; but the label here is C$$code

Register-relative and PC-relative expressions


Addresses can be represented as a register-relative or PC-relative expression.

A register-relative expression evaluates to a named register combined with a numeric expression.

A PC-relative expression is written in source code as the PC or a label combined with a numeric expression. It can
also be expressed in the form [PC, #number]. It is represented in the instruction as the PC value plus or minus a
numeric offset. The assembler calculates the required offset from the label and the address of the current
instruction. If the offset is too big, the assembler produces an error.

It is recommended to write PC-relative expressions using labels rather than PC because the value of PC depends
on the instruction set.
Note
• In ARM state, the value of the PC is the address of the current instruction plus 8 bytes.
• In Thumb state:
— For B, BL, CBNZ, and CBZ instructions, the value of the PC is the address of the current instruction plus 4
bytes.
— For all other instructions that use labels, the value of the PC is the address of the current instruction plus 4
bytes, with bit[1] of the result cleared to 0 to make it word-aligned.
8.5.1 Example
LDR r4, =data+4*n ; n is an assembly-time variable
; code
MOV pc, lr
data DCD value_0
; n-1 DCD directives
DCD value_n ; data+4*n points here
; more DCD directives
Labels
Labels are symbols representing the memory addresses of instructions or data. The address can be PC-relative,
register-relative, or absolute. Labels are local to the source file unless you make them global using the EXPORT
directive.

The address given by a label is calculated during assembly. The assembler calculates the address of a label relative
to the origin of the section where the label is defined. A reference to a label within the same section can use the PC
plus or minus an offset. This is called PC-relative addressing.

Addresses of labels in other sections are calculated at link time, when the linker has allocated specific locations in
memory for each section.
Labels for PC-relative addresses
These represent the PC, plus or minus a numeric value. Use them as targets for branch instructions, or to access
small items of data embedded in code sections. You can define PC-relative labels using a label on an instruction or
on one of the data definition directives.

You can also use the section name of an AREA directive as a label for PC-relative addresses. In this case the label
points to the first byte of the specified AREA. Using AREA names as branch targets is not recommended because
when branching from ARM to Thumb state or Thumb to ARM state in this way, the processor does not change the
state properly.
Labels for register-relative addresses
These represent a named register plus a numeric value. They are most often used to access data in data sections.
You can define them with a storage map. You can use the EQU directive to define additional register-relative
labels, based on labels defined in storage maps.
Example 8-1 Storage map definitions
MAP 0, r9
MAP 0xff, r9

Labels for absolute addresses


These are numeric constants. They are integers in the range 0 to 2^32 - 1. They address the memory directly. You
can use labels to represent absolute addresses using the EQU directive. You can specify the absolute address as
ARM, Thumb, or data to ensure that the labels are used correctly when referenced in code.
Example 8-2 Defining labels for absolute address

abc EQU 9 ; assigns the value 9 to the symbol abc.
xyz EQU label+3 ; assigns the address (label+3) to the symbol xyz.
fiq EQU 0x1C, CODE32 ; assigns the absolute address 0x1C to the symbol fiq,
; and marks it as code

Local labels
Local labels are a subclass of label. A local label is a number in the range 0-99, optionally followed by a name.
Unlike other labels, a local label can be defined many times and the same number can be used for more than one
local label in an area.

Local labels do not appear in the object file. This means that, for example, a debugger cannot set a breakpoint
directly on a local label, like it can for labels kept using the KEEP directive.

A local label can be used in place of symbol in source lines in an assembly language module:
• On its own, that is, where there is no instruction or directive
• On a line that contains an instruction
• On a line that contains a code- or data-generating directive.
A local label is generally used where you might use a PC-relative label.
Local labels are typically used for loops and conditional code within a routine, or for small subroutines that are
only used locally. They are particularly useful when you are generating labels in macros.

The scope of local labels is limited by the AREA directive. Use the ROUT directive to limit the scope of local
labels more tightly. A reference to a local label refers to a matching label within the same scope. If there is no
matching label within the scope in either direction, the assembler generates an error message and the assembly
fails.
You can use the same number for more than one local label even within the same scope. By default, the assembler
links a local label reference to:
• The most recent local label of the same number, if there is one within the scope
• The next following local label of the same number, if there is not a preceding one within the scope.
Use the optional parameters to modify this search pattern if required.

Syntax of local labels


The syntax of a local label is:
n{routname}
The syntax of a reference to a local label is:
%{F|B}{A|T}n{routname}
Where:
n is the number of the local label in range 0-99.
routname is the name of the current scope.
% introduces the reference.
F instructs the assembler to search forwards only.
B instructs the assembler to search backwards only.
A instructs the assembler to search all macro levels.
T instructs the assembler to look at this macro level only.

If neither F nor B is specified, the assembler searches backwards first, then forwards.
If neither A nor T is specified, the assembler searches all macros from the current level to the top level, but does
not search lower level macros.
If routname is specified in either a label or a reference to a label, the assembler checks it against the name of the
nearest preceding ROUT directive. If it does not match, the assembler generates an error message and the
assembly fails.
String expressions
String expressions consist of combinations of string literals, string variables, string manipulation operators, and
parentheses.

Characters that cannot be placed in string literals can be placed in string expressions using the :CHR: unary
operator. Any ASCII character from 0 to 255 is permitted.

The value of a string expression cannot exceed 5120 characters in length. It can be of zero length.
8.12.1 Example
improb SETS "vasundhara" :CC: (strvar2 :LEFT: 4)
; sets the variable improb to the value "vasundhara" with the left-most four characters
; of the contents of string variable strvar2 appended
String literals
String literals consist of a series of characters or spaces contained between double quote characters. The length of
a string literal is restricted by the length of the input line.

To include a double quote character or a dollar character within the string literal, include the character twice as a
pair. For example, you must use $$ if you require a single $ in the string.

C string escape sequences are also enabled and can be used within the string, unless --no_esc is specified.

8.13.1 Examples
abc SETS "this string contains only one "" double quote"
def SETS "this string contains only one $$ dollar symbol"

Numeric expressions
Numeric expressions consist of combinations of numeric constants, numeric variables, ordinary numeric literals,
binary operators, and parentheses.
Numeric expressions can contain register-relative or program-relative expressions if the overall expression
evaluates to a value that does not include a register or the PC.
Numeric expressions evaluate to 32-bit integers. You can interpret them as unsigned numbers in the range 0 to
2^32-1, or signed numbers in the range -2^31 to 2^31-1. However, the assembler
makes no distinction between -n and 2^32-n. Relational operators such as >= use the unsigned interpretation. This
means that 0 > -1 is {FALSE}.
8.14.1 Example
x SETA 256*256 ; 256*256 is a numeric expression
MOV r1, #(x*22) ; (x*22) is a numeric expression
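
The unsigned interpretation described above can be modelled in C with 32-bit unsigned arithmetic. This is only an illustrative sketch and not part of the assembler; the function names asm_value and asm_greater are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Model a 32-bit numeric expression value: -n and 2^32 - n share the
   same bit pattern, because the conversion is performed modulo 2^32. */
uint32_t asm_value(int64_t n)
{
    return (uint32_t)n;
}

/* Model of the > relational operator, which compares the unsigned
   interpretation of both operands. */
bool asm_greater(int64_t a, int64_t b)
{
    return asm_value(a) > asm_value(b);
}
```

With this model, asm_value(-1) is 0xFFFFFFFF, so asm_greater(0, -1) is false, which matches the statement that 0 > -1 is {FALSE}.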
Numeric literals
Numeric literals can take any of the following forms:
decimal-digits
0xhexadecimal-digits
&hexadecimal-digits
n_base-n-digits
'character'

where:
decimal-digits Is a sequence of characters using only the digits 0 to 9.

hexadecimal-digits Is a sequence of characters using only the digits 0 to 9 and the letters A to F or a to f.

n_ Is a single digit between 2 and 9 inclusive, followed by an underscore character.

base-n-digits Is a sequence of characters using only the digits 0 to (n –1)

character Is any single character except a single quote. Use the standard C escape character (\') if you require a
single quote. The character must be enclosed within opening and closing single quotes. In this case the value of the
numeric literal is the numeric code of the character.

You must not use any other characters. The sequence of characters must evaluate to an integer in the range 0 to
2^32-1 (except in DCQ and DCQU directives, where the range is 0 to 2^64-1).

8.15.1 Examples

a SETA 34906 ;Loads a with decimal value of 34906


addr DCD 0xA10E ; Loads addr with hex value A10E
LDR r4, =&1000000F ; Loads R4 with hex value 1000000F
DCD 2_11001010 ;
c3 SETA 8_74007
DCQ 0x0123456789abcdef
LDR r1, ='A' ; pseudo-instruction loading 65 into r1
ADD r3, r2, #'\'' ; add 39 (the ASCII value of ') to the contents of r2, result to r3
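
To make the n_base-n-digits form concrete, the following C sketch evaluates such a literal by hand. It is an assumption-laden illustration of the rule stated above, not the assembler's own parser; the function name parse_base_n is hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Evaluate a literal of the form n_digits, where n is a single digit
   from 2 to 9 and every following digit lies in the range 0 to n-1.
   Returns true and stores the value in *out on success. */
bool parse_base_n(const char *s, uint64_t *out)
{
    if (s[0] < '2' || s[0] > '9' || s[1] != '_' || s[2] == '\0')
        return false;

    unsigned base = (unsigned)(s[0] - '0');
    uint64_t value = 0;

    for (const char *p = s + 2; *p != '\0'; ++p) {
        if (*p < '0' || (unsigned)(*p - '0') >= base)
            return false;               /* digit outside 0 to n-1 */
        value = value * base + (unsigned)(*p - '0');
        if (value > 0xFFFFFFFFull)
            return false;               /* outside the range 0 to 2^32-1 */
    }
    *out = value;
    return true;
}
```

For the examples above, 2_11001010 evaluates to 202 and 8_74007 evaluates to 30727.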
Floating-point literals
Floating-point literals can take any of the following forms:

{-}digitsE{-}digits
{-}{digits}.digits
{-}{digits}.digitsE{-}digits
0xhexdigits
&hexdigits
0f_hexdigits
0d_hexdigits
where:
digits Are sequences of characters using only the digits 0 to 9. You can write E in
uppercase or lowercase. These forms correspond to normal floating-point
notation.
hexdigits Are sequences of characters using only the digits 0 to 9 and the letters A to F or
a to f. These forms correspond to the internal representation of the numbers in the
computer. Use these forms to enter infinities and NaNs, or if you want to be sure
of the exact bit patterns you are using.
The 0x and & forms allow the floating-point bit pattern to be specified by any number of hex
digits.
The 0f_ form requires the floating-point bit pattern to be specified by exactly 8 hex digits.
The 0d_ form requires the floating-point bit pattern to be specified by exactly 16 hex digits.
The range for single-precision floating-point values is:
• maximum 3.40282347e+38
• minimum 1.17549435e–38.
The range for double-precision floating-point values is:
• maximum 1.79769313486231571e+308
• minimum 2.22507385850720138e–308.
Floating-point numbers are only available if your system has VFP, or NEON with
floating-point.
8.16.1 Examples
DCFD 1E308,-4E-100
DCFS 1.0
DCFS 0.02
DCFD 3.725e15
DCFS 0x7FC00000; Quiet NaN
DCFD &FFF0000000000000; Minus infinity
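
The hexadecimal forms specify the bit pattern of the value directly. The C sketch below reinterprets the same patterns on the host, assuming the host uses IEEE 754 single and double precision (true of most desktop compilers, but an assumption nonetheless). The function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>
#include <math.h>

/* Reinterpret a 32-bit pattern as a single-precision value. */
float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reinterpret a 64-bit pattern as a double-precision value. */
double double_from_bits(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Nonzero when a 32-bit pattern encodes a NaN. */
int is_nan_pattern(uint32_t bits)
{
    return isnan(float_from_bits(bits)) ? 1 : 0;
}
```

Under that assumption, 0x7FC00000 decodes to a NaN and 0xFFF0000000000000 decodes to minus infinity, as the comments in the examples state.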

Logical expressions

Logical expressions consist of combinations of logical literals ({TRUE} or {FALSE}), logical variables, Boolean
operators, relations, and parentheses.

Relations consist of combinations of variables, literals, constants, or expressions with appropriate relational
operators.

Logical literals
The logical or boolean literals can have one of two values:
• {TRUE}
• {FALSE}.
Unary operators

Unary operators have the highest precedence and are evaluated first. A unary operator precedes its operand.
Adjacent operators are evaluated from right to left.

Table 8-1 lists the unary operators that return strings.

Table 8-2 lists the unary operators that return numeric or logical values.

Table 8-1 Unary operators that return strings

Operator        Usage                     Description

:CHR:           :CHR:A                    Returns the character with ASCII code A.
:LOWERCASE:     :LOWERCASE:string         Returns the given string, with all uppercase
                                          characters converted to lowercase.
:REVERSE_CC:    :REVERSE_CC:cond_code     Returns the inverse of the condition code in
                                          cond_code, or an error if cond_code does not
                                          contain a valid condition code.
:STR:           :STR:A                    Returns an 8-digit hexadecimal string
                                          corresponding to a numeric expression, or the
                                          string "T" or "F" if used on a logical expression.
:UPPERCASE:     :UPPERCASE:string         Returns the given string, with all lowercase
                                          characters converted to uppercase.

Table 8-2 Unary operators that return numeric or logical values

Operator          Usage                      Description

?                 ?A                         Number of bytes of executable code
                                             generated by line defining symbol A.
+ and -           +A                         Unary plus. Unary minus. + and - can act
                  -A                         on numeric and PC-relative expressions.
:BASE:            :BASE:A                    If A is a PC-relative or register-relative
                                             expression, :BASE: returns the number
                                             of its register component. :BASE: is
                                             most useful in macros.
:CC_ENCODING:     :CC_ENCODING:cond_code     Returns the numeric value of the
                                             condition code in cond_code, or an error
                                             if cond_code does not contain a valid
                                             condition code.
:DEF:             :DEF:A                     {TRUE} if A is defined, otherwise
                                             {FALSE}.
:INDEX:           :INDEX:A                   If A is a register-relative expression,
                                             :INDEX: returns the offset from that
                                             base register. :INDEX: is most useful in
                                             macros.
:LEN:             :LEN:A                     Length of string A.
:LNOT:            :LNOT:A                    Logical complement of A.
:NOT:             :NOT:A                     Bitwise complement of A (~ is an alias,
                                             for example ~A).
:RCONST:          :RCONST:Rn                 Number of register, 0-15
                                             corresponding to R0-R15.

Binary operators
Binary operators are written between the pair of sub-expressions they operate on.
Binary operators have lower precedence than unary operators. Binary operators appear in this section in order of
precedence.
Note
The order of precedence is not the same as in C.
Multiplicative operators
Multiplicative operators have the highest precedence of all binary operators. They act only on numeric
expressions.
Table 8-3 shows the multiplicative operators.

Operator    Alias    Usage       Explanation

*                    A*B         Multiply
/                    A/B         Divide
:MOD:       %        A:MOD:B     A modulo B
String manipulation operators
Table 8-4 shows the string manipulation operators. In CC, both A and B must be strings. In the
slicing operators LEFT and RIGHT:
• A must be a string
• B must be a numeric expression.

Table 8-4 String manipulation operators

Operator Usage Explanation


:CC: A:CC:B B concatenated onto the end of A
:LEFT: A:LEFT:B The left-most B characters of A
:RIGHT: A:RIGHT:B The right-most B characters of A
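
As an informal illustration of what the three string operators compute, here is a C sketch with hypothetical helper names. It clamps the requested length to the length of the string, which is a choice made for this sketch; how armasm treats an out-of-range B is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* A:LEFT:B : copy the left-most n characters of a into out. */
void str_left(const char *a, size_t n, char *out, size_t cap)
{
    assert(cap > 0);
    size_t len = strlen(a);
    if (n > len) n = len;
    if (n >= cap) n = cap - 1;
    memcpy(out, a, n);
    out[n] = '\0';
}

/* A:RIGHT:B : copy the right-most n characters of a into out. */
void str_right(const char *a, size_t n, char *out, size_t cap)
{
    assert(cap > 0);
    size_t len = strlen(a);
    if (n > len) n = len;
    if (n >= cap) n = cap - 1;
    memcpy(out, a + (len - n), n);
    out[n] = '\0';
}

/* A:CC:B : b concatenated onto the end of a (truncated to cap-1). */
void str_cc(const char *a, const char *b, char *out, size_t cap)
{
    assert(cap > 0);
    snprintf(out, cap, "%s%s", a, b);
}
```

For instance, if strvar2 held "pradesh" (a made-up value), "vasundhara":CC:(strvar2:LEFT:4) would correspond to str_left followed by str_cc, giving "vasundharaprad".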

Shift operators
Shift operators act on numeric expressions, shifting or rotating the first operand by the amount specified by the
second.
Table 8-5 shows the shift operators.
Note
SHR is a logical shift and does not propagate the sign bit.

Table 8-5 Shift operators


Operator    Alias    Usage       Explanation
:ROL:                A:ROL:B     Rotate A left by B bits
:ROR:                A:ROR:B     Rotate A right by B bits
:SHL:       <<       A:SHL:B     Shift A left by B bits
:SHR:       >>       A:SHR:B     Shift A right by B bits
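
The four shift operators can be sketched in C on 32-bit unsigned values as follows. The function names are hypothetical, and the handling of shift amounts of 32 or more (rotations taken modulo 32, shifts returning 0) is an assumption of this sketch rather than documented assembler behaviour.

```c
#include <stdint.h>

/* A:ROL:B : rotate a left by b bits. */
uint32_t asm_rol(uint32_t a, unsigned b)
{
    b &= 31u;
    return b ? (a << b) | (a >> (32u - b)) : a;
}

/* A:ROR:B : rotate a right by b bits. */
uint32_t asm_ror(uint32_t a, unsigned b)
{
    b &= 31u;
    return b ? (a >> b) | (a << (32u - b)) : a;
}

/* A:SHL:B : shift a left by b bits, filling with zeros. */
uint32_t asm_shl(uint32_t a, unsigned b)
{
    return b < 32u ? a << b : 0u;
}

/* A:SHR:B : logical shift right; the sign bit is not propagated. */
uint32_t asm_shr(uint32_t a, unsigned b)
{
    return b < 32u ? a >> b : 0u;
}
```

Because the operands are unsigned, asm_shr(0x80000000, 31) gives 1 rather than 0xFFFFFFFF, which is the logical-shift behaviour the note describes.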

Addition, subtraction, and logical operators


Addition and subtraction operators act on numeric expressions.
Logical operators act on numeric expressions. The operation is performed bitwise, that is, independently on each
bit of the operands to produce the result.

Table 8-6 shows addition, subtraction, and logical operators.

The use of | as an alias for :OR: is deprecated.

Table 8-6 Addition, subtraction, and logical operators


Operator    Alias    Usage       Explanation
+                    A+B         Add A to B
-                    A-B         Subtract B from A
:AND:       &        A:AND:B     Bitwise AND of A and B
:EOR:       ^        A:EOR:B     Bitwise Exclusive OR of A and B
:OR:                 A:OR:B      Bitwise OR of A and B

Relational operators
Table 8-7 shows the relational operators. These act on two operands of the same type to produce a logical value.
The operands can be one of:
• numeric
• PC-relative
• register-relative
• strings.
Strings are sorted using ASCII ordering. String A is less than string B if it is a leading substring of string B, or if
the left-most character in which the two strings differ is less in string A than in string B.
Arithmetic values are unsigned, so the value of 0>-1 is {FALSE}.

Table 8-7 Relational operators


Operator    Alias     Usage     Explanation
=           ==        A=B       A equal to B
>                     A>B       A greater than B
>=                    A>=B      A greater than or equal to B
<                     A<B       A less than B
<=                    A<=B      A less than or equal to B
/=          <> !=     A/=B      A not equal to B
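
The string ordering rule described above is the same lexicographic rule that the C library function strcmp applies, so it can be sketched in one line. The helper name str_less is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* A < B for strings: A is smaller if it is a leading substring of B,
   or if the first differing character is smaller in A than in B. */
bool str_less(const char *a, const char *b)
{
    return strcmp(a, b) < 0;
}
```

So "abc" is less than "abcd" (leading substring) and "abc" is less than "abd" (first differing character), while equal strings are not less than each other.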

Boolean operators
These are the operators with the lowest precedence. They perform the standard logical operations on their
operands.
In all three cases both A and B must be expressions that evaluate to either {TRUE} or {FALSE}.
Table 8-8 shows the Boolean operators.

Table 8-8 Boolean operators


Operator    Alias    Usage        Explanation
:LAND:      &&       A:LAND:B     Logical AND of A and B
:LEOR:               A:LEOR:B     Logical Exclusive OR of A and B
:LOR:       ||       A:LOR:B      Logical OR of A and B

Operator precedence
The assembler includes an extensive set of operators for use in expressions. Many of the operators resemble their
counterparts in high-level languages such as C.
There is a strict order of precedence in their evaluation:
1. Expressions in parentheses are evaluated first.
2. Operators are applied in precedence order.
3. Adjacent unary operators are evaluated from right to left.
4. Binary operators of equal precedence are evaluated from left to right.

Difference between operator precedence in armasm and C


The assembler order of precedence is not exactly the same as in C.

For example, (1 + 2 :SHR: 3) evaluates as (1 + (2 :SHR: 3)) = 1 in armasm. The equivalent expression in C
evaluates as ((1 + 2) >> 3) = 0.
Use brackets to make the precedence explicit.
If your code contains an expression that would parse differently in C, and you are not using the --unsafe option,
armasm normally gives a warning:
A1466W: Operator precedence means that expression would evaluate differently in C
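
The two groupings can be checked directly in C. In this sketch both are written with explicit brackets so that the intent is unambiguous; the function names are hypothetical.

```c
/* How C groups 1 + 2 >> 3: addition binds tighter than shift. */
int c_grouping(void)
{
    return (1 + 2) >> 3;
}

/* How armasm groups 1 + 2 :SHR: 3: shift binds tighter than addition. */
int armasm_grouping(void)
{
    return 1 + (2 >> 3);
}
```

The first function returns 0 and the second returns 1, which is the difference that warning A1466W draws attention to.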
Table 8-9 shows the order of precedence of operators in armasm, and a comparison with the order in C
(see Table 8-10).
From these tables:
• The highest precedence operators are at the top of the list.
• The highest precedence operators are evaluated first.
• Operators of equal precedence are evaluated from left to right.
Table 8-9 Operator precedence in armasm
armasm precedence            equivalent C operators
unary operators              unary operators
* / :MOD:                    * / %
string manipulation          n/a
:SHL: :SHR: :ROR: :ROL:      << >>
+ - :AND: :OR: :EOR:         + - & | ^
= > >= < <= /= <>            == > >= < <= !=
:LAND: :LOR: :LEOR:          && ||

Table 8-10 Operator precedence in C


C precedence
unary operators
*/%
+ - (as binary operators)
<< >>
< <= > >=
== !=
&
^
Chapter 7
LPC 2148 CPU
Introduction:
The LPC2141/42/44/46/48 are single-chip 16-bit/32-bit microcontrollers with up to 512 kB of flash with ISP/IAP,
a USB 2.0 full-speed device, and a 10-bit ADC and DAC. These microcontrollers are based on a 16-bit/32-bit
ARM7TDMI-S CPU with real-time emulation and embedded trace support, and combine the microcontroller with
embedded high-speed flash memory ranging from 32 kB to 512 kB. A 128-bit wide memory interface and a unique
accelerator architecture enable 32-bit code execution at the maximum clock rate. For critical code size
applications, the alternative 16-bit Thumb mode reduces code by more than 30 % with minimal performance penalty.
Due to their tiny size and low power consumption, the LPC2141/42/44/46/48 are ideal for applications where
miniaturization is a key requirement, such as access control and point-of-sale. Serial communications interfaces
ranging from a USB 2.0 Full-speed device, multiple UARTs, SPI, SSP to I
The salient features of LPC 2148 microcontroller

• 16-bit/32-bit ARM7 microcontroller in a tiny LQFP64 package.
• 8 kB to 40 kB of on-chip static RAM and 32 kB to 512 kB of on-chip flash memory. A 128-bit wide
interface/accelerator enables high-speed 60 MHz operation.
• In-System Programming/In-Application Programming (ISP/IAP) via on-chip boot loader software. Single flash
sector or full chip erase in 400 ms and programming of 256 bytes in 1 ms.
• EmbeddedICE-RT and Embedded Trace interfaces offer real-time debugging with the on-chip RealMonitor
software and high-speed tracing of instruction execution.
• USB 2.0 Full-speed compliant device controller with 2 kB of endpoint RAM. In addition, the LPC2146/48
provides 8 kB of on-chip RAM accessible to USB by DMA.
• One or two 10-bit ADCs provide a total of 6/14 analog inputs, with conversion times as low as 2.44 µs per
channel.
• Single 10-bit DAC provides variable analog output.
• Two 32-bit timers/external event counters (with four capture and four compare channels each), PWM unit (six
outputs) and watchdog.
• Low power Real-Time Clock (RTC) with independent power and 32 kHz clock input.
• Multiple serial interfaces including two UARTs (16C550), two Fast I2C-bus (400 kbit/s), SPI and SSP with
buffering and variable data length capabilities.
• Vectored Interrupt Controller (VIC) with configurable priorities and vector addresses.
• Up to 45 of 5 V tolerant fast general purpose I/O pins in a tiny LQFP64 package.
• Up to 21 external interrupt pins available.
• 60 MHz maximum CPU clock available from programmable on-chip PLL with settling time of 100 µs.
• On-chip integrated oscillator operates with an external crystal from 1 MHz to 25 MHz.
• Power saving modes include Idle and Power-down.
• Individual enable/disable of peripheral functions as well as peripheral clock scaling for additional power
optimization.
• Processor wake-up from Power-down mode via external interrupt or BOD.
• Single power supply chip with POR and BOD circuits: CPU operating voltage range of 3.0 V to 3.6 V
(3.3 V ± 10 %) with 5 V tolerant I/O pads.

Know the function of pins of LPC 2148 microcontroller


Pin No  Symbol                      Type  Description

1       P0.21/PWM5/AD1.6/CAP1.3     I/O   P0.21: General purpose input/output digital pin (GPIO).
                                    O     PWM5: Pulse Width Modulator output 5.
                                    I     AD1.6: ADC 1, input 6.
                                    I     CAP1.3: Capture input for Timer 1, channel 3.
2       P0.22/AD1.7/CAP0.0/MAT0.0   I/O   P0.22: General purpose input/output digital pin (GPIO).
                                    I     AD1.7: ADC 1, input 7.
                                    I     CAP0.0: Capture input for Timer 0, channel 0.
                                    O     MAT0.0: Match output for Timer 0, channel 0.
3       RTCX1                       I     Input to the RTC oscillator circuit.
4       P1.19/TRACEPKT3             I/O   P1.19: General purpose input/output digital pin (GPIO).
                                    O     TRACEPKT3: Trace Packet, bit 3. Standard I/O port
                                          with internal pull-up.
6       VSS                         I     Ground: 0 V reference.
7       VDDA                        I     Analog 3.3 V power supply: This should be nominally the
                                          same voltage as VDD but should be isolated to minimize
                                          noise and error. This voltage is only used to power the
                                          on-chip ADC(s) and DAC.
8       P1.18/TRACEPKT2             I/O   P1.18: General purpose input/output digital pin (GPIO).
                                    O     TRACEPKT2: Trace Packet, bit 2. Standard I/O port
                                          with internal pull-up.
9       P0.25/AD0.4/AOUT            I/O   P0.25: General purpose input/output digital pin (GPIO).
                                    I     AD0.4: ADC 0, input 4.
                                    O     AOUT: DAC output.
10      D+                          I/O   USB bidirectional D+ line.
11      D-                          I/O   USB bidirectional D- line.
12      P1.17/TRACEPKT1             I/O   P1.17: General purpose input/output digital pin (GPIO).
                                    O     TRACEPKT1: Trace Packet, bit 1. Standard I/O port
                                          with internal pull-up.
13      P0.28/AD0.1/CAP0.2/MAT0.2   I/O   P0.28: General purpose input/output digital pin (GPIO).
                                    I     AD0.1: ADC 0, input 1.
                                    I     CAP0.2: Capture input for Timer 0, channel 2.
                                    O     MAT0.2: Match output for Timer 0, channel 2.
14      P0.29/AD0.2/CAP0.3/MAT0.3   I/O   P0.29: General purpose input/output digital pin (GPIO).
                                    I     AD0.2: ADC 0, input 2.
                                    I     CAP0.3: Capture input for Timer 0, channel 3.
                                    O     MAT0.3: Match output for Timer 0, channel 3.
15      P0.30/AD0.3/EINT3/CAP0.0    I/O   P0.30: General purpose input/output digital pin (GPIO).
                                    I     AD0.3: ADC 0, input 3.
                                    I     EINT3: External interrupt 3 input.
                                    I     CAP0.0: Capture input for Timer 0, channel 0.
16      P1.16/TRACEPKT0             I/O   P1.16: General purpose input/output digital pin (GPIO).
                                    O     TRACEPKT0: Trace Packet, bit 0. Standard I/O port
                                          with internal pull-up.
17      P0.31/UP_LED/CONNECT        O     P0.31: General purpose output only digital pin (GPO).
                                    O     UP_LED: USB Good Link LED indicator. It is LOW
                                          when the device is configured (non-control endpoints
                                          enabled). It is HIGH when the device is not configured or
                                          during global suspend.
                                    O     CONNECT: Signal used to switch an external 1.5 kΩ
                                          resistor under software control. Used with the Soft
                                          Connect USB feature.
18      VSS                         I     Ground: 0 V reference.
19      P0.0/TXD0/PWM1              I/O   P0.0: General purpose input/output digital pin (GPIO).
                                    O     TXD0: Transmitter output for UART0.
                                    O     PWM1: Pulse Width Modulator output 1.
20      P1.31/TRST                  I/O   P1.31: General purpose input/output digital pin (GPIO).
                                    I     TRST: Test Reset for JTAG interface.
21      P0.1/RXD0/PWM3/EINT0        I/O   P0.1: General purpose input/output digital pin (GPIO).
                                    I     RXD0: Receiver input for UART0.
                                    O     PWM3: Pulse Width Modulator output 3.
                                    I     EINT0: External interrupt 0 input.
22      P0.2/SCL0/CAP0.0            I/O   P0.2: General purpose input/output digital pin (GPIO).
                                    I/O   SCL0: I2C0 clock input/output. Open-drain output (for
                                          I2C-bus compliance).
                                    I     CAP0.0: Capture input for Timer 0, channel 0.
23      VDD                         I     3.3 V power supply: This is the power supply voltage for
                                          the core and I/O ports.
24      P1.26/RTCK                  I/O   P1.26: General purpose input/output digital pin (GPIO).
                                    I/O   RTCK: Returned Test Clock output. Extra signal added
                                          to the JTAG port. Assists debugger synchronization when
                                          processor frequency varies. Bidirectional pin with internal
                                          pull-up.
25      VSS                         I     Ground: 0 V reference.
26      P0.3/SDA0/MAT0.0/EINT1      I/O   P0.3: General purpose input/output digital pin (GPIO).
                                    I/O   SDA0: I2C0 data input/output. Open-drain output (for
                                          I2C-bus compliance).
                                    O     MAT0.0: Match output for Timer 0, channel 0.
                                    I     EINT1: External interrupt 1 input.
27      P0.4/SCK0/CAP0.1/AD0.6      I/O   P0.4: General purpose input/output digital pin (GPIO).
                                    I/O   SCK0: Serial clock for SPI0. SPI clock output from
                                          master or input to slave.
                                    I     CAP0.1: Capture input for Timer 0, channel 1.
                                    I     AD0.6: ADC 0, input 6.
28      P1.25/EXTIN0                I/O   P1.25: General purpose input/output digital pin (GPIO).
                                    I     EXTIN0: External Trigger Input. Standard I/O with
                                          internal pull-up.
29      P0.5/MISO0/MAT0.1/AD0.7     I/O   P0.5: General purpose input/output digital pin (GPIO).
                                    I/O   MISO0: Master In Slave Out for SPI0. Data input to
                                          SPI master or data output from SPI slave.
                                    O     MAT0.1: Match output for Timer 0, channel 1.
                                    I     AD0.7: ADC 0, input 7.
30      P0.6/MOSI0/CAP0.2/AD1.0     I/O   P0.6: General purpose input/output digital pin (GPIO).
                                    I/O   MOSI0: Master Out Slave In for SPI0. Data output
                                          from SPI master or data input to SPI slave.
                                    I     CAP0.2: Capture input for Timer 0, channel 2.
                                    I     AD1.0: ADC 1, input 0.
31      P0.7/SSEL0/PWM2/EINT2       I/O   P0.7: General purpose input/output digital pin (GPIO).
                                    I     SSEL0: Slave Select for SPI0. Selects the SPI interface
                                          as a slave.
                                    O     PWM2: Pulse Width Modulator output 2.
                                    I     EINT2: External interrupt 2 input.
32      P1.24/TRACECLK              I/O   P1.24: General purpose input/output digital pin (GPIO).
                                    O     TRACECLK: Trace Clock. Standard I/O port with
                                          internal pull-up.
33      P0.8/TXD1/PWM4/AD1.1        I/O   P0.8: General purpose input/output digital pin (GPIO).
                                    O     TXD1: Transmitter output for UART1.
                                    O     PWM4: Pulse Width Modulator output 4.
                                    I     AD1.1: ADC 1, input 1.
34      P0.9/RXD1/PWM6/EINT3        I/O   P0.9: General purpose input/output digital pin (GPIO).
                                    I     RXD1: Receiver input for UART1.
                                    O     PWM6: Pulse Width Modulator output 6.
                                    I     EINT3: External interrupt 3 input.
35      P0.10/RTS1/CAP1.0/AD1.2     I/O   P0.10: General purpose input/output digital pin (GPIO).
                                    O     RTS1: Request to Send output for UART1.
                                    I     CAP1.0: Capture input for Timer 1, channel 0.
                                    I     AD1.2: ADC 1, input 2.
36      P1.23/PIPESTAT2             I/O   P1.23: General purpose input/output digital pin (GPIO).
                                    O     PIPESTAT2: Pipeline Status, bit 2. Standard I/O port
                                          with internal pull-up.
37      P0.11/CTS1/CAP1.1/SCL1      I/O   P0.11: General purpose input/output digital pin (GPIO).
                                    I     CTS1: Clear to Send input for UART1.
                                    I     CAP1.1: Capture input for Timer 1, channel 1.
                                    I/O   SCL1: I2C1 clock input/output. Open-drain output (for
                                          I2C-bus compliance).
38      P0.12/DSR1/MAT1.0/AD1.3     I/O   P0.12: General purpose input/output digital pin (GPIO).
                                    I     DSR1: Data Set Ready input for UART1.
                                    O     MAT1.0: Match output for Timer 1, channel 0.
                                    I     AD1.3: ADC 1, input 3.
39      P0.13/DTR1/MAT1.1/AD1.4     I/O   P0.13: General purpose input/output digital pin (GPIO).
                                    O     DTR1: Data Terminal Ready output for UART1.
                                    O     MAT1.1: Match output for Timer 1, channel 1.
                                    I     AD1.4: ADC 1, input 4.
40      P1.22/PIPESTAT1             I/O   P1.22: General purpose input/output digital pin (GPIO).
                                    O     PIPESTAT1: Pipeline Status, bit 1. Standard I/O port
                                          with internal pull-up.
41      P0.14/DCD1/EINT1/SDA1       I/O   P0.14: General purpose input/output digital pin (GPIO).
                                    I     DCD1: Data Carrier Detect input for UART1.
                                    I     EINT1: External interrupt 1 input.
                                    I/O   SDA1: I2C1 data input/output. Open-drain output (for
                                          I2C-bus compliance).
42      VSS                         I     Ground: 0 V reference.
43      VDD                         I     3.3 V power supply: This is the power supply voltage for
                                          the core and I/O ports.
44      P1.21/PIPESTAT0             I/O   P1.21: General purpose input/output digital pin (GPIO).
                                    O     PIPESTAT0: Pipeline Status, bit 0. Standard I/O port
                                          with internal pull-up.
45      P0.15/RI1/EINT2/AD1.5       I/O   P0.15: General purpose input/output digital pin (GPIO).
                                    I     RI1: Ring Indicator input for UART1.
                                    I     EINT2: External interrupt 2 input.
                                    I     AD1.5: ADC 1, input 5.
46      P0.16/EINT0/MAT0.2/CAP0.2   I/O   P0.16: General purpose input/output digital pin (GPIO).
                                    I     EINT0: External interrupt 0 input.
                                    O     MAT0.2: Match output for Timer 0, channel 2.
                                    I     CAP0.2: Capture input for Timer 0, channel 2.
47      P0.17/CAP1.2/SCK1/MAT1.2    I/O   P0.17: General purpose input/output digital pin (GPIO).
                                    I     CAP1.2: Capture input for Timer 1, channel 2.
                                    I/O   SCK1: Serial Clock for SSP. Clock output from master
                                          or input to slave.
                                    O     MAT1.2: Match output for Timer 1, channel 2.
48      P1.20/TRACESYNC             I/O   P1.20: General purpose input/output digital pin (GPIO).
                                    O     TRACESYNC: Trace Synchronization. Standard I/O
                                          port with internal pull-up.
49      VBAT                        I     RTC power supply: 3.3 V on this pin supplies the power
                                          to the RTC.
50      VSS                         I     Ground: 0 V reference.
51      VDD                         I     3.3 V power supply: This is the power supply voltage
                                          for the core and I/O ports.
52      P1.30/TMS                   I/O   P1.30: General purpose input/output digital pin (GPIO).
                                    I     TMS: Test Mode Select for JTAG interface.
53      P0.18/CAP1.3/MISO1/MAT1.3   I/O   P0.18: General purpose input/output digital pin (GPIO).
                                    I     CAP1.3: Capture input for Timer 1, channel 3.
                                    I/O   MISO1: Master In Slave Out for SSP. Data input to SSP
                                          master or data output from SSP slave.
                                    O     MAT1.3: Match output for Timer 1, channel 3.
54      P0.19/MAT1.2/MOSI1/CAP1.2   I/O   P0.19: General purpose input/output digital pin (GPIO).
                                    O     MAT1.2: Match output for Timer 1, channel 2.
                                    I/O   MOSI1: Master Out Slave In for SSP. Data output from
                                          SSP master or data input to SSP slave.
                                    I     CAP1.2: Capture input for Timer 1, channel 2.
55      P0.20/MAT1.3/SSEL1/EINT3    I/O   P0.20: General purpose input/output digital pin (GPIO).
                                    O     MAT1.3: Match output for Timer 1, channel 3.
                                    I     SSEL1: Slave Select for SSP. Selects the SSP interface
                                          as a slave.
                                    I     EINT3: External interrupt 3 input.
56      P1.29/TCK                   I/O   P1.29: General purpose input/output digital pin (GPIO).
                                    I     TCK: Test Clock for JTAG interface.
57      RESET                       I     External reset input: A LOW on this pin resets the device,
                                          causing I/O ports and peripherals to take on their default
                                          states, and processor execution to begin at address 0. TTL
                                          with hysteresis, 5 V tolerant.
58      P0.23/VBUS                  I/O   P0.23: General purpose input/output digital pin (GPIO).
                                    I     VBUS: Indicates the presence of USB bus power.
                                          Note: This signal must be HIGH for USB reset to occur.
59      VSSA                        I     Analog ground: 0 V reference. This should nominally be
                                          the same voltage as VSS, but should be isolated to
                                          minimize noise and error.
60      P1.28/TDI                   I/O   P1.28: General purpose input/output digital pin (GPIO).
                                    I     TDI: Test Data in for JTAG interface.
61      XTAL2                       O     Output from the oscillator amplifier.
62      XTAL1                       I     Input to the oscillator circuit and internal clock generator
                                          circuits.
63      VREF                        I     ADC reference: This should be nominally less than or
                                          equal to the VDD voltage but should be isolated to
                                          minimize noise and error. Level on this pin is used as a
                                          reference for ADC(s) and DAC.
64      P1.27/TDO                   I/O   P1.27: General purpose input/output digital pin (GPIO).
                                    O     TDO: Test Data out for JTAG interface.

Function of different blocks of LPC2148


On-chip flash program memory:
The LPC2148 incorporates a 512 kB flash memory system. This memory may be used for both code
and data storage. Programming of the flash memory may be accomplished in several ways. It may be programmed
In System via the serial port. The application program may also erase and/or program the flash while the
application is running, allowing a great degree of flexibility for data storage, field firmware upgrades, etc.
Because of the architectural solution chosen for the on-chip boot loader, the flash memory available for user
code on the LPC2148 is 500 kB. The flash program memory provides a minimum of 100,000 erase/write cycles and
20 years of data retention.
On-chip Static RAM memory:
On-chip static RAM may be used for code and/or data storage. The SRAM may be accessed as 8-bit, 16-bit, and
32-bit. The LPC2148 provides 32 kB of static RAM. In addition, an 8 kB SRAM block intended to be
utilized mainly by the USB can also be used as general purpose RAM for data storage and for code storage and
execution.
Interrupt controller:
The Vectored Interrupt Controller (VIC) accepts all of the interrupt request inputs and categorizes them as Fast
Interrupt Request (FIQ), vectored Interrupt Request (IRQ), and non-vectored IRQ. The programmable assignment
scheme means that the priorities of interrupts from the various peripherals can be dynamically assigned and
adjusted. Fast interrupt
request (FIQ) has the highest priority. If more than one request is assigned to FIQ, the VIC combines the requests
to produce the FIQ signal to the ARM processor. The fastest possible FIQ latency is achieved when only one
request is classified as FIQ, because then the FIQ service routine does not need to branch into the interrupt service
routine but can run from the interrupt vector location. If more than one request is assigned to the FIQ class, the
FIQ service routine will read a word from the VIC that identifies which FIQ source(s) is (are) requesting an
interrupt. Vectored IRQs have the middle priority. Sixteen of the interrupt requests can be assigned to this
category. Any of the interrupt requests can be assigned to any of the 16 vectored IRQ slots, among which slot 0
has the highest priority and slot 15 has the lowest. Non-vectored IRQs have the lowest priority. The VIC
combines the requests from all the vectored and non-vectored IRQs to produce the IRQ signal to the ARM
processor. The IRQ service routine can start by reading a register from the VIC and jumping to the address it
contains. If any of the vectored IRQs are pending, the VIC provides the address of the highest-priority
requesting IRQ's service routine; otherwise it provides the address of a default routine that is shared by all
the non-vectored IRQs. The default
routine can read another VIC register to see what IRQs are active.
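
The priority scheme described above can be modelled in software. The following C sketch is a simplified host-side model, not the VIC register interface; the type vic_model and all function and field names are hypothetical, and source numbers are assumed to be in the range 0 to 31.

```c
#include <stdint.h>

#define VIC_SLOTS 16

typedef struct {
    uint32_t pending;                   /* bit n set: source n is requesting */
    uint32_t fiq_select;                /* bit n set: source n is classified as FIQ */
    int      slot_source[VIC_SLOTS];    /* source assigned to each slot, -1 if unused */
    uint32_t slot_handler[VIC_SLOTS];   /* service routine address for each slot */
    uint32_t default_handler;           /* shared by all non-vectored IRQs */
} vic_model;

/* Start with no requests and no vectored slots in use. */
void vic_init(vic_model *v, uint32_t default_handler)
{
    v->pending = 0;
    v->fiq_select = 0;
    v->default_handler = default_handler;
    for (int s = 0; s < VIC_SLOTS; ++s) {
        v->slot_source[s] = -1;
        v->slot_handler[s] = 0;
    }
}

/* FIQ is asserted when any source classified as FIQ is requesting. */
int vic_fiq_asserted(const vic_model *v)
{
    return (v->pending & v->fiq_select) != 0;
}

/* Address an IRQ service routine would read from the model: the handler
   of the lowest-numbered slot with a pending source (slot 0 has the
   highest priority), else the default handler. Returns 0 if no IRQ
   is pending. */
uint32_t vic_irq_vector(const vic_model *v)
{
    uint32_t irq = v->pending & ~v->fiq_select;
    if (irq == 0)
        return 0;
    for (int s = 0; s < VIC_SLOTS; ++s) {
        int src = v->slot_source[s];
        if (src >= 0 && (irq & (1u << src)))
            return v->slot_handler[s];
    }
    return v->default_handler;
}
```

The loop order is what encodes the priority: a pending source in slot 0 always wins over one in slot 15, and a request that is not assigned to any slot falls through to the default routine.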

Pin connect block:


The pin connect block allows selected pins of the microcontroller to have more than one function. Configuration
registers control the multiplexers to allow connection between the pin and the on-chip peripherals. Peripherals
should be connected to the appropriate pins prior to being activated, and prior to any related interrupt(s) being
enabled. Activity of any enabled peripheral function that is not mapped to a related pin should be considered
undefined. The Pin Control Module with its pin select registers defines the functionality of the microcontroller in
a given hardware environment. After reset all pins of Port 0 and 1 are configured as input with the following
exceptions: If debug is enabled, the JTAG pins will assume their JTAG functionality; if trace is enabled, the Trace
pins will assume their trace functionality. The pins associated with the I2C0 and I2C1 interface are open drain.
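
As an illustration of how a pin select register might be updated, the sketch below assumes each pin is controlled by a 2-bit field, with 16 pins per 32-bit register. That layout is consistent with the pin table above, where no pin has more than four functions, but it is an assumption here and should be checked against the user manual. The function name pinsel_set and the function codes are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return reg with the 2-bit function field of `pin` (0 to 15 within
   this register) replaced by `func` (0 to 3). Other fields are kept. */
uint32_t pinsel_set(uint32_t reg, unsigned pin, unsigned func)
{
    assert(pin < 16u && func < 4u);
    unsigned shift = pin * 2u;
    return (reg & ~(3u << shift)) | ((uint32_t)func << shift);
}
```

Working on a copy of the register value and writing it back in one step keeps the other pins' selections untouched, which matters because peripherals should be connected to their pins before being activated.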
DAC (Digital to Analog Converter):
The DAC enables the LPC2148 to generate a variable analog output. The maximum DAC output voltage is the
VREF voltage.
• 10-bit DAC.
• Buffered output.
• Power-down mode available.
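
The relationship between a 10-bit DAC code and the output voltage can be sketched as follows, assuming an ideal linear transfer of code × VREF / 1024. That formula is an assumption of this sketch, and the function name dac_output_mv is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Ideal output in millivolts for a 10-bit code (0 to 1023), assuming
   a linear transfer of code * VREF / 1024. */
uint32_t dac_output_mv(uint32_t code, uint32_t vref_mv)
{
    assert(code < 1024u);
    return code * vref_mv / 1024u;
}
```

With VREF at 3300 mV, code 512 gives 1650 mV and the largest code, 1023, gives 3296 mV, just under VREF.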
ADC (Analog to Digital Converter):
The LPC2148 contains two analog to digital converters. Each is a single 10-bit successive
approximation analog to digital converter. ADC0 has six channels and ADC1 has eight channels, so the
total number of available ADC inputs for the LPC2148 is 14.
Features
• 10-bit successive approximation analog to digital converter.
• Measurement range of 0 V to VREF (2.0 V ≤ VREF ≤ VDDA).
• Each converter capable of performing more than 400,000 10-bit samples per second.
• Every analog input has a dedicated result register to reduce interrupt overhead.
• Burst conversion mode for single or multiple inputs.
• Optional conversion on transition on input pin or timer match signal.
• Global Start command for both converters.
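
Two of the figures above can be checked with simple arithmetic: a conversion time of 2.44 µs corresponds to roughly 409,836 conversions per second, which agrees with "more than 400,000", and a 10-bit result divides the 0 V to VREF range into 1024 steps. The C sketch below assumes an ideal converter; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Conversions per second for a given conversion time in nanoseconds. */
uint32_t adc_rate_from_ns(uint32_t t_conv_ns)
{
    assert(t_conv_ns > 0u);
    return 1000000000u / t_conv_ns;
}

/* Ideal 10-bit result for an input voltage, clamped to the top code
   when the input reaches VREF. */
uint32_t adc_expected_code(uint32_t vin_mv, uint32_t vref_mv)
{
    assert(vref_mv > 0u);
    if (vin_mv >= vref_mv)
        return 1023u;
    return vin_mv * 1024u / vref_mv;
}
```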
USB 2.0 device controller:
The USB is a 4-wire serial bus that supports communication between a host and a number (127 max) of
peripherals. The host controller allocates the USB bandwidth to attached devices through a token based protocol.
The bus supports hot plugging, unplugging, and dynamic configuration of the devices. All transactions are
initiated by the host controller. The LPC2148 is equipped with a USB device controller that enables 12 Mbit/s
data exchange with a USB host controller. It consists of a register interface, serial interface engine, endpoint
buffer memory and DMA controller. The serial interface engine decodes the USB data stream and writes data to
the appropriate endpoint buffer memory. The status of a completed USB transfer or error condition is indicated
via status registers. An interrupt is also generated if enabled. A DMA controller can transfer data between an
endpoint buffer and the USB RAM.
Features
• Fully compliant with USB 2.0 Full-speed specification.
• Supports 32 physical (16 logical) endpoints.
• Supports control, bulk, interrupt and isochronous endpoints.
• Scalable realization of endpoints at run time.
• Endpoint maximum packet size selection (up to USB maximum specification) by
software at run time.
• RAM message buffer size based on endpoint realization and maximum packet size.
• Supports SoftConnect and GoodLink LED indicator. These two functions are sharing one pin.
• Supports bus-powered capability with low suspend current.
• Supports DMA transfer on all non-control endpoints.
• One duplex DMA channel serves all endpoints.
• Allows dynamic switching between CPU controlled and DMA modes.
• Double buffer implementation for bulk and isochronous endpoints.
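
The figure of 32 physical endpoints for 16 logical endpoints follows from each logical endpoint having an OUT and an IN direction. The sketch below assumes the common numbering in which the physical index is twice the logical number, plus one for IN; treat that mapping as an assumption to be confirmed in the user manual. The names are hypothetical.

```c
#include <assert.h>

typedef enum { EP_OUT = 0, EP_IN = 1 } ep_dir;

/* Physical endpoint index for a logical endpoint (0 to 15) and direction,
   assuming physical = 2 * logical + (IN ? 1 : 0). */
unsigned usb_physical_ep(unsigned logical, ep_dir dir)
{
    assert(logical < 16u);
    return logical * 2u + (unsigned)dir;
}
```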
UARTs
The LPC2148 contains two UARTs. In addition to standard transmit and receive data lines, UART1 also
provides a full modem control handshake interface. Compared to previous LPC2000
microcontrollers, the UARTs in the LPC2148 introduce a fractional baud rate generator, enabling the
microcontroller to achieve standard baud rates such as 115200 with any crystal frequency above 2 MHz. In
addition, auto-CTS/RTS flow-control functions are fully implemented in hardware.
Features
• 16 byte Receive and Transmit FIFOs.
• Register locations conform to ‘550 industry standard.
• Receiver FIFO trigger points at 1, 4, 8, and 14 bytes.
• Built-in fractional baud rate generator covering wide range of baud rates without a need for external crystals of
particular values.
• Transmission FIFO control enables implementation of software (XON/XOFF) flow control on both UARTs.
• UART1 is equipped with standard modem interface signals. This module also provides full support for
hardware flow control (auto-CTS/RTS).
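
As a rough illustration of what the fractional baud rate generator offers, the sketch below evaluates the baud rate relation given in the LPC214x user manual, baud = PCLK / (16 x DL x (1 + DivAddVal / MulVal)), where DL is the 16-bit divisor latch value. The function name uart_baud and the 12 MHz PCLK used in the note after it are assumptions made only for this example.

```c
#include <stdint.h>

/* Hypothetical helper (not part of any vendor library).
 * Returns the baud rate, rounded down, for a given divider setting.
 *   pclk_hz     : peripheral clock in Hz
 *   dl          : divisor latch value (256 * DLM + DLL), must be >= 1
 *   div_add_val : fractional divider numerator (0 disables the fraction)
 *   mul_val     : fractional divider denominator, must be >= 1
 */
uint32_t uart_baud(uint32_t pclk_hz, uint32_t dl,
                   uint32_t div_add_val, uint32_t mul_val)
{
    /* baud = PCLK / (16 * DL * (1 + DivAddVal / MulVal))
     *      = PCLK * MulVal / (16 * DL * (MulVal + DivAddVal)) */
    uint64_t num = (uint64_t)pclk_hz * mul_val;
    uint64_t den = 16ULL * dl * (mul_val + div_add_val);
    return (uint32_t)(num / den);
}
```

With an assumed PCLK of 12 MHz, an integer divisor of 6 gives 125000 baud, which is about 8.5 % away from 115200. A divisor of 4 combined with a fraction of 5/8 gives about 115384 baud, which is within about 0.2 % of the target.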
I2C-bus serial I/O controller
The LPC2148 contains two I2C-bus controllers. The I2C-bus is bidirectional, for inter-IC control using only
two wires: a serial clock line (SCL), and a serial data line (SDA). Each device is recognized by a unique address
and can operate as either a receiver-only device (e.g., an LCD driver) or a transmitter with the capability to both
receive and send information (such as memory). Transmitters and/or receivers can operate in either master or
slave mode, depending on whether the chip has to initiate a data transfer or is only addressed. The I2C-bus is a
multi-master bus; it can be controlled by more than one bus master connected to it. The I2C-bus implemented in
the LPC2148 supports bit rates up to 400 kbit/s (Fast I2C-bus).
Features
• Compliant with standard I2C-bus interface.
• Easy to configure as master, slave, or master/slave.
• Programmable clocks allow versatile rate control.
• Bidirectional data transfer between masters and slaves.
• Multi-master bus (no central master).
• Arbitration between simultaneously transmitting masters without corruption of serial data on the bus.
• Serial clock synchronization allows devices with different bit rates to communicate via one serial bus.
• Serial clock synchronization can be used as a handshake mechanism to suspend and resume serial transfer.
• The I2C-bus can be used for test and diagnostic purposes.
SPI serial I/O controller
The LPC2148 contains one SPI controller. The SPI is a full duplex serial interface, designed to handle
multiple masters and slaves connected to a given bus. Only a single master and a single slave can communicate on
the interface during a given data transfer. During a data transfer the master always sends a byte of data to the
slave, and the slave always sends a byte of data to the master.
Features
• Compliant with Serial Peripheral Interface (SPI) specification.
• Synchronous, Serial, Full Duplex, Communication.
• Combined SPI master and slave.
• Maximum data bit rate of one eighth of the input clock rate.
SSP serial I/O controller
The LPC2148 contains one SSP. The SSP controller is capable of operation on a SPI, 4-wire SSI, or Microwire
bus. It can interact with multiple masters and slaves on the bus. However, only a single master and a single
slave can communicate on the bus during a given data transfer. The SSP supports full duplex transfers, with data
frames of 4 bits to 16 bits of data flowing from the master to the slave and from the slave to the master. Often only
one of these data flows carries meaningful data.
Features
• Compatible with Motorola’s SPI, TI’s 4-wire SSI and National Semiconductor’s Microwire buses.
• Synchronous serial communication.
• Master or slave operation.
• 8-frame FIFOs for both transmit and receive.
• Four bits to 16 bits per frame.
General purpose timers/external event counters
The Timer/Counter is designed to count cycles of the peripheral clock (PCLK) or an externally supplied clock and
optionally generate interrupts or perform other actions at specified timer values, based on four match registers. It
also includes four capture inputs to trap the timer value when an input signal transitions, optionally generating an
interrupt. Multiple pins can be selected to perform a single capture or match function, providing an application
with ‘or’ and ‘and’, as well as ‘broadcast’ functions among them. The LPC2148 can count external events on one
of the capture inputs if the minimum external pulse is equal to or longer than a period of the PCLK. In this
configuration, unused capture lines can be selected as regular timer capture inputs, or used as external interrupts.
Features
• A 32-bit timer/counter with a programmable 32-bit prescaler.
• External event counter or timer operation.
• Four 32-bit capture channels per timer/counter that can take a snapshot of the timer value when an input signal
transitions. A capture event may also optionally generate an interrupt.
• Four 32-bit match registers that allow:
– Continuous operation with optional interrupt generation on match.
– Stop timer on match with optional interrupt generation.
– Reset timer on match with optional interrupt generation.
• Four external outputs per timer/counter corresponding to match registers, with the following capabilities:
– Set LOW on match.
– Set HIGH on match.
– Toggle on match.
– Do nothing on match.
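
The relationship between PCLK, the prescaler and a match register can be sketched as follows. This is a minimal sketch, assuming (as described in the LPC214x user manual) that the timer counter advances once every PR + 1 cycles of PCLK, where PR is the prescale value. The function names and the example clock values are hypothetical.

```c
#include <stdint.h>

/* Prescale value that makes the timer counter advance at tick_hz.
 * Assumes tick_hz is non-zero and divides pclk_hz evenly. */
uint32_t timer_prescale_for_tick(uint32_t pclk_hz, uint32_t tick_hz)
{
    return pclk_hz / tick_hz - 1u;
}

/* Number of timer ticks that elapse in delay_us microseconds.
 * Loading this into a match register makes a timer started from 0
 * reach the match value after the requested delay. */
uint32_t timer_ticks_for_us(uint32_t tick_hz, uint32_t delay_us)
{
    return (uint32_t)(((uint64_t)tick_hz * delay_us) / 1000000u);
}
```

For example, with an assumed PCLK of 15 MHz, a prescale value of 14 yields a 1 MHz tick, and a match value of 1000 then corresponds to 1 ms.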
Watchdog timer
The purpose of the watchdog is to reset the microcontroller within a reasonable amount of time if it enters an
erroneous state. When enabled, the watchdog will generate a system reset if the user program fails to ‘feed’ (or
reload) the watchdog within a predetermined amount of time.
Features
• Internally resets chip if not periodically reloaded.
• Debug mode.
• Enabled by software but requires a hardware reset or a watchdog reset/interrupt to be disabled.
• Incorrect/Incomplete feed sequence causes reset/interrupt if enabled.
• Flag to indicate watchdog reset.
• Programmable 32-bit timer with internal pre-scaler.
• Selectable time
Real-time clock
The RTC is designed to provide a set of counters to measure time when normal or idle operating mode is selected.
The RTC has been designed to use little power, making it suitable for battery powered systems where the CPU is
not running continuously (Idle mode).
Features
• Measures the passage of time to maintain a calendar and clock.
• Ultra-low power design to support battery powered systems.
• Provides Seconds, Minutes, Hours, Day of Month, Month, Year, Day of Week, and Day of Year.
• Can use either the RTC dedicated 32 kHz oscillator input or clock derived from the external crystal/oscillator
input at XTAL1. Programmable reference clock divider allows fine adjustment of the RTC.
• Dedicated power supply pin can be connected to a battery or the main 3.3 V.
Pulse width modulator
The PWM is based on the standard timer block and inherits all of its features, although only the PWM function is
pinned out on the LPC2148. The timer is designed to count cycles of the peripheral clock (PCLK) and optionally
generate interrupts or perform other actions when specified timer values occur, based on seven match registers.
The PWM function is also based on match register events. The ability to separately control rising and falling edge
locations allows the PWM to be used for more applications. For instance, multi-phase motor control typically
requires three non-overlapping PWM outputs with individual control of all three pulse widths and positions.
Two match registers can be used to provide a single edge controlled PWM output. One match register (MR0)
controls the PWM cycle rate, by resetting the count upon match. The other match register controls the PWM edge
position. Additional single edge controlled PWM outputs require only one match register each, since the
repetition rate is the same for all PWM outputs. Multiple single edge controlled PWM outputs will all have a
rising edge at the beginning of each PWM cycle, when an MR0 match occurs. Three match registers can be used
to provide a PWM output with both edges controlled. Again, the MR0 match register controls the PWM cycle
rate. The other match registers control the two PWM edge positions. Additional double edge controlled PWM
outputs require only two match registers each, since the repetition rate is the same for all PWM outputs. With
double edge controlled PWM outputs, specific match registers control the rising and falling edge of the output.
This allows both positive going PWM pulses (when the rising edge occurs prior to the falling edge), and negative
going PWM pulses (when the falling edge occurs prior to the rising edge).
Features
• Seven match registers allow up to six single edge controlled or three double edge controlled PWM outputs, or a
mix of both types.
• The match registers also allow:
– Continuous operation with optional interrupt generation on match.
– Stop timer on match with optional interrupt generation.
– Reset timer on match with optional interrupt generation.
• Supports single edge controlled and/or double edge controlled PWM outputs. Single edge controlled PWM
outputs all go HIGH at the beginning of each cycle unless the output is a constant LOW. Double edge controlled
PWM outputs can have either edge occur at any position within a cycle. This allows for both positive going and
negative going pulses.
• Pulse period and width can be any number of timer counts. This allows complete flexibility in the trade-off
between resolution and repetition rate. All PWM outputs will occur at the same repetition rate.
• Double edge controlled PWM outputs can be programmed to be either positive going or negative going pulses.
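
The edge rules described above can be expressed as a small software model. This is only an illustrative sketch of the behaviour stated in the text, not of the hardware: the function names are hypothetical, the counter is assumed to run from 0 up to the MR0 value, and the two edge positions of a double edge output are assumed to differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Single edge controlled output: goes HIGH at the start of each cycle
 * and falls when the counter reaches the match value mr_fall.
 * A match value of 0 gives a constant LOW output. */
bool pwm_single_edge_level(uint32_t count, uint32_t mr_fall)
{
    return count < mr_fall;
}

/* Double edge controlled output: one match register places the rising
 * edge and another places the falling edge. */
bool pwm_double_edge_level(uint32_t count, uint32_t mr_rise, uint32_t mr_fall)
{
    if (mr_rise < mr_fall) {
        /* Positive going pulse: HIGH between the rising and falling edge. */
        return count >= mr_rise && count < mr_fall;
    }
    /* Negative going pulse: LOW between the falling and rising edge. */
    return !(count >= mr_fall && count < mr_rise);
}
```

With a cycle of 100 counts, a single edge output with a match value of 25 is HIGH for the first quarter of each cycle. A double edge output with the rising edge at 20 and the falling edge at 60 is HIGH for counts 20 through 59.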
PLL
The PLL accepts an input clock frequency in the range of 10 MHz to 25 MHz. The input frequency is multiplied
up into the range of 10 MHz to 60 MHz with a Current Controlled Oscillator (CCO). The multiplier can be an
integer value from 1 to 32 (in practice, the multiplier value cannot be higher than 6 on this family of
microcontrollers due to the upper frequency limit of the CPU). The CCO operates in the range of 156 MHz to 320
MHz, so there is an additional divider in the loop to keep the CCO within its frequency range while the PLL is
providing the desired output frequency. The output divider may be set to divide by 2, 4, 8, or 16 to produce the
output clock. Since the minimum output divider value is 2, it is ensured that the PLL output has a 50 % duty cycle.
The PLL is turned off and bypassed following a chip reset and may be enabled by software. The program must
configure and activate the PLL, wait for the PLL to Lock, then connect to the PLL as a clock source. The PLL
settling time is 100 μs.
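
The limits quoted above are enough to work out a valid multiplier and output divider for a given crystal. The following is a minimal sketch under those limits only; the function name pll_choose and its interface are hypothetical, and the target clock is assumed to be an exact integer multiple of the input clock.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Picks a multiplier (1..32) and an output divider (2, 4, 8 or 16) so
 * that the CCO frequency (output frequency times divider) lies between
 * 156 MHz and 320 MHz. Returns false if no valid setting exists. */
bool pll_choose(uint32_t fosc_hz, uint32_t cclk_hz,
                uint32_t *mult, uint32_t *divider)
{
    static const uint32_t dividers[] = { 2u, 4u, 8u, 16u };

    if (fosc_hz < 10000000u || fosc_hz > 25000000u) return false;
    if (cclk_hz < 10000000u || cclk_hz > 60000000u) return false;
    if (cclk_hz % fosc_hz != 0u) return false;

    uint32_t m = cclk_hz / fosc_hz;
    if (m < 1u || m > 32u) return false;

    for (size_t i = 0; i < sizeof dividers / sizeof dividers[0]; i++) {
        uint32_t fcco = cclk_hz * dividers[i];   /* at most 960 MHz, fits */
        if (fcco >= 156000000u && fcco <= 320000000u) {
            *mult = m;
            *divider = dividers[i];
            return true;
        }
    }
    return false;
}
```

For a 12 MHz crystal and a 60 MHz target, this sketch selects a multiplier of 5 and an output divider of 4, which places the CCO at 240 MHz.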
Brownout detector
The LPC2148 includes 2-stage monitoring of the voltage on the VDD pins. If this voltage falls below 2.9 V, the
BOD asserts an interrupt signal to the VIC. This signal can be enabled for interrupt; if not, software can monitor
the signal by reading a dedicated register.
The second stage of low voltage detection asserts reset to inactivate the LPC2148 when the voltage on the VDD
pins falls below 2.6 V. This reset prevents alteration of the flash as operation of the various elements of the chip
would otherwise become unreliable due to low voltage. The BOD circuit maintains this reset down below 1 V, at
which point the POR circuitry maintains the overall reset. Both the 2.9 V and 2.6 V thresholds include some
hysteresis. In normal operation, this hysteresis allows the 2.9 V detection to reliably interrupt, or a
regularly-executed event loop to sense the condition.
Reset and wake-up timer
Reset has two sources on the LPC2148: the RESET pin and watchdog reset. The RESET pin is a Schmitt trigger
input pin with an additional glitch filter. Assertion of chip reset by any source starts the Wake-up Timer (see
Wake-up Timer description below), causing the internal chip reset to remain asserted until the external reset is
de-asserted, the oscillator is running, a fixed number of clocks have passed, and the on-chip flash controller has
completed its initialization. When the internal reset is removed, the processor begins executing at address 0,
which is the reset vector. At that point, all of the processor and peripheral registers have been initialized to
predetermined values. The Wake-up Timer ensures that the oscillator and other analog functions required for chip
operation are fully functional before the processor is allowed to execute instructions. This is important at power
on, all types of reset, and whenever any of the aforementioned functions are turned off for any reason. Since the
oscillator and other functions are turned off during Power-down mode, any wake-up of the processor from
Power-down mode makes use of the Wake-up Timer. The Wake-up Timer monitors the crystal oscillator as the
means of checking whether it is safe to begin code execution. When power is applied to the chip, or some event
caused the chip to exit Power-down mode, some time is required for the oscillator to produce a signal of sufficient
amplitude to drive the clock logic. The amount of time depends on many factors, including the rate of VDD ramp
(in the case of power on), the type of crystal and its electrical characteristics (if a quartz crystal is used), as well as
any other external circuitry (e.g. capacitors), and the characteristics of the oscillator itself under the existing
ambient conditions.
Code security
This feature of the LPC2148 allows an application to control whether it can be debugged or is protected from
observation. If, after reset, the on-chip boot loader detects a valid checksum in flash and reads 0x8765 4321 from
address 0x1FC in flash, debugging will be disabled and thus the code in flash will be protected from observation.
Once debugging is disabled, it can be enabled only by performing a full chip erase using the ISP.
External interrupt inputs
The LPC2148 includes up to nine edge or level sensitive External Interrupt Inputs as selectable pin functions.
When the pins are combined, external events can be processed as four independent interrupt signals. The External
Interrupt Inputs can optionally be used to wake up the processor from Power-down mode. Additionally, capture
input pins can also be used as external interrupts, without the option to wake the device up from Power-down
mode.
Memory mapping control
The Memory Mapping Control alters the mapping of the interrupt vectors that appear beginning at address
0x0000 0000. Vectors may be mapped to the bottom of the on-chip flash memory, or to the on-chip static RAM.
This allows code running in different memory spaces to have control of the interrupts.
Power control
The LPC2148 supports two reduced power modes: Idle mode and Power-down mode. In Idle mode, execution of
instructions is suspended until either a reset or interrupt occurs. Peripheral functions continue operation during
Idle mode and may generate interrupts to cause the processor to resume execution. Idle mode eliminates power
used by the processor itself, memory systems and related controllers, and internal buses. In Power-down mode,
the oscillator is shut down and the chip receives no internal clocks. The processor state and registers, peripheral
registers, and internal SRAM values are preserved throughout Power-down mode and the logic levels of chip
output pins remain static. The Power-down mode can be terminated and normal operation resumed by either a
reset or certain specific interrupts that are able to function without clocks. Since all dynamic operation of the chip
is suspended, Power-down mode reduces chip power consumption to nearly zero. Selecting an external 32 kHz
clock instead of the PCLK as a clock-source for the on-chip RTC will enable the microcontroller to have the RTC
active during Power-down mode. Power-down current is increased with RTC active. However, it is significantly
lower than in Idle mode. A Power Control for Peripherals feature allows individual peripherals to be turned off if
they are not needed in the application, resulting in additional power savings during active and idle mode.
VPB bus
The VPB divider determines the relationship between the processor clock (CCLK) and the clock used by
peripheral devices (PCLK). The VPB divider serves two purposes. The first is to provide peripherals with the
desired PCLK via VPB bus so that they can operate at the speed chosen for the ARM processor. In order to
achieve this, the VPB bus may be slowed down to 1⁄2 to 1⁄4 of the processor clock rate. Because the VPB bus must
work properly at power-up (and its timing cannot be altered if it does not work since the VPB divider control
registers reside on the VPB bus), the default condition at reset is for the VPB bus to run at 1⁄4 of the processor
clock rate. The second purpose of the VPB divider is to allow power savings when an application does not require
any peripherals to run at the full processor rate. Because the VPB divider is connected to the PLL output, the PLL
remains active (if it was running) during Idle mode.
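
The effect of the divider on PCLK can be sketched as below. The 2-bit encoding used here (00 for one quarter, 01 for the full rate, 10 for one half, 11 reserved) is taken from the LPC214x user manual rather than from the text above, and the function name is hypothetical.

```c
#include <stdint.h>

/* Peripheral clock for a given processor clock and 2-bit VPB divider
 * setting. Returns 0 for the reserved encoding. */
uint32_t pclk_from_vpbdiv(uint32_t cclk_hz, unsigned vpbdiv)
{
    switch (vpbdiv & 3u) {
    case 0u: return cclk_hz / 4u;   /* default after reset */
    case 1u: return cclk_hz;        /* same rate as the processor */
    case 2u: return cclk_hz / 2u;
    default: return 0u;             /* 11 is reserved */
    }
}
```

With a 60 MHz processor clock, the reset default therefore gives a PCLK of 15 MHz.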
CH8
LPC 2148 PERIPHERALS
Introduction:

This chapter explains the LPC2148 peripherals in detail. It covers the Pin connect block, the different GPIO
registers, and the working of the ADC and DAC with an example program. It also looks into the details of the
RTC, PWM, SPI and general purpose timers/external event counters.

Pin connect block

Allows individual pin configuration. The purpose of the Pin connect block is to configure the microcontroller pins
to the desired functions.

Description

The pin connect block allows selected pins of the microcontroller to have more than one function. Configuration
registers control the multiplexers to allow connection between the pin and the on chip peripherals.

Peripherals should be connected to the appropriate pins prior to being activated, and prior to any related
interrupt(s) being enabled. Activity of any enabled peripheral function that is not mapped to a related pin should
be considered undefined.

Selection of a single function on a port pin completely excludes all other functions otherwise available on the
same pin.

The only partial exception from the above rule of exclusion is the case of inputs to the A/D converter. Regardless
of the function that is selected for the port pin that also hosts the A/D input, this A/D input can be read at any time
and variations of the voltage level on this pin will be reflected in the A/D readings. However, valid analog
reading(s) can be obtained if and only if the function of an analog input is selected. Only in this case proper
interface circuit is active in between the physical pin and the A/D module. In all other cases, a part of digital logic
necessary for the digital function to be performed will be active, and will disrupt proper behavior of the A/D.
Register description

The Pin Control Module contains 3 registers as shown in Table 36 below.

Reset value reflects the data stored in used bits only. It does not include reserved bits content.

Name Description Access Reset value Address

PINSEL0 Pin function select register 0. Read/Write 0x0000 0000 0xE002 C000

PINSEL1 Pin function select register 1. Read/Write 0x0000 0000 0xE002 C004

PINSEL2 Pin function select register 2. Read/Write - 0xE002 C014

Table 36. Pin connect block register map

Pin function Select register 0 (PINSEL0 - 0xE002 C000)

The PINSEL0 register controls the functions of the pins as per the settings listed in Table 40. The direction control
bit in the IO0DIR register is effective only when the GPIO function is selected for a pin. For other functions,
direction is controlled automatically.
31:30 29:28 27:26 25:24 23:22 21:20 19:18 17:16 15:14 13:12 11:10 9:8 7:6 5:4 3:2 1:0
P0.15 P0.14 P0.13 P0.12 P0.11 P0.10 P0.9 P0.8 P0.7 P0.6 P0.5 P0.4 P0.3 P0.2 P0.1 P0.0

Bit     Symbol   Value   Function                  Reset value

1:0     P0.0     00      GPIO Port 0.0             0
                 01      TXD (UART0)
                 10      PWM1
                 11      Reserved
3:2     P0.1     00      GPIO Port 0.1             0
                 01      RXD (UART0)
                 10      PWM3
                 11      EINT0
5:4     P0.2     00      GPIO Port 0.2             0
                 01      SCL0 (I2C0)
                 10      Capture 0.0 (Timer 0)
                 11      Reserved
7:6     P0.3     00      GPIO Port 0.3             0
                 01      SDA0 (I2C0)
                 10      Match 0.0 (Timer 0)
                 11      EINT1
9:8     P0.4     00      GPIO Port 0.4             0
                 01      SCK0 (SPI0)
                 10      Capture 0.1 (Timer 0)
                 11      AD0.6
11:10   P0.5     00      GPIO Port 0.5             0
                 01      MISO0 (SPI0)
                 10      Match 0.1 (Timer 0)
                 11      AD0.7
13:12   P0.6     00      GPIO Port 0.6             0
                 01      MOSI0 (SPI0)
                 10      Capture 0.2 (Timer 0)
                 11      Reserved or AD1.0
15:14   P0.7     00      GPIO Port 0.7             0
                 01      SSEL0 (SPI0)
                 10      PWM2
                 11      EINT2
17:16   P0.8     00      GPIO Port 0.8             0
                 01      TXD (UART1)
                 10      PWM4
                 11      Reserved or AD1.1
19:18   P0.9     00      GPIO Port 0.9             0
                 01      RXD (UART1)
                 10      PWM6
                 11      EINT3
21:20   P0.10    00      GPIO Port 0.10            0
                 01      RTS (UART1)
                 10      Capture 1.0 (Timer 1)
                 11      AD1.2
23:22   P0.11    00      GPIO Port 0.11            0
                 01      CTS (UART1)
                 10      Capture 1.1 (Timer 1)
                 11      SCL1 (I2C1)
25:24   P0.12    00      GPIO Port 0.12            0
                 01      DSR (UART1)
                 10      Match 1.0 (Timer 1)
                 11      AD1.3
27:26   P0.13    00      GPIO Port 0.13            0
                 01      DTR (UART1)
                 10      Match 1.1 (Timer 1)
                 11      AD1.4
29:28   P0.14    00      GPIO Port 0.14            0
                 01      DCD (UART1)
                 10      EINT1
                 11      SDA1 (I2C1)
31:30   P0.15    00      GPIO Port 0.15            0
                 01      RI (UART1)
                 10      EINT2
                 11      AD1.5

Table 37. Pin function Select register 0 (PINSEL0 - address 0xE002 C000) bit description

Pin function Select register 1 (PINSEL1 - 0xE002 C004)

The PINSEL1 register controls the functions of the pins as per the settings listed in following tables. The direction
control bit in the IO0DIR register is effective only when the GPIO function is selected for a pin. For other
functions direction is controlled automatically.
Bit     Symbol   Value   Function                  Reset value

1:0     P0.16    00      GPIO Port 0.16            0
                 01      EINT0
                 10      Match 0.2 (Timer 0)
                 11      Capture 0.2 (Timer 0)
3:2     P0.17    00      GPIO Port 0.17            0
                 01      Capture 1.2 (Timer 1)
                 10      SCK1 (SSP)
                 11      Match 1.2 (Timer 1)
5:4     P0.18    00      GPIO Port 0.18            0
                 01      Capture 1.3 (Timer 1)
                 10      MISO1 (SSP)
                 11      Match 1.3 (Timer 1)
7:6     P0.19    00      GPIO Port 0.19            0
                 01      Match 1.2 (Timer 1)
                 10      MOSI1 (SSP)
                 11      Capture 1.2 (Timer 1)
9:8     P0.20    00      GPIO Port 0.20            0
                 01      Match 1.3 (Timer 1)
                 10      SSEL1 (SSP)
                 11      EINT3
11:10   P0.21    00      GPIO Port 0.21            0
                 01      PWM5
                 10      AD1.6
                 11      Capture 1.3 (Timer 1)
13:12   P0.22    00      GPIO Port 0.22            0
                 01      AD1.7
                 10      Capture 0.0 (Timer 0)
                 11      Match 0.0 (Timer 0)
15:14   P0.23    00      GPIO Port 0.23            0
                 01      VBUS
                 10      Reserved
                 11      Reserved
17:16   P0.24    00      GPIO Port 0.24            0
                 01      Reserved
                 10      Reserved
                 11      Reserved
19:18   P0.25    00      GPIO Port 0.25            0
                 01      AD0.4
                 10      Aout (DAC)
                 11      Reserved
21:20   P0.26    00      GPIO Port 0.26            0
                 01      Reserved
                 10      Reserved
                 11      Reserved
23:22   P0.27    00      GPIO Port 0.27            0
                 01      Reserved
                 10      Reserved
                 11      Reserved
25:24   P0.28    00      GPIO Port 0.28            0
                 01      AD0.1
                 10      Capture 0.2 (Timer 0)
                 11      Match 0.2 (Timer 0)
27:26   P0.29    00      GPIO Port 0.29            0
                 01      AD0.2
                 10      Capture 0.3 (Timer 0)
                 11      Match 0.3 (Timer 0)
29:28   P0.30    00      GPIO Port 0.30            0
                 01      AD0.3
                 10      EINT3
                 11      Capture 0.0 (Timer 0)
31:30   P0.31    00      GPIO Port 0.31            0
                 01      UP_LED
                 10      CONNECT
                 11      Reserved

Table 38. Pin function Select register 1 (PINSEL1 - address 0xE002 C004) bit description

Pin function Select register 2 (PINSEL2 - 0xE002 C014)

The PINSEL2 register controls the functions of the pins as per the settings listed in Table 39. The direction control
bit in the IO1DIR register is effective only when the GPIO function is selected for a pin. For other functions
direction is controlled automatically.

Note: use read-modify-write operation when accessing PINSEL2 register. Accidental write of 0 to bit 2
and/or bit 3 results in loss of debug and/or trace functionality! Changing of either bit 2 or bit 3 from 1 to 0 may
cause an incorrect code execution!
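
A read-modify-write access of the kind this note asks for can be sketched as follows. The function name and bit macros are hypothetical. On the device the pointer would refer to address 0xE002 C014, whereas here it may point at any 32-bit variable so that the logic can be tried on a PC.

```c
#include <stdint.h>

#define PINSEL2_GPIO_DEBUG (1u << 2)   /* bit 2: P1.31-26 as Debug port */
#define PINSEL2_GPIO_TRACE (1u << 3)   /* bit 3: P1.25-16 as Trace port */

/* Enables the Trace port while leaving the Debug port setting as it was.
 * Reserved bits read back as undefined values and must not be written
 * with ones, so they are masked off before the value is written back. */
void pinsel2_enable_trace(volatile uint32_t *pinsel2)
{
    uint32_t value = *pinsel2;                            /* read   */
    value &= (PINSEL2_GPIO_DEBUG | PINSEL2_GPIO_TRACE);   /* drop reserved bits */
    value |= PINSEL2_GPIO_TRACE;                          /* modify */
    *pinsel2 = value;                                     /* write  */
}
```

Writing a constant such as 0x08 directly would instead clear bit 2 and disable the Debug port, which is exactly the accident the note warns about.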

The Debug modes are entered as follows:


• During reset, if P1.26 is pulled low (weak bias resistor is connected from P1.26 to Vss), JTAG pins will be
available.
• During reset, if P1.20 is pulled low (weak bias resistor is connected from P1.20 to Vss), Trace port will be
available.

Reset value for bit 2 of PINSEL2 register will be inverse of the external state of the P1.26/RTCK. Reset value for
bit 2 will be set to 1 if P1.26/RTCK is externally pulled low and reset value for bit 2 will be set to 0 if there is no
pull-down.
Reset value for bit 3 of PINSEL2 register will be inverse of the external state of the P1.20/TRACESYNC. Reset
value for bit 3 will be set to 1 if P1.20/TRACESYNC is externally pulled low and reset value for bit 3 will be set
to 0 if there is no pull-down.

Pins P1.31 thru 16 can be determined via hardware pins prior to de-asserting of reset.

Bit     Symbol       Value   Function                                        Reset value

1:0     -            -       Reserved, user software should not write        NA
                             ones to reserved bits. The value read from
                             a reserved bit is not defined.
2       GPIO/DEBUG   0       Pins P1.31-26 are used as GPIO pins.            P1.26/RTCK
                     1       Pins P1.31-26 are used as a Debug port.
3       GPIO/TRACE   0       Pins P1.25-16 are used as GPIO pins.            P1.20/TRACESYNC
                     1       Pins P1.25-16 are used as a Trace port.
31:4    -            -       Reserved, user software should not write        NA
                             ones to reserved bits. The value read from
                             a reserved bit is not defined.

Table 39. Pin function Select register 2 (PINSEL2 - 0xE002 C014) bit description

Pin function select register values

The PINSEL registers control the functions of device pins as shown below. Pairs of bits in these registers
correspond to specific device pins.

PINSEL0 and PINSEL1 values   Function                                          Value after reset

00                           Primary (default) function, typically GPIO port   00
01                           First alternate function
10                           Second alternate function
11                           Reserved

Table 40. Pin function select register bits

The direction control bit in the IO0DIR/IO1DIR register is effective only when the GPIO function is selected for
a pin. For other functions, direction is controlled automatically. Each derivative typically has a different pinout
and therefore a different set of functions possible for each pin. Details for a specific derivative may be found in the
appropriate data sheet.
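
Because every pin owns a 2-bit field, selecting a function comes down to clearing that field and writing the new value into it. The helper below is a minimal sketch with a hypothetical name. It works on a plain 32-bit value, so the result would still have to be written back to PINSEL0 or PINSEL1 on the device.

```c
#include <stdint.h>

/* Returns the PINSEL value with the 2-bit field of one pin replaced.
 *   pinsel    : current register value
 *   pin_index : position of the pin within the register, 0..15
 *               (P0.0-P0.15 for PINSEL0, P0.16-P0.31 for PINSEL1)
 *   func      : 2-bit function code, 0..3
 */
uint32_t pinsel_with_function(uint32_t pinsel, unsigned pin_index,
                              unsigned func)
{
    unsigned shift = (pin_index & 15u) * 2u;
    pinsel &= ~((uint32_t)3u << shift);          /* clear the old field */
    pinsel |= (uint32_t)(func & 3u) << shift;    /* insert the new one  */
    return pinsel;
}
```

For instance, starting from the reset value of PINSEL0 and selecting function 01 for both P0.0 and P0.1 (TXD and RXD of UART0 in Table 37) produces 0x0000 0005.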

Fast General Purpose Parallel I/O (GPIO):

Device pins that are not connected to a specific peripheral function are controlled by the GPIO registers. Pins may
be dynamically configured as inputs or outputs. Separate registers allow setting or clearing any number of outputs
simultaneously. The value of the output register may be read back, as well as the current state of the port pins.

The LPC2148 introduces accelerated GPIO functions:

• GPIO registers are relocated to the ARM local bus for the fastest possible I/O timing.

• Mask registers allow treating sets of port bits as a group, leaving other bits unchanged.
• All GPIO registers are byte addressable.

• Entire port value can be written in one instruction.

Features
• Every physical GPIO port is accessible via either the group of registers providing enhanced features and
accelerated port access or the legacy group of registers.
• Accelerated GPIO functions:
– GPIO registers are relocated to the ARM local bus so that the fastest possible I/O timing can be
achieved.
– Mask registers allow treating sets of port bits as a group, leaving other bits unchanged.
– All registers are byte and half-word addressable.
– Entire port value can be written in one instruction.
• Bit-level set and clear registers allow a single instruction set or clear of any number of bits in one port.
• Direction control of individual bits.
• All I/O default to inputs after reset.
• Backward compatibility with other earlier devices is maintained with legacy registers appearing at the
original addresses on the APB bus.

Applications
• General purpose I/O
• Driving LEDs, or other indicators
• Controlling off-chip devices
• Sensing digital inputs

Pin description

Pin            Type            Description

P0.0-P0.31     Input/Output    General purpose input/output. The number of GPIOs actually available
P1.16-P1.31                    depends on the use of alternate functions.

Table 64. GPIO pin description

Register description

The LPC2148 has two 32-bit General Purpose I/O ports. A total of 30 input/output pins and a single output-only
pin out of 32 pins are available on PORT0. PORT1 has up to 16 pins available for GPIO functions. PORT0 and
PORT1 are controlled via two groups of 4 registers, namely:

1. IOPIN (port pin value register).
2. IOSET (port output set value register).
3. IODIR (port direction control register).
4. IOCLR (port output clear register).

Function, Accessibility, reset value and port addresses of these registers are shown in Table 65 and Table 66.

The Legacy registers shown in Table 65 allow compatibility to work with earlier family devices, using existing
code. The functions and relative timing of older GPIO implementations is preserved.

The user must select whether a GPIO will be accessed via registers that provide enhanced features or a legacy set
of registers. While both of a port’s fast and legacy GPIO registers control the same physical pins, these two
port control branches are mutually exclusive and operate independently. For example, changing a pin’s output via
a fast register will not be observable via the corresponding legacy register.
Generic  Description                                        Access  Reset  PORT0         PORT1
name                                                                value  address       address
                                                                           and name      and name

IOPIN    GPIO Port Pin value register. The current state    R/W     NA     0xE002 8000   0xE002 8010
         of the GPIO configured port pins can always be                    IO0PIN        IO1PIN
         read from this register, regardless of pin
         direction.
IOSET    GPIO Port Output Set register. This register       R/W     0      0xE002 8004   0xE002 8014
         controls the state of output pins in conjunction                  IO0SET        IO1SET
         with the IOCLR register. Writing ones produces
         highs at the corresponding port pins. Writing
         zeroes has no effect.
IODIR    GPIO Port Direction control register. This         R/W     0      0xE002 8008   0xE002 8018
         register individually controls the direction of                   IO0DIR        IO1DIR
         each port pin.
IOCLR    GPIO Port Output Clear register. This register     WO      0      0xE002 800C   0xE002 801C
         controls the state of output pins. Writing ones                   IO0CLR        IO1CLR
         produces lows at the corresponding port pins
         and clears the corresponding bits in the IOSET
         register. Writing zeroes has no effect.

Table 65. GPIO register map
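
The set and clear behaviour listed in Table 65 can be illustrated with a small software model. This is a sketch of the documented semantics only: the structure and function names are hypothetical, and on the device the writes would go to the IO0SET and IO0CLR (or IO1SET and IO1CLR) addresses from the table.

```c
#include <stdint.h>

/* Software stand-in for the output latch of one legacy GPIO port. */
typedef struct {
    uint32_t out;   /* value that a read of IOSET would return */
} gpio_port_model;

/* Write to IOSET: ones drive the pins HIGH, zeroes have no effect. */
void io_set(gpio_port_model *port, uint32_t ones)
{
    port->out |= ones;
}

/* Write to IOCLR: ones drive the pins LOW and clear the matching
 * IOSET bits, zeroes have no effect. */
void io_clr(gpio_port_model *port, uint32_t ones)
{
    port->out &= ~ones;
}
```

Because zeroes have no effect in either register, a single write can change selected pins without a read-modify-write sequence and without disturbing the other pins of the port.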

The LPC2148 provides five registers per port with enhanced GPIO features. They are:

1. FIOPIN (Fast port pin value register).
2. FIOSET (Fast port output set value register).
3. FIODIR (Fast port direction control register).
4. FIOCLR (Fast port output clear register).
5. FIOMASK (Fast mask register).

All of these registers are located directly on the local bus of the CPU for the fastest possible read and write timing.
An additional feature has been added that provides byte addressability of all GPIO registers. A mask register
allows treating groups of bits in a single GPIO port separately from other bits on the same port.

Table 66 lists the registers that provide the enhanced GPIO features.


Generic  Description                                        Access  Reset  PORT0         PORT1
name                                                                value  address       address
                                                                           and name      and name

FIODIR   Fast GPIO Port Direction control register. This    R/W     0      0x3FFF C000   0x3FFF C020
         register individually controls the direction of                   FIO0DIR       FIO1DIR
         each port pin.
FIOMASK  Fast Mask register for port. Writes, sets, clears, R/W     0      0x3FFF C010   0x3FFF C030
         and reads to port (done via writes to FIOPIN,                     FIO0MASK      FIO1MASK
         FIOSET, and FIOCLR, and reads of FIOPIN)
         alter or return only the bits loaded with zero in
         this register.
FIOPIN   Fast Port Pin value register using FIOMASK.        R/W     NA     0x3FFF C014   0x3FFF C034
         The current state of digital port pins can be read                FIO0PIN       FIO1PIN
         from this register, regardless of pin direction or
         alternate function selection (as long as pin is not
         configured as an input to ADC). The value read
         is value of the physical pins masked by ANDing
         the inverted FIOMASK. Writing to this register
         affects only port bits enabled by ZEROES in
         FIOMASK.
FIOSET   Fast Port Output Set register using FIOMASK.       R/W     0      0x3FFF C018   0x3FFF C038
         This register controls the state of output pins.                  FIO0SET       FIO1SET
         Writing 1s produces highs at the corresponding
         port pins. Writing 0s has no effect. Reading this
         register returns the current contents of the port
         output register. Only bits enabled by ZEROES in
         FIOMASK can be altered.
FIOCLR   Fast Port Output Clear register using FIOMASK.     WO      0      0x3FFF C01C   0x3FFF C03C
         This register controls the state of output pins.                  FIO0CLR       FIO1CLR
         Writing 1s produces lows at the corresponding
         port pins. Writing 0s has no effect. Only bits
         enabled by ZEROES in FIOMASK can be
         altered.

Table 66. GPIO register map (local bus accessible registers - enhanced GPIO features)
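
The role of FIOMASK is easiest to see in a software model of the rules in Table 66. The sketch below uses hypothetical names and assumes, for simplicity, that all pins are configured as outputs so that the pin state follows the output latch.

```c
#include <stdint.h>

/* Software stand-in for one fast GPIO port. */
typedef struct {
    uint32_t mask;  /* FIOMASK: a 0 bit enables access, a 1 bit blocks it */
    uint32_t out;   /* port output latch */
} fio_port_model;

/* Write to FIOPIN: only bits with 0 in FIOMASK take the new value. */
void fio_write_pin(fio_port_model *p, uint32_t value)
{
    p->out = (p->out & p->mask) | (value & ~p->mask);
}

/* Read of FIOPIN: the pin state ANDed with the inverted FIOMASK. */
uint32_t fio_read_pin(const fio_port_model *p)
{
    return p->out & ~p->mask;
}

/* Write to FIOSET: ones set the enabled bits, zeroes have no effect. */
void fio_set(fio_port_model *p, uint32_t ones)
{
    p->out |= ones & ~p->mask;
}

/* Write to FIOCLR: ones clear the enabled bits, zeroes have no effect. */
void fio_clr(fio_port_model *p, uint32_t ones)
{
    p->out &= ~(ones & ~p->mask);
}
```

With the mask set to 0xFFFF FF00, a full 32-bit write to FIOPIN changes only the lowest byte of the port, which is how a group of pins can be treated as a small parallel bus while the rest of the port is left unchanged.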

GPIO port Direction register:

This word accessible register is used to control the direction of the pins when they are configured as GPIO port
pins. The direction bit for any pin must be set according to the pin functionality. The LPC2148 has two legacy
registers, IO0DIR and IO1DIR, and two enhanced GPIO function registers, FIO0DIR and FIO1DIR.

Port 0 : IO0DIR (GPIO port 0 Direction register).

IO0DIR consists of 32 bits, numbered 31:0, which provide slow (legacy) GPIO direction control for port 0. P0.0 is
controlled by bit 0, P0.1 by bit 1, P0.2 by bit 2, and so on.

Writing 0 to a bit configures the corresponding pin as an input.

Writing 1 to a bit configures the corresponding pin as an output.

The bits of this register are symbolized as P0xDIR. The address of the register is 0xE002 8008. Upon reset it
holds the value 0x0000 0000.

Port 1 : IO1DIR (GPIO port 1 Direction register).

IO1DIR is a 32-bit register (bits 31:0) used for slow GPIO direction control. P1.0 is controlled by bit 0, P1.1 by
bit 1, P1.2 by bit 2, and so on.

Writing 0 to a bit configures the corresponding pin as an input.

Writing 1 to a bit configures the corresponding pin as an output.

The bits of this register are symbolized as P1xDIR. The register address is 0xE002 8018 and its reset value is
0x0000 0000.

Port 0 : FIO0DIR (Fast GPIO port 0 Direction register).

FIO0DIR is a 32-bit register (bits 31:0) used for fast GPIO direction control. P0.0 is controlled by bit 0, P0.1 by
bit 1, P0.2 by bit 2, and so on.

Writing 0 to a bit configures the corresponding pin as an input.

Writing 1 to a bit configures the corresponding pin as an output.

The bits of this register are symbolized as FP0xDIR. The register address is 0x3FFF C000 and its reset value is
0x0000 0000.

Port 1 : FIO1DIR (Fast GPIO port 1 Direction register).

FIO1DIR is a 32-bit register (bits 31:0) used for fast GPIO direction control. P1.0 is controlled by bit 0, P1.1 by
bit 1, P1.2 by bit 2, and so on.

Writing 0 to a bit configures the corresponding pin as an input.

Writing 1 to a bit configures the corresponding pin as an output.

The bits of this register are symbolized as FP1xDIR. The register address is 0x3FFF C020 and its reset value is
0x0000 0000.
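
The direction rule above (0 = input, 1 = output, one bit per pin) can be sketched in C as follows. This is a minimal illustration, not vendor code: the helper names are hypothetical, and the pointer is assumed to refer either to a direction register such as IO0DIR or FIO0DIR, or to an ordinary variable when the logic is tried out on a host machine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: 'dir' points at a direction register
 * (or at a plain variable when testing on a host). */

/* Configure one pin as an output by writing 1 to its direction bit. */
void gpio_make_output(volatile uint32_t *dir, unsigned pin)
{
    assert(pin < 32u);
    *dir |= (UINT32_C(1) << pin);
}

/* Configure one pin as an input by writing 0 to its direction bit. */
void gpio_make_input(volatile uint32_t *dir, unsigned pin)
{
    assert(pin < 32u);
    *dir &= ~(UINT32_C(1) << pin);
}
```

Both helpers use a read-modify-write so that the direction of the other 31 pins is left unchanged.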

Every fast GPIO port can also be controlled via several byte and half-word accessible registers rather than the
32-bit long and word only accessible FIODIR register, and these are listed in Table 71 and Table 72. These
additional registers allow easier and faster access to the physical port pins along with providing the same
functions as the FIODIR register.

Register name  Length (bits) and access  Address       Reset value  Description
FIO0DIR0       8 (byte)                  0x3FFF C000   0x00         Fast GPIO port 0 direction control register 0. Bit 0 in FIO0DIR0 corresponds to P0.0 ... bit 7 to P0.7.
FIO0DIR1       8 (byte)                  0x3FFF C001   0x00         Fast GPIO port 0 direction control register 1. Bit 0 in FIO0DIR1 corresponds to P0.8 ... bit 7 to P0.15.
FIO0DIR2       8 (byte)                  0x3FFF C002   0x00         Fast GPIO port 0 direction control register 2. Bit 0 in FIO0DIR2 corresponds to P0.16 ... bit 7 to P0.23.
FIO0DIR3       8 (byte)                  0x3FFF C003   0x00         Fast GPIO port 0 direction control register 3. Bit 0 in FIO0DIR3 corresponds to P0.24 ... bit 7 to P0.31.
FIO0DIRL       16 (half-word)            0x3FFF C000   0x0000       Fast GPIO port 0 direction control lower half-word register. Bit 0 in FIO0DIRL corresponds to P0.0 ... bit 15 to P0.15.
FIO0DIRU       16 (half-word)            0x3FFF C002   0x0000       Fast GPIO port 0 direction control upper half-word register. Bit 0 in FIO0DIRU corresponds to P0.16 ... bit 15 to P0.31.

Table 71. Fast GPIO port 0 Direction control byte and half-word accessible register description

Register name  Length (bits) and access  Address       Reset value  Description
FIO1DIR0       8 (byte)                  0x3FFF C020   0x00         Fast GPIO port 1 direction control register 0. Bit 0 in FIO1DIR0 corresponds to P1.0 ... bit 7 to P1.7.
FIO1DIR1       8 (byte)                  0x3FFF C021   0x00         Fast GPIO port 1 direction control register 1. Bit 0 in FIO1DIR1 corresponds to P1.8 ... bit 7 to P1.15.
FIO1DIR2       8 (byte)                  0x3FFF C022   0x00         Fast GPIO port 1 direction control register 2. Bit 0 in FIO1DIR2 corresponds to P1.16 ... bit 7 to P1.23.
FIO1DIR3       8 (byte)                  0x3FFF C023   0x00         Fast GPIO port 1 direction control register 3. Bit 0 in FIO1DIR3 corresponds to P1.24 ... bit 7 to P1.31.
FIO1DIRL       16 (half-word)            0x3FFF C020   0x0000       Fast GPIO port 1 direction control lower half-word register. Bit 0 in FIO1DIRL corresponds to P1.0 ... bit 15 to P1.15.
FIO1DIRU       16 (half-word)            0x3FFF C022   0x0000       Fast GPIO port 1 direction control upper half-word register. Bit 0 in FIO1DIRU corresponds to P1.16 ... bit 15 to P1.31.

Table 72. Fast GPIO port 1 Direction control byte and half-word accessible register description
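
The byte register tables follow a regular pattern: byte register n sits n bytes above the word register and covers pins 8n to 8n+7. A small sketch of that mapping is given below. The function names are hypothetical and the code only computes addresses and masks; it does not touch any hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not part of any vendor header. */

/* Address of the byte register that holds 'pin', given the address of the
 * word register (for example 0x3FFFC000 for FIO0DIR). */
uint32_t fio_byte_reg_address(uint32_t word_reg_address, unsigned pin)
{
    assert(pin < 32u);
    return word_reg_address + pin / 8u;
}

/* Bit mask of 'pin' inside its byte register. */
uint8_t fio_byte_reg_mask(unsigned pin)
{
    assert(pin < 32u);
    return (uint8_t)(1u << (pin % 8u));
}
```

For P0.20, for instance, the helpers select the byte register at 0x3FFF C002 (FIO0DIR2 in Table 71) and bit 4 within it.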

GPIO port Mask register:

This register is available only in the enhanced group of registers. It selects the port pins that will and will not
be affected by write accesses to the FIOPIN, FIOSET or FIOCLR registers. The mask register also filters the port’s
contents when the FIOPIN register is read.

A zero in a bit of this register enables access to the corresponding physical pin through a read or a write. If a bit
in this register is one, the corresponding pin is not changed by a write access and, when read, is not reflected in
the updated FIOPIN register.
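
The masking rule can be illustrated with a small software model. This is a sketch under stated assumptions, not hardware access code: the struct and function names are hypothetical, and masked bits are assumed to read back as 0, in line with the register map description of FIOPIN (pin values ANDed with the inverted FIOMASK).

```c
#include <assert.h>
#include <stdint.h>

/* Software model (not hardware access) of one fast GPIO port. */
struct fio_model {
    uint32_t mask;   /* models FIOMASK: 0 = pin accessible, 1 = pin protected */
    uint32_t out;    /* models the port output register */
};

/* Model of a write to FIOPIN: only bits with 0 in the mask are changed. */
void fio_write_pin(struct fio_model *p, uint32_t value)
{
    assert(p != 0);
    p->out = (p->out & p->mask) | (value & ~p->mask);
}

/* Model of a read of FIOPIN for the given pin levels:
 * bits with 1 in the mask are filtered out and read as 0. */
uint32_t fio_read_pin(const struct fio_model *p, uint32_t pin_levels)
{
    assert(p != 0);
    return pin_levels & ~p->mask;
}
```

With the mask set to 0xFFFF00FF, a write of 0x0000A500 changes only bits 15:8 of the modeled output register and leaves the remaining bits as they were.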

The LPC2148 has two mask registers, FIO0MASK and FIO1MASK.

Port 0 : FIO0MASK (Fast GPIO port 0 Mask register).

FIO0MASK is a 32-bit register (bits 31:0) used for fast GPIO physical pin access control.

If a bit is 0, the corresponding pin is affected by writes to the FIOSET, FIOCLR and FIOPIN registers, and the
current state of the pin is observable in the FIOPIN register.

If a bit is 1, the corresponding physical pin is unaffected by writes to the FIOSET, FIOCLR and FIOPIN registers.
When the FIOPIN register is read, this bit is not updated with the state of the physical pin.

The bits of this register are symbolized as FP0xMASK. The register address is 0x3FFF C010 and its reset value is
0x0000 0000.

Port 1 : FIO1MASK (Fast GPIO port 1 Mask register).

FIO1MASK is a 32-bit register (bits 31:0) used for fast GPIO physical pin access control.

If a bit is 0, the corresponding pin is affected by writes to the FIOSET, FIOCLR and FIOPIN registers, and the
current state of the pin is observable in the FIOPIN register.

If a bit is 1, the corresponding physical pin is unaffected by writes to the FIOSET, FIOCLR and FIOPIN registers.
When the FIOPIN register is read, this bit is not updated with the state of the physical pin.

The bits of this register are symbolized as FP1xMASK. The register address is 0x3FFF C030 and its reset value is
0x0000 0000.

Every fast GPIO port can also be controlled via several byte and half-word accessible registers rather than the
32-bit long and word only accessible FIOMASK register, and these are listed in Table 75 and Table 76. These
additional registers allow easier and faster access to the physical port pins along with providing the same
functions as the FIOMASK register.

Register name  Length (bits) and access  Address       Reset value  Description
FIO0MASK0      8 (byte)                  0x3FFF C010   0x00         Fast GPIO port 0 mask register 0. Bit 0 in FIO0MASK0 corresponds to P0.0 ... bit 7 to P0.7.
FIO0MASK1      8 (byte)                  0x3FFF C011   0x00         Fast GPIO port 0 mask register 1. Bit 0 in FIO0MASK1 corresponds to P0.8 ... bit 7 to P0.15.
FIO0MASK2      8 (byte)                  0x3FFF C012   0x00         Fast GPIO port 0 mask register 2. Bit 0 in FIO0MASK2 corresponds to P0.16 ... bit 7 to P0.23.
FIO0MASK3      8 (byte)                  0x3FFF C013   0x00         Fast GPIO port 0 mask register 3. Bit 0 in FIO0MASK3 corresponds to P0.24 ... bit 7 to P0.31.
FIO0MASKL      16 (half-word)            0x3FFF C010   0x0000       Fast GPIO port 0 mask lower half-word register. Bit 0 in FIO0MASKL corresponds to P0.0 ... bit 15 to P0.15.
FIO0MASKU      16 (half-word)            0x3FFF C012   0x0000       Fast GPIO port 0 mask upper half-word register. Bit 0 in FIO0MASKU corresponds to P0.16 ... bit 15 to P0.31.

Table 75. Fast GPIO port 0 Mask byte and half-word accessible register description

Register name  Length (bits) and access  Address       Reset value  Description
FIO1MASK0      8 (byte)                  0x3FFF C030   0x00         Fast GPIO port 1 mask register 0. Bit 0 in FIO1MASK0 corresponds to P1.0 ... bit 7 to P1.7.
FIO1MASK1      8 (byte)                  0x3FFF C031   0x00         Fast GPIO port 1 mask register 1. Bit 0 in FIO1MASK1 corresponds to P1.8 ... bit 7 to P1.15.
FIO1MASK2      8 (byte)                  0x3FFF C032   0x00         Fast GPIO port 1 mask register 2. Bit 0 in FIO1MASK2 corresponds to P1.16 ... bit 7 to P1.23.
FIO1MASK3      8 (byte)                  0x3FFF C033   0x00         Fast GPIO port 1 mask register 3. Bit 0 in FIO1MASK3 corresponds to P1.24 ... bit 7 to P1.31.
FIO1MASKL      16 (half-word)            0x3FFF C030   0x0000       Fast GPIO port 1 mask lower half-word register. Bit 0 in FIO1MASKL corresponds to P1.0 ... bit 15 to P1.15.
FIO1MASKU      16 (half-word)            0x3FFF C032   0x0000       Fast GPIO port 1 mask upper half-word register. Bit 0 in FIO1MASKU corresponds to P1.16 ... bit 15 to P1.31.

Table 76. Fast GPIO port 1 Mask byte and half-word accessible register description

GPIO port Pin value register:

This register provides the value of port pins that are configured to perform only digital functions. The register will
give the logic value of the pin regardless of whether the pin is configured for input or output, or as GPIO or an
alternate digital function. As an example, a particular port pin may have GPIO input, GPIO output, UART
receive, and PWM output as selectable functions. Any configuration of that pin will allow its current logic state to
be read from the IOPIN register.

If a pin has an analog function as one of its options, the pin state cannot be read if the analog configuration is
selected. Selecting the pin as an A/D input disconnects the digital features of the pin. In that case, the pin value
read in the IOPIN register is not valid. Writing to the IOPIN register stores the value in the port output register,
bypassing the need to use both the IOSET and IOCLR registers to obtain the entire written value. This feature
should be used carefully in an application since it affects the entire port.

The LPC2148 has two legacy registers, IO0PIN and IO1PIN, and two enhanced GPIO registers, FIO0PIN and
FIO1PIN. Access to port pins via the FIOPIN register is conditioned by the corresponding FIOMASK register. Only
pins masked with zeros in the mask register are correlated to the current content of the fast GPIO port pin value
register.
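
Because each pin occupies one bit of the pin value register, testing a single pin reduces to a shift and a mask. The helper below is a hypothetical sketch that works on a value already read from IO0PIN, IO1PIN, FIO0PIN or FIO1PIN; it assumes the pin is in a digital configuration, since the text above notes that the value is not valid for a pin selected as an A/D input.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if 'pin' is HIGH in a value read from a
 * pin value register, 0 if it is LOW. */
unsigned gpio_pin_level(uint32_t pin_reg_value, unsigned pin)
{
    assert(pin < 32u);
    return (unsigned)((pin_reg_value >> pin) & 1u);
}
```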

Port 0 : IO0PIN (Slow GPIO port 0 PIN register).

IO0PIN is a 32-bit register (bits 31:0). It reports the state of the port 0 pins configured to perform only digital
functions.

The bits of this register are symbolized as P0xVAL. The register address is 0xE002 8000.

Port 1 : IO1PIN (Slow GPIO port 1 PIN register).

IO1PIN is a 32-bit register (bits 31:0). It reports the state of the port 1 pins configured to perform only digital
functions.

The bits of this register are symbolized as P1xVAL. The register address is 0xE002 8010.

Port 0 : FIO0PIN (Fast GPIO port 0 PIN register).

FIO0PIN is a 32-bit register (bits 31:0). It reports the state of the port 0 pins configured to perform only digital
functions.

The bits of this register are symbolized as FP0xVAL. The register address is 0x3FFF C014.

Port 1 : FIO1PIN (Fast GPIO port 1 PIN register).

FIO1PIN is a 32-bit register (bits 31:0). It reports the state of the port 1 pins configured to perform only digital
functions.

The bits of this register are symbolized as FP1xVAL. The register address is 0x3FFF C034.

Every fast GPIO port can also be controlled via several byte and half-word accessible registers rather than the
32-bit long and word only accessible FIOPIN register, and these are listed in Table 81 and Table 82. These
additional registers allow easier and faster access to the physical port pins along with providing the same
functions as the FIOPIN register.

Register name  Length (bits) and access  Address       Reset value  Description
FIO0PIN0       8 (byte)                  0x3FFF C014   0x00         Fast GPIO port 0 pin value register 0. Bit 0 in FIO0PIN0 corresponds to P0.0 ... bit 7 to P0.7.
FIO0PIN1       8 (byte)                  0x3FFF C015   0x00         Fast GPIO port 0 pin value register 1. Bit 0 in FIO0PIN1 corresponds to P0.8 ... bit 7 to P0.15.
FIO0PIN2       8 (byte)                  0x3FFF C016   0x00         Fast GPIO port 0 pin value register 2. Bit 0 in FIO0PIN2 corresponds to P0.16 ... bit 7 to P0.23.
FIO0PIN3       8 (byte)                  0x3FFF C017   0x00         Fast GPIO port 0 pin value register 3. Bit 0 in FIO0PIN3 corresponds to P0.24 ... bit 7 to P0.31.
FIO0PINL       16 (half-word)            0x3FFF C014   0x0000       Fast GPIO port 0 pin value lower half-word register. Bit 0 in FIO0PINL corresponds to P0.0 ... bit 15 to P0.15.
FIO0PINU       16 (half-word)            0x3FFF C016   0x0000       Fast GPIO port 0 pin value upper half-word register. Bit 0 in FIO0PINU corresponds to P0.16 ... bit 15 to P0.31.

Table 81. Fast GPIO port 0 Pin value byte and half-word accessible register description

Register name  Length (bits) and access  Address       Reset value  Description
FIO1PIN0       8 (byte)                  0x3FFF C034   0x00         Fast GPIO port 1 pin value register 0. Bit 0 in FIO1PIN0 corresponds to P1.0 ... bit 7 to P1.7.
FIO1PIN1       8 (byte)                  0x3FFF C035   0x00         Fast GPIO port 1 pin value register 1. Bit 0 in FIO1PIN1 corresponds to P1.8 ... bit 7 to P1.15.
FIO1PIN2       8 (byte)                  0x3FFF C036   0x00         Fast GPIO port 1 pin value register 2. Bit 0 in FIO1PIN2 corresponds to P1.16 ... bit 7 to P1.23.
FIO1PIN3       8 (byte)                  0x3FFF C037   0x00         Fast GPIO port 1 pin value register 3. Bit 0 in FIO1PIN3 corresponds to P1.24 ... bit 7 to P1.31.
FIO1PINL       16 (half-word)            0x3FFF C034   0x0000       Fast GPIO port 1 pin value lower half-word register. Bit 0 in FIO1PINL corresponds to P1.0 ... bit 15 to P1.15.
FIO1PINU       16 (half-word)            0x3FFF C036   0x0000       Fast GPIO port 1 pin value upper half-word register. Bit 0 in FIO1PINU corresponds to P1.16 ... bit 15 to P1.31.

Table 82. Fast GPIO port 1 Pin value byte and half-word accessible register description

GPIO port output Set register

This register is used to produce a HIGH level output at port pins configured as GPIO in output mode. Writing 1 to
a bit produces a HIGH level at the corresponding port pin.

Writing 0 has no effect.

If a pin is configured as an input or for a secondary function, writing 1 to the corresponding bit in IOSET has no
effect.

Reading the IOSET register returns the value of this register, as determined by previous writes to IOSET and
IOCLR (or IOPIN as noted above). This value does not reflect the effect of any outside world influence on the I/O
pins.

The LPC2148 has two legacy registers, IO0SET and IO1SET, and two enhanced GPIO registers, FIO0SET and
FIO1SET. Access to port pins via the FIOSET register is conditioned by the corresponding FIOMASK register.

Port 0 : IO0SET (GPIO port 0 Output set register).

IO0SET is a 32-bit register (bits 31:0) used for slow GPIO output set control. P0.0 is controlled by bit 0, P0.1 by
bit 1, P0.2 by bit 2, and so on.

Writing 1 to a bit produces a HIGH level at the corresponding port pin.

Writing 0 has no effect.

The bits of this register are symbolized as P0xSET. The register address is 0xE002 8004 and its reset value is
0x0000 0000.

Port 1 : IO1SET (GPIO port 1 Output set register).

IO1SET is a 32-bit register (bits 31:0) used for slow GPIO output set control. P1.0 is controlled by bit 0, P1.1 by
bit 1, P1.2 by bit 2, and so on.

Writing 1 to a bit produces a HIGH level at the corresponding port pin.

Writing 0 has no effect.

The bits of this register are symbolized as P1xSET. The register address is 0xE002 8014 and its reset value is
0x0000 0000.

Port 0 : FIO0SET (Fast GPIO port 0 Output set register).

FIO0SET is a 32-bit register (bits 31:0) used for fast GPIO output set control. P0.0 is controlled by bit 0, P0.1 by
bit 1, P0.2 by bit 2, and so on.

Writing 1 to a bit produces a HIGH level at the corresponding port pin.

Writing 0 has no effect.

The bits of this register are symbolized as FP0xSET. The register address is 0x3FFF C018 and its reset value is
0x0000 0000.

Port 1 : FIO1SET (Fast GPIO port 1 Output set register).

FIO1SET is a 32-bit register (bits 31:0) used for fast GPIO output set control. P1.0 is controlled by bit 0, P1.1 by
bit 1, P1.2 by bit 2, and so on.

Writing 1 to a bit produces a HIGH level at the corresponding port pin.

Writing 0 has no effect.

The bits of this register are symbolized as FP1xSET. The register address is 0x3FFF C038 and its reset value is
0x0000 0000.

Every fast GPIO port can also be controlled via several byte and half-word accessible registers rather than the
32-bit long and word only accessible FIOSET register, and these are listed in Table 87 and Table 88. These
additional registers allow easier and faster access to the physical port pins along with providing the same
functions as the FIOSET register.

Register name  Length (bits) and access  Address       Reset value  Description
FIO0SET0       8 (byte)                  0x3FFF C018   0x00         Fast GPIO port 0 output set register 0. Bit 0 in FIO0SET0 corresponds to P0.0 ... bit 7 to P0.7.
FIO0SET1       8 (byte)                  0x3FFF C019   0x00         Fast GPIO port 0 output set register 1. Bit 0 in FIO0SET1 corresponds to P0.8 ... bit 7 to P0.15.
FIO0SET2       8 (byte)                  0x3FFF C01A   0x00         Fast GPIO port 0 output set register 2. Bit 0 in FIO0SET2 corresponds to P0.16 ... bit 7 to P0.23.
FIO0SET3       8 (byte)                  0x3FFF C01B   0x00         Fast GPIO port 0 output set register 3. Bit 0 in FIO0SET3 corresponds to P0.24 ... bit 7 to P0.31.
FIO0SETL       16 (half-word)            0x3FFF C018   0x0000       Fast GPIO port 0 output set lower half-word register. Bit 0 in FIO0SETL corresponds to P0.0 ... bit 15 to P0.15.
FIO0SETU       16 (half-word)            0x3FFF C01A   0x0000       Fast GPIO port 0 output set upper half-word register. Bit 0 in FIO0SETU corresponds to P0.16 ... bit 15 to P0.31.

Table 87. Fast GPIO port 0 output Set byte and half-word accessible register description

Register name  Length (bits) and access  Address       Reset value  Description
FIO1SET0       8 (byte)                  0x3FFF C038   0x00         Fast GPIO port 1 output set register 0. Bit 0 in FIO1SET0 corresponds to P1.0 ... bit 7 to P1.7.
FIO1SET1       8 (byte)                  0x3FFF C039   0x00         Fast GPIO port 1 output set register 1. Bit 0 in FIO1SET1 corresponds to P1.8 ... bit 7 to P1.15.
FIO1SET2       8 (byte)                  0x3FFF C03A   0x00         Fast GPIO port 1 output set register 2. Bit 0 in FIO1SET2 corresponds to P1.16 ... bit 7 to P1.23.
FIO1SET3       8 (byte)                  0x3FFF C03B   0x00         Fast GPIO port 1 output set register 3. Bit 0 in FIO1SET3 corresponds to P1.24 ... bit 7 to P1.31.
FIO1SETL       16 (half-word)            0x3FFF C038   0x0000       Fast GPIO port 1 output set lower half-word register. Bit 0 in FIO1SETL corresponds to P1.0 ... bit 15 to P1.15.
FIO1SETU       16 (half-word)            0x3FFF C03A   0x0000       Fast GPIO port 1 output set upper half-word register. Bit 0 in FIO1SETU corresponds to P1.16 ... bit 15 to P1.31.

Table 88. Fast GPIO port 1 output Set byte and half-word accessible register description

GPIO port output Clear register

This register is used to produce a LOW level output at port pins configured as GPIO in output mode. Writing 1 to
a bit produces a LOW level at the corresponding port pin.

Writing 0 has no effect.

If a pin is configured as an input or for a secondary function, writing to IOCLR has no effect.

IOCLR is a write-only register. The contents of the port output register, as determined by previous writes to
IOSET and IOCLR (or IOPIN as noted above), are read back through the IOSET register. This value does not
reflect the effect of any outside world influence on the I/O pins.

The LPC2148 has two legacy registers, IO0CLR and IO1CLR, and two enhanced GPIO registers, FIO0CLR and
FIO1CLR. Access to port pins via the FIOCLR register is conditioned by the corresponding FIOMASK register.
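
The set and clear semantics can be summarized with a small software model. This is only a sketch: the function names are hypothetical, and `out` stands for the port output register rather than for a real hardware address. It ignores pin direction and FIOMASK for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Software model only: 'out' stands for the port output register. */

/* Model of a write to IOSET: 1s drive pins HIGH, 0s have no effect. */
void model_ioset(uint32_t *out, uint32_t bits)
{
    assert(out != 0);
    *out |= bits;
}

/* Model of a write to IOCLR: 1s drive pins LOW, 0s have no effect. */
void model_ioclr(uint32_t *out, uint32_t bits)
{
    assert(out != 0);
    *out &= ~bits;
}
```

Because a 0 never changes anything in either register, neither write needs to read the port first, which is what distinguishes these registers from a direct write to IOPIN.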

Port 0 : IO0CLR (GPIO port 0 Output Clear register).

IO0CLR is a 32-bit register (bits 31:0) used for slow GPIO output clear control. P0.0 is controlled by bit 0, P0.1
by bit 1, P0.2 by bit 2, and so on.

Writing 1 to a bit produces a LOW level at the corresponding port pin.

Writing 0 has no effect.

The bits of this register are symbolized as P0xCLR. The register address is 0xE002 800C and its reset value is
0x0000 0000.

Port 1 : IO1CLR (GPIO port 1 Output Clear register).

Port 1 consists of 32 bits (31:0), and IO1CLR is used for slow GPIO output clear control. P1.0 is controlled by
bit 0, P1.1 by bit 1, P1.2 by bit 2, and so on.

Writing 1 to a bit produces a LOW level at the corresponding port pin.

Writing 0 has no effect.

The bits of this register are symbolized as P1xCLR. The register address is 0xE002 801C and its reset value is
0x0000 0000.

Port 0 : FIO0CLR (Fast GPIO port 0 Output Clear register).

FIO0CLR is a 32-bit register (bits 31:0) used for fast GPIO output clear control. P0.0 is controlled by bit 0, P0.1
by bit 1, P0.2 by bit 2, and so on.

Writing 1 to a bit produces a LOW level at the corresponding port pin.

Writing 0 has no effect.

The bits of this register are symbolized as FP0xCLR. The register address is 0x3FFF C01C and its reset value is
0x0000 0000.

Port 1 : FIO1CLR (Fast GPIO port 1 Output Clear register).

FIO1CLR is a 32-bit register (bits 31:0) used for fast GPIO output clear control. P1.0 is controlled by bit 0, P1.1
by bit 1, P1.2 by bit 2, and so on.

Writing 1 to a bit produces a LOW level at the corresponding port pin.

Writing 0 has no effect.

The bits of this register are symbolized as FP1xCLR. The register address is 0x3FFF C03C and its reset value is
0x0000 0000.

Every fast GPIO port can also be controlled via several byte and half-word accessible registers rather than the
32-bit long and word only accessible FIOCLR register, and these are listed in Table 93 and Table 94. These
additional registers allow easier and faster access to the physical port pins along with providing the same
functions as the FIOCLR register.

Register name  Length (bits) and access  Address       Reset value  Description
FIO0CLR0       8 (byte)                  0x3FFF C01C   0x00         Fast GPIO port 0 output clear register 0. Bit 0 in FIO0CLR0 corresponds to P0.0 ... bit 7 to P0.7.
FIO0CLR1       8 (byte)                  0x3FFF C01D   0x00         Fast GPIO port 0 output clear register 1. Bit 0 in FIO0CLR1 corresponds to P0.8 ... bit 7 to P0.15.
FIO0CLR2       8 (byte)                  0x3FFF C01E   0x00         Fast GPIO port 0 output clear register 2. Bit 0 in FIO0CLR2 corresponds to P0.16 ... bit 7 to P0.23.
FIO0CLR3       8 (byte)                  0x3FFF C01F   0x00         Fast GPIO port 0 output clear register 3. Bit 0 in FIO0CLR3 corresponds to P0.24 ... bit 7 to P0.31.
FIO0CLRL       16 (half-word)            0x3FFF C01C   0x0000       Fast GPIO port 0 output clear lower half-word register. Bit 0 in FIO0CLRL corresponds to P0.0 ... bit 15 to P0.15.
FIO0CLRU       16 (half-word)            0x3FFF C01E   0x0000       Fast GPIO port 0 output clear upper half-word register. Bit 0 in FIO0CLRU corresponds to P0.16 ... bit 15 to P0.31.

Table 93. Fast GPIO port 0 output Clear byte and half-word accessible register description

Register name  Length (bits) and access  Address       Reset value  Description
FIO1CLR0       8 (byte)                  0x3FFF C03C   0x00         Fast GPIO port 1 output clear register 0. Bit 0 in FIO1CLR0 corresponds to P1.0 ... bit 7 to P1.7.
FIO1CLR1       8 (byte)                  0x3FFF C03D   0x00         Fast GPIO port 1 output clear register 1. Bit 0 in FIO1CLR1 corresponds to P1.8 ... bit 7 to P1.15.
FIO1CLR2       8 (byte)                  0x3FFF C03E   0x00         Fast GPIO port 1 output clear register 2. Bit 0 in FIO1CLR2 corresponds to P1.16 ... bit 7 to P1.23.
FIO1CLR3       8 (byte)                  0x3FFF C03F   0x00         Fast GPIO port 1 output clear register 3. Bit 0 in FIO1CLR3 corresponds to P1.24 ... bit 7 to P1.31.
FIO1CLRL       16 (half-word)            0x3FFF C03C   0x0000       Fast GPIO port 1 output clear lower half-word register. Bit 0 in FIO1CLRL corresponds to P1.0 ... bit 15 to P1.15.
FIO1CLRU       16 (half-word)            0x3FFF C03E   0x0000       Fast GPIO port 1 output clear upper half-word register. Bit 0 in FIO1CLRU corresponds to P1.16 ... bit 15 to P1.31.

Table 94. Fast GPIO port 1 output Clear byte and half-word accessible register description

Example 1: sequential accesses to IOSET and IOCLR affecting the same GPIO pin/bit

The state of a GPIO pin configured as an output is determined by writes to the port’s IOSET and IOCLR registers.
The last of these accesses to the IOSET/IOCLR registers determines the final output of the pin.
Consider the following code:

IO0DIR = 0x00000080; /* pin P0.7 configured as output */
IO0CLR = 0x00000080; /* P0.7 goes LOW */
IO0SET = 0x00000080; /* P0.7 goes HIGH */
IO0CLR = 0x00000080; /* P0.7 goes LOW */

P0.7 is configured as an output (write to the IO0DIR register). After this, the P0.7 output is set low (first write
to the IO0CLR register). A short high pulse follows on P0.7 (write to IO0SET), and the final write to the IO0CLR
register sets pin P0.7 back to low level.

Example 2: an immediate output of 0s and 1s on a GPIO port

A write to a port’s IOSET register followed by a write to its IOCLR register results in the pins outputting 0s
changing slightly later than the pins outputting 1s. Some systems can tolerate this delay of a valid output, but
some applications require simultaneous output of binary content (mixed 0s and 1s) within a group of pins on a
single GPIO port. This can be accomplished by writing to the port’s IOPIN register.

The following code preserves the existing output on PORT0 pins P0.[31:16] and P0.[7:0] and at the same time sets
P0.[15:8] to 0xA5, regardless of the previous value of pins P0.[15:8]:

IO0PIN = (IO0PIN & 0xFFFF00FF) | 0x0000A500;

The same outcome can be obtained using the fast port access.

Solution 1: using 32-bit (word) accessible fast GPIO registers

FIO0MASK = 0xFFFF00FF;
FIO0PIN = 0x0000A500;

Solution 2: using 16-bit (half-word) accessible fast GPIO registers

FIO0MASKL = 0x00FF;
FIO0PINL = 0xA500;

Solution 3: using 8-bit (byte) accessible fast GPIO registers

FIO0PIN1 = 0xA5;

Writing to IOSET/IOCLR vs. IOPIN

Writing to the IOSET/IOCLR register makes it easy to change selected output pins of a port to high/low level in a
single access. Only the pins whose bits are written with 1 in IOSET/IOCLR are set to high/low level, while those
written with 0 remain unaffected. However, by writing to either the IOSET or the IOCLR register alone it is not
possible to instantaneously output arbitrary binary data containing a mixture of 0s and 1s on a GPIO port.

Write to the IOPIN register enables instantaneous output of a desired content on the parallel GPIO. Binary data
written into the IOPIN register will affect all output configured pins of that parallel port: 0s in the IOPIN will
produce low level pin outputs and 1s in IOPIN will produce high level pin outputs. In order to change output of
only a group of port’s pins, application must logically AND readout from the IOPIN with mask containing 0s in
bits corresponding to pins that will be changed and 1s for all others. Finally, this result has to be logically ORed
with the desired content and stored back into the IOPIN register. Example 2 from above illustrates output of 0xA5
on PORT0 pins 15 to 8 while preserving all other PORT0 output pins as they were before.

Output signal frequency considerations when using the legacy and enhanced GPIO registers

The enhanced features of the fast GPIO ports available on this microcontroller make GPIO pins more responsive
to the code that controls them. In particular, software access to a GPIO pin is 3.5 times faster via the fast GPIO
registers than via the legacy set of registers. As a result, the maximum output frequency of a digital pin is also
increased 3.5 times. This increase is not always visible when plain C code is used, and the portion of an
application that handles the fast port output might have to be written in assembly code and executed in ARM
mode.

The following code, in which the pin control section is written in ARM assembly language, illustrates the
difference between the fast and slow GPIO port output capabilities.
Execution from the on-chip SRAM is independent of the MAM setup.

ldr r0, =0xe01fc1a0 /*register address--enable fast port*/
mov r1, #0x1
str r1, [r0] /*enable fast port0*/
ldr r1, =0xffffffff
ldr r0, =0x3fffc000 /*direction of fast port0*/
str r1, [r0]
ldr r0, =0xe0028018 /*direction of slow port 1*/
str r1, [r0]
ldr r0, =0x3fffc018 /*FIO0SET -- fast port0 register*/
ldr r1, =0x3fffc01c /*FIO0CLR -- fast port0 register*/
ldr r2, =0xC0010000 /*select fast port 0.16 for toggle*/
ldr r3, =0xE0028014 /*IO1SET -- slow port1 register*/
ldr r4, =0xE002801C /*IO1CLR -- slow port1 register*/
ldr r5, =0x00100000 /*select slow port 1.20 for toggle*/
/*Generate 2 pulses on the fast port*/
str r2, [r0]
str r2, [r1]
str r2, [r0]
str r2, [r1]
/*Generate 2 pulses on the slow port*/
str r5, [r3]
str r5, [r4]
str r5, [r3]
str r5, [r4]
loop: b loop

Serial Peripheral Interface (SPI):

The LPC2148 contains one SPI controller. The SPI is a full duplex serial interface, designed to handle
multiple masters and slaves connected to a given bus. Only a single master and a single slave can communicate on
the interface during a given data transfer. During a data transfer the master always sends a byte of data to the
slave, and the slave always sends a byte of data to the master.

Features

• Single complete and independent SPI controller.
• Compliant with Serial Peripheral Interface (SPI) specification.
• Synchronous, Serial, Full Duplex Communication.
• Combined SPI master and slave.
• Maximum data bit rate of one eighth of the input clock rate.
• 8 to 16 bits per transfer.

SPI data transfers

Figure 30 is a timing diagram that illustrates the four different data transfer formats that are available with the SPI.
This timing diagram illustrates a single 8 bit data transfer. The first thing you should notice in this timing diagram
is that it is divided into three horizontal parts. The first part describes the SCK and SSEL signals. The second part
describes the MOSI and MISO signals when the CPHA variable is 0. The third part describes the MOSI and
MISO signals when the CPHA variable is 1.

In the first part of the timing diagram, note two points. First, the SPI is illustrated with CPOL set to both 0 and 1.
The second point to note is the activation and de-activation of the SSEL signal. When CPHA = 0, the SSEL signal
will always go inactive between data transfers. This is not guaranteed when CPHA = 1 (the signal can remain
active).

Table 197. SPI data to clock phase relationship

CPOL  CPHA  First data driven                Other data driven  Data sampled
0     0     Prior to first SCK rising edge   SCK falling edge   SCK rising edge
0     1     First SCK rising edge            SCK rising edge    SCK falling edge
1     0     Prior to first SCK falling edge  SCK rising edge    SCK falling edge
1     1     First SCK falling edge           SCK falling edge   SCK rising edge

The data and clock phase relationships are summarized in Table 197. For each setting of CPOL and CPHA, the
table shows:

• When the first data bit is driven
• When all other data bits are driven
• When data is sampled

The definition of when an 8 bit transfer starts and stops depends on whether a device is a master or a slave,
and on the setting of the CPHA variable.

When a device is a master, the start of a transfer is indicated by the master having a byte of data that is ready to be
transmitted. At this point, the master can activate the clock, and begin the transfer. The transfer ends when the last
clock cycle of the transfer is complete.
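
The "Data sampled" column of Table 197 follows a simple rule: data is sampled on the rising edge of SCK when CPOL and CPHA are equal, and on the falling edge otherwise. The sketch below encodes just that observation. The function name is hypothetical and the code is a reading aid for the table, not part of any SPI driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper derived from Table 197: returns true if data is
 * sampled on the rising edge of SCK, false if on the falling edge. */
bool spi_samples_on_rising_edge(unsigned cpol, unsigned cpha)
{
    assert(cpol <= 1u && cpha <= 1u);
    return cpol == cpha;
}
```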

SPI Registers:

There are four registers that control the SPI peripheral.

SPI control register: The SPI control register contains a number of programmable bits used to control the
function of the SPI block. The settings for this register must be set up prior to a given data transfer taking place.

SPI status register: The SPI status register contains read only bits that are used to monitor the status of the SPI
interface, including normal functions, and exception conditions. The primary purpose of this register is to detect
completion of a data transfer. This is indicated by the SPIF bit. The remaining bits in the register are exception
condition indicators. These exceptions will be described later in this section.

SPI data register: The SPI data register is used to provide the transmit and receive data bytes. An internal shift
register in the SPI block logic is used for the actual transmission and reception of the serial data. Data is written to
the SPI data register for the transmit case. There is no buffer between the data register and the internal shift
register. A write to the data register goes directly into the internal shift register. Therefore, data should only be
written to this register when a transmit is not currently in progress. Read data is buffered. When a transfer is
complete, the receive data is transferred to a single byte data buffer, where it is later read. A read of the SPI data
register returns the value of the read data buffer.

SPI clock counter register: The SPI clock counter register controls the clock rate when the SPI block is in master
mode. This needs to be set prior to a transfer taking place, when the SPI block is a master. This register has no
function when the SPI block is a slave.

The I/Os for this implementation of SPI are standard CMOS I/Os. The open drain SPI option is not implemented
in this design. When a device is set up to be a slave, its I/Os are only active when it is selected by the SSEL signal
being active.

Master operation

The following sequence describes how one should process a data transfer with the SPI block when it is set up to be
the master. This process assumes that any prior data transfer has already completed.
1. Set the SPI clock counter register to the desired clock rate.
2. Set the SPI control register to the desired settings.
3. Write the data to be transmitted to the SPI data register. This write starts the SPI data transfer.
4. Wait for the SPIF bit in the SPI status register to be set to 1. The SPIF bit will be set after the last cycle of the
SPI data transfer.
5. Read the SPI status register.
6. Read the received data from the SPI data register (optional).
7. Go to step 3 if more data is required to transmit.
Note that a read or write of the SPI data register is required in order to clear the SPIF status bit. Therefore, if the
optional read of the SPI data register does not take place, a write to this register is required in order to clear the
SPIF status bit.
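
The master sequence above can be sketched in C as follows. This is a minimal illustration, not a drop-in driver: the spi_regs structure is a hypothetical stand-in for the memory-mapped S0SPCR, S0SPSR, S0SPDR and S0SPCCR registers so that the logic can also be exercised on a host, and the SPIF position (bit 7 of the status register) and the rule that the clock counter must be an even value of at least 8 are taken from the LPC214x user manual rather than from this section, so check them against your copy.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register view. On the LPC2148 these fields correspond to
 * the memory-mapped registers S0SPCR, S0SPSR, S0SPDR and S0SPCCR. */
typedef struct {
    volatile uint16_t spcr;   /* control register       */
    volatile uint8_t  spsr;   /* status register        */
    volatile uint8_t  spdr;   /* data register (8 bits) */
    volatile uint8_t  spccr;  /* clock counter register */
} spi_regs;

#define SPCR_MSTR (1u << 5)   /* master mode select                  */
#define SPSR_SPIF (1u << 7)   /* transfer complete flag (assumption) */

/* Steps 1 and 2: set the clock rate, then the control settings. */
void spi_master_init(spi_regs *spi, uint8_t clock_count)
{
    /* Assumption from the user manual: even value, at least 8. */
    assert(clock_count >= 8 && (clock_count % 2) == 0);
    spi->spccr = clock_count;
    spi->spcr  = SPCR_MSTR;   /* 8-bit, CPOL = 0, CPHA = 0, MSB first */
}

/* Steps 3 to 6: send one byte and return the byte received. */
uint8_t spi_master_transfer(spi_regs *spi, uint8_t out)
{
    spi->spdr = out;                         /* step 3: starts the transfer  */
    while ((spi->spsr & SPSR_SPIF) == 0u) {  /* steps 4 and 5: poll status   */
        /* wait for the last cycle of the transfer */
    }
    return spi->spdr;                        /* step 6: read clears SPIF     */
}
```

Calling spi_master_transfer() in a loop corresponds to step 7. Because the status register is read while polling and the data register is read afterwards, the SPIF bit is cleared before the next write.
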

Slave operation

The following sequence describes how one should process a data transfer with the SPI block when it is set up to be
a slave. This process assumes that any prior data transfer has already completed. It is required that the system
clock driving the SPI logic be at least 8X faster than the SPI.

1. Set the SPI control register to the desired settings.
2. Write the data to be transmitted to the SPI data register (optional). Note that this can only
be done when a slave SPI transfer is not in progress.
3. Wait for the SPIF bit in the SPI status register to be set to 1. The SPIF bit will be set
after the last sampling clock edge of the SPI data transfer.
4. Read the SPI status register.
5. Read the received data from the SPI data register (optional).
6. Go to step 2 if more data is required to transmit.

Note that a read or write of the SPI data register is required in order to clear the SPIF status bit. Therefore, at least
one of the optional reads or writes of the SPI data register must take place, in order to clear the SPIF status bit.

Pin description
The SPI interface uses four pins. The functions of these pins are as follows:

SCK0 (Serial Clock): This is an Input/output pin, used to synchronize the transfer of data across the SPI
interface. The clock is always driven by the master and received by the slave. It is programmable to be
active high or active low. The clock is only active during a data transfer. At any other time, it is either in its
inactive state, or tri-stated.

SSEL0 (Slave Select): This is an Input pin and is an active low signal that indicates which slave is currently
selected to participate in a data transfer. Each slave has its own unique slave select signal input. The SSEL must be
low before data transactions begin and normally stays low for the duration of the transaction. If the SSEL signal
goes high any time during a data transfer, the transfer is considered to be aborted. In this event, the slave returns to
idle, and any data that was received is thrown away. There are no other indications of this exception. This signal is
not directly driven by the master. It could be driven by a simple general purpose I/O under software control.

On the LPC2148 the SSEL0 pin can be used for a different function when the SPI0 interface is only used in
Master mode. For example, the pin hosting the SSEL0 function can be configured as a digital GPIO output and
used to select one of the SPI0 slaves.

MISO0 (Master In Slave Out): This is an Input/output pin. The MISO signal is a unidirectional signal used to
transfer serial data from the slave to the master. When a device is a slave, serial data is output on this signal. When
a device is a master, serial data is input on this signal. When a slave device is not selected, the slave drives the
signal high impedance.

MOSI0 (Master Out Slave In): This is an Input/output pin. The MOSI signal is a unidirectional signal used to
transfer serial data from the master to the slave. When a device is a master, serial data is output on this signal.
When a device is a slave, serial data is input on this signal.

Registers of SPI

The SPI contains 5 registers. All registers are byte, half word and word accessible. They are:

1. SPI Control Register (S0SPCR)
2. SPI Status Register (S0SPSR)
3. SPI Data Register (S0SPDR)
4. SPI Clock Counter Register (S0SPCCR)
5. SPI Interrupt Flag (S0SPINT)

SPI Control Register (S0SPCR):

This is a R/W register at address 0xE002 0000. It controls the operation of the SPI0 interface according to the
setting of its configuration bits. The bit layout is shown here, and the function of each bit is summarized in
the following table.

Bit:    15:12     11:8  7     6     5     4     3     2   1:0
Field:  Reserved  BITS  SPIE  LSBF  MSTR  CPOL  CPHA  BE  Reserved

BITS: control bits for data transfer (number of bits per transfer).

Bits 1 to 0 (Reserved): User software should not write ones to reserved bits. The value read from a reserved bit is
not defined.
Bit 2 (Bit Enable):
0 - The SPI controller sends and receives 8 bits of data per transfer
1 - The SPI controller sends and receives the number of bits selected
by bits 11:8
Bit 3 (CPHA): Clock phase control determines the relationship between the data and the clock on SPI transfers,
and controls when a slave transfer is defined as starting and ending.
0 - Data is sampled on the first clock edge of SCK. A transfer starts and ends with
activation and deactivation of the SSEL signal.
1 - Data is sampled on the second clock edge of the SCK. A transfer starts with
the first clock edge, and ends with the last sampling edge when the SSEL
signal is active.
Bit 4 (CPOL): Clock polarity control.
0 - SCK is active high.
1 - SCK is active low.
Bit 5 (MSTR): Master mode select.
0 - The SPI operates in Slave mode
1 - The SPI operates in Master mode
Bit 6 (LSBF): LSB First controls which direction each byte is shifted
when transferred.
0 - SPI data is transferred MSB (bit 7) first.
1 - SPI data is transferred LSB (bit 0) first
Bit 7 (SPIE): Serial peripheral interrupt enable.
0 - SPI interrupts are inhibited.
1 - A hardware interrupt is generated each time the SPIF or
WCOL bits are activated.
Bits 11 to 8 (BITS): When bit 2 of SPI Control register is 1, this field controls the
number of bits per transfer:
1000 - 8 bits per transfer
1001 - 9 bits per transfer
1010 - 10 bits per transfer
1011 - 11 bits per transfer
1100 - 12 bits per transfer
1101 - 13 bits per transfer
1110 - 14 bits per transfer
1111 - 15 bits per transfer
0000 - 16 bits per transfer

Bits 15 to 12 (Reserved): User software should not write ones to reserved bits. The value read from a
reserved bit is not defined.

Bit    Symbol      Value  Description
1:0    -                  Reserved, user software should not write ones to reserved bits. The value read
                          from a reserved bit is not defined.
2      Bit Enable  0      The SPI controller sends and receives 8 bits of data per transfer.
                   1      The SPI controller sends and receives the number of bits selected by bits 11:8.
3      CPHA               Clock phase control determines the relationship between the data and the
                          clock on SPI transfers, and controls when a slave transfer is defined as
                          starting and ending.
                   0      Data is sampled on the first clock edge of SCK. A transfer starts and ends
                          with activation and deactivation of the SSEL signal.
                   1      Data is sampled on the second clock edge of the SCK. A transfer starts with
                          the first clock edge, and ends with the last sampling edge when the SSEL
                          signal is active.
4      CPOL               Clock polarity control.
                   0      SCK is active high.
                   1      SCK is active low.
5      MSTR               Master mode select.
                   0      The SPI operates in Slave mode.
                   1      The SPI operates in Master mode.
6      LSBF               LSB First controls which direction each byte is shifted when transferred.
                   0      SPI data is transferred MSB (bit 7) first.
                   1      SPI data is transferred LSB (bit 0) first.
7      SPIE               Serial peripheral interrupt enable.
                   0      SPI interrupts are inhibited.
                   1      A hardware interrupt is generated each time the SPIF or WCOL bits are
                          activated.
11:8   BITS               When bit 2 of this register is 1, this field controls the number of bits
                          per transfer:
                   1000   8 bits per transfer
                   1001   9 bits per transfer
                   1010   10 bits per transfer
                   1011   11 bits per transfer
                   1100   12 bits per transfer
                   1101   13 bits per transfer
                   1110   14 bits per transfer
                   1111   15 bits per transfer
                   0000   16 bits per transfer
15:12  -                  Reserved, user software should not write ones to reserved bits. The value read
                          from a reserved bit is not defined.
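
To make the bit assignments above concrete, the following sketch builds a value for S0SPCR from a set of named options. It only computes the number; writing it to the register is left to the caller. The spi_config structure and the spcr_value() function are hypothetical names introduced for illustration. The sketch leaves Bit Enable at 0 for plain 8-bit transfers and sets it, together with the BITS field, for any other length, which is one possible choice rather than a requirement.

```c
#include <assert.h>
#include <stdint.h>

#define SPCR_BE         (1u << 2)  /* Bit Enable                 */
#define SPCR_CPHA       (1u << 3)  /* clock phase                */
#define SPCR_CPOL       (1u << 4)  /* clock polarity             */
#define SPCR_MSTR       (1u << 5)  /* master mode select         */
#define SPCR_LSBF       (1u << 6)  /* LSB first                  */
#define SPCR_SPIE       (1u << 7)  /* interrupt enable           */
#define SPCR_BITS_SHIFT 8          /* BITS field occupies 11:8   */

/* Hypothetical configuration record; each flag is 0 or 1. */
typedef struct {
    unsigned bits_per_transfer;    /* 8 to 16 */
    int cpha;
    int cpol;
    int master;
    int lsb_first;
    int irq_enable;
} spi_config;

/* Compute the S0SPCR value for the given options. */
uint16_t spcr_value(const spi_config *cfg)
{
    uint16_t v = 0;

    assert(cfg->bits_per_transfer >= 8 && cfg->bits_per_transfer <= 16);
    if (cfg->bits_per_transfer != 8) {
        /* 9..15 map to 1001..1111 and 16 maps to 0000, i.e. n modulo 16. */
        v |= SPCR_BE;
        v |= (uint16_t)((cfg->bits_per_transfer & 0xFu) << SPCR_BITS_SHIFT);
    }
    if (cfg->cpha)       v |= SPCR_CPHA;
    if (cfg->cpol)       v |= SPCR_CPOL;
    if (cfg->master)     v |= SPCR_MSTR;
    if (cfg->lsb_first)  v |= SPCR_LSBF;
    if (cfg->irq_enable) v |= SPCR_SPIE;
    return v;
}
```

A master using 8-bit transfers with CPOL = 0 and CPHA = 0 yields 0x0020 (only MSTR set). A master using 12-bit transfers yields 0x0C24 (BITS = 1100, MSTR and Bit Enable set). The reserved bits are never written with ones.
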

SPI Status Register (S0SPSR):

This register shows the status of the SPI0 interface. It is a read-only, 8-bit register at address 0xE002 0004.
The SPIF bit indicates completion of a data transfer, and the remaining bits are exception condition indicators.
The complete SPI register map is given in the following table.

Table 199. SPI register map


Name     Description                                       Access  Reset value  Address
S0SPCR   SPI Control Register.                             R/W     0x00         0xE002 0000
S0SPSR   SPI Status Register.                              RO      0x00         0xE002 0004
S0SPDR   SPI Data Register. This bi-directional register   R/W     0x00         0xE002 0008
         provides the transmit and receive data for the
         SPI. Transmit data is provided to the SPI0 by
         writing to this register. Data received by the
         SPI0 can be read from this register.
S0SPCCR  SPI Clock Counter Register. This register         R/W     0x00         0xE002 000C
         controls the frequency of a master’s SCK0.
S0SPINT  SPI Interrupt Flag. This register contains the    R/W     0x00         0xE002 001C
         interrupt flag for the SPI interface.

General purpose timers/external event counters

The LPC2148 has two general purpose Timer/Counters, Timer0 and Timer1, and both are 32 bits wide. The exact
number of timers will vary depending on the variant, but there are at least two. All of the general purpose
timers are identical in structure but vary slightly in the number of features supported. The timers are based
around a 32-bit timer counter with a 32-bit prescaler. The clock source for all of the timers is the peripheral
clock of the VPB (VLSI Peripheral Bus), PCLK.

The Timer/Counter is designed to count cycles of the peripheral clock (PCLK) or an externally supplied clock and
optionally generate interrupts or perform other actions at specified timer values, based on four match registers. It
also includes four capture inputs to trap the timer value when an input signal transitions, optionally generating an
interrupt. Multiple pins can be selected to perform a single capture or match function, providing an application
with ‘or’ and ‘and’, as well as ‘broadcast’ functions among them. The LPC2148 can count external events on one
of the capture inputs if the minimum external pulse is equal to or longer than a period of the PCLK. In this
configuration, unused capture lines can be selected as regular timer capture inputs, or used as external interrupts.

Features
• A 32-bit timer/counter with a programmable 32-bit prescaler.
• External event counter or timer operation.
• Four 32-bit capture channels per timer/counter that can take a snapshot of the timer value when an input
signal transitions. A capture event may also optionally generate an interrupt.
• Four 32-bit match registers that allow:
– Continuous operation with optional interrupt generation on match.
– Stop timer on match with optional interrupt generation.
– Reset timer on match with optional interrupt generation.
• Four external outputs per timer/counter corresponding to match registers, with the following
capabilities:
– Set LOW on match.
– Set HIGH on match.
– Toggle on match.
– Do nothing on match.

Applications
• Interval Timer for counting internal events.
• Pulse Width Demodulator via Capture inputs.
• Free running timer.

Pin description
Timer/Counter pins are categorized in two groups as Capture Signals and External Match Output. Functions of
these pins are as follows.
CAP0.3 to 0.0 and CAP1.3 to 1.0: Input Capture Signals- A transition on a capture pin can be configured to
load one of the Capture Registers with the value in the Timer Counter and optionally generate an interrupt.
Capture functionality can be selected from a number of pins. When more than one pin is selected for a Capture
input on a single TIMER0/1 channel, the pin with the lowest Port number is used. If for example pins 30 (P0.6)
and 46 (P0.16) are selected for CAP0.2, only pin 30 will be used by TIMER0 to perform CAP0.2 function.
Here is the list of all CAPTURE signals, together with the pins on which they can be selected:
• CAP0.0 (3 pins): These pins are connected to P0.2, P0.22 and P0.30
• CAP0.1 (1 pin): This pin is connected to P0.4
• CAP0.2 (3 pins): These pins are connected to P0.6, P0.16 and P0.28
• CAP0.3 (1 pin): This pin is connected to P0.29
• CAP1.0 (1 pin): This pin is connected to P0.10
• CAP1.1 (1 pin): This pin is connected to P0.11
• CAP1.2 (2 pins): These pins are connected to P0.17 and P0.19
• CAP1.3 (2 pins): These pins are connected to P0.18 and P0.21

Timer/Counter block can select a capture signal as a clock source instead of the PCLK derived clock.
MAT0.3 to 0.0 and MAT1.3 to 1.0: External Match Outputs of Timer 0/1 - When a match register (MR3:0)
equals the timer counter (TC), the corresponding output can either toggle, go low, go high, or do nothing. The
External Match Register (EMR) controls the functionality of this output. Match Output functionality can be
selected on a number of pins in parallel. It is possible, for example, to have 2 pins selected at the same time so
that they provide the MAT1.3 function in parallel.
Here is the list of all MATCH signals, together with the pins on which they can be selected:
• MAT0.0 (2 pins): These pins are connected to P0.3 and P0.22
• MAT0.1 (1 pin): This pin is connected to P0.5
• MAT0.2 (2 pins): These pins are connected to P0.16 and P0.28
• MAT0.3 (1 pin): This pin is connected to P0.29
• MAT1.0 (1 pin): This pin is connected to P0.12
• MAT1.1 (1 pin): This pin is connected to P0.13
• MAT1.2 (2 pins): These pins are connected to P0.17 and P0.19
• MAT1.3 (2 pins): These pins are connected to P0.18 and P0.20

Timer/Counter Registers: Each Timer/Counter contains the registers described below.
Interrupt Register (IR): This is a read/write register. The IR can be written to clear interrupts and read
to identify which of the eight possible interrupt sources are pending. The Interrupt Register consists of four bits
for the match interrupts and four bits for the capture interrupts. If an interrupt is generated, the corresponding
bit in the IR is high; otherwise the bit is low. Writing a logic one to an IR bit resets the corresponding
interrupt. Writing a zero has no effect.
Timer Control Register (TCR): This is a read/write register. The TCR is used to control the Timer Counter
functions. The Timer Counter can be disabled or reset through the TCR.
Timer Counter (TC): This is a read/write register. The 32-bit TC is incremented every PR+1 cycles of PCLK,
that is, each time the Prescale Counter reaches its terminal count. The TC is controlled through the TCR.
Unless it is reset before reaching its upper limit, the TC will count up through the value
0xFFFF FFFF and then wrap back to the value 0x0000 0000. This event does not cause an interrupt, but a Match
register can be used to detect an overflow if needed.
Prescale Register (PR): This is a read/write register. The 32-bit Prescale Register specifies the maximum
value for the Prescale Counter. When the Prescale Counter (below) is equal to this value, the next clock
increments the TC and clears the PC.
Prescale Counter (PC): This is a read/write register. The 32-bit PC is a counter which is incremented to the
value stored in PR. When the value in PR is reached, the TC is incremented and the PC is cleared. The PC is
observable and controllable through the bus interface. The Prescale Counter controls division of PCLK by some
constant value before it is applied to the Timer Counter. This allows control of the relationship of the resolution of
the timer versus the maximum time before the timer overflows. The Prescale Counter is incremented on every
PCLK. When it reaches the value stored in the Prescale Register, the Timer Counter is incremented and the
Prescale Counter is reset on the next PCLK. This causes the TC to increment on every PCLK when PR = 0, every
2 PCLKs when PR = 1, etc.
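The PR+1 relationship described for the Prescale Register and Prescale Counter can be turned into a small calculation: given PCLK and a desired tick rate for the TC, the prescale value is PCLK divided by the tick rate, minus one. The sketch below also estimates how long the 32-bit TC runs before it wraps. The function names and the 15 MHz PCLK used in the usage note are assumptions chosen for illustration; the actual PCLK depends on the PLL and VPB divider settings of the system.

```c
#include <assert.h>
#include <stdint.h>

/* PR value that makes the TC advance at tick_hz, given that the TC is
 * incremented once every PR + 1 cycles of PCLK. Any remainder of the
 * division is dropped, so the tick rate is exact only when tick_hz
 * divides pclk_hz. */
uint32_t timer_prescale(uint32_t pclk_hz, uint32_t tick_hz)
{
    assert(tick_hz > 0 && tick_hz <= pclk_hz);
    return pclk_hz / tick_hz - 1u;
}

/* Seconds until the 32-bit TC wraps from 0xFFFF FFFF back to 0,
 * starting from 0: 2^32 ticks of (PR + 1) PCLK cycles each. */
double timer_overflow_seconds(uint32_t pclk_hz, uint32_t pr)
{
    assert(pclk_hz > 0);
    return 4294967296.0 * ((double)pr + 1.0) / (double)pclk_hz;
}
```

With an assumed PCLK of 15 MHz and a desired 1 MHz tick (1 microsecond resolution), timer_prescale() gives PR = 14, and the TC then wraps after roughly 4295 seconds. With PR = 0 the resolution is one PCLK period but the wrap time drops to about 286 seconds, which is the resolution versus range trade-off mentioned above.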
Match Control Register (MCR): This is a read/write register. The MCR controls what operations are performed
when one of the Match Registers matches the Timer Counter, in particular whether an interrupt is generated and
whether the TC is reset.
Match Registers (MR0 to MR3): These are read/write registers. Each of MR0 to MR3 can be enabled through the
MCR to reset the TC, stop both the TC and PC, and/or generate an interrupt every time it matches the TC. The
Match register values are continuously compared to the Timer Counter value. When the two values are equal,
actions can be triggered automatically. The action possibilities are to generate an interrupt, reset the Timer
Counter, or stop the timer. Actions are controlled by the settings in the MCR register.
Capture Control Register (CCR): This is a read/write register. The CCR controls which edges of the capture
inputs are used to load the Capture Registers and whether or not an interrupt is generated when a capture takes
place. Setting both the rising and falling bits at the same time is a valid configuration, resulting in a capture
event for both edges. In the descriptions below, "n" represents the Timer number, 0 or 1.

Capture Registers (CR0 to CR3): These are read-only registers. CR0 to CR3 are loaded with the value of TC
when there is an event on the corresponding capture input, CAPn.0 to CAPn.3 (CAP0.0 to CAP0.3 or CAP1.0 to
CAP1.3 respectively).
Each Capture register is associated with a device pin and may be loaded with the Timer Counter (TC) value when
a specified event occurs on that pin. The settings in the Capture Control Registers determine whether the capture
function is enabled, and whether a capture event happens on the rising edge of the associated pin, the falling edge,
or on both edges.
External Match Register (EMR): This is a read/write register. The EMR provides both control and status of
the external match pins MATn.0-3 (MAT0.0-3 and MAT1.0-3 respectively).
Count Control Register (CTCR): This is a read/write register. The CTCR selects between Timer and Counter
mode, and in Counter mode selects the signal and edge(s) for Counting. When Counter Mode is chosen as a mode
of operation, the CAP input (selected by the CTCR bits 3:2) is sampled on every rising edge of the PCLK clock.
After comparing two consecutive samples of this CAP input, one of the following four events is recognized: rising
edge, falling edge, either of edges or no changes in the level of the selected CAP input. Only if the identified event
corresponds to the one selected by bits 1:0 in the CTCR register, the Timer Counter register will be incremented.
Effective processing of the externally supplied clock to the counter has some limitations.
Since two successive rising edges of the PCLK clock are used to identify only one edge on the CAP selected
input, the frequency of the CAP input cannot exceed one fourth of the PCLK clock. Consequently, duration of the
high/

Pulse width modulator:


The PWM is based on the standard timer block and inherits all of its features, although only the PWM function is
pinned out on the LPC2148. The timer is designed to count cycles of the peripheral clock (PCLK) and optionally
generate interrupts or perform other actions when specified timer values occur, based on seven match registers.
The PWM function is also based on match register events. The ability to separately control rising and falling edge
locations allows the PWM to be used for more applications. For instance, multi-phase motor control typically
requires three non-overlapping PWM outputs with individual control of all three pulse widths and positions.
Two match registers can be used to provide a single edge controlled PWM output. One match register (MR0)
controls the PWM cycle rate, by resetting the count upon match. The other match register controls the PWM edge
position. Additional single edge controlled PWM outputs require only one match register each, since the
repetition rate is the same for all PWM outputs. Multiple single edge controlled PWM outputs will all have a
rising edge at the beginning of each PWM cycle, when an MR0 match occurs. Three match registers can be used
to provide a PWM output with both edges controlled. Again, the MR0 match register controls the PWM cycle
rate. The other match registers control the two PWM edge positions. Additional double edge controlled PWM
outputs require only two match registers each, since the repetition rate is the same for all PWM outputs. With
double edge controlled PWM outputs, specific match registers control the rising and falling edge of the output.
This allows both positive going PWM pulses (when the rising edge occurs prior to the falling edge), and negative
going PWM pulses (when the falling edge occurs prior to the rising edge).
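
The match register budget described above can be checked with a short calculation: one register (MR0) sets the cycle rate, each single edge controlled output needs one more, and each double edge controlled output needs two more. The sketch below follows only that counting rule. The function names are illustrative, and on real hardware each PWM output is tied to specific match registers, so passing this check is a necessary condition rather than a guarantee that a given pin assignment works.

```c
#include <assert.h>

#define PWM_MATCH_REGISTERS 7u  /* seven match registers, MR0 to MR6 */
#define PWM_OUTPUTS         6u  /* output pins PWM1 to PWM6          */

/* Match registers needed: MR0 for the cycle rate, one per single edge
 * output, two per double edge output. */
unsigned pwm_match_registers_needed(unsigned single_edge, unsigned double_edge)
{
    assert(single_edge + double_edge <= PWM_OUTPUTS);
    return 1u + single_edge + 2u * double_edge;
}

/* Returns 1 when the requested mix fits in the seven match registers. */
int pwm_config_fits(unsigned single_edge, unsigned double_edge)
{
    return pwm_match_registers_needed(single_edge, double_edge)
           <= PWM_MATCH_REGISTERS;
}
```

Six single edge outputs use all seven registers, as do three double edge outputs, which matches the three-phase motor control case mentioned in the text. A mix of two single edge and two double edge outputs also fits, while one single edge plus three double edge outputs would need eight registers and does not.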

Pin description:
The LPC2148 has six pins, PWM1 to PWM6, for pulse width modulation output. The functions of these pins are
as follows:
PWM1: This is an Output pin from PWM channel 1.
PWM2: This is an Output pin from PWM channel 2.
PWM3: This is an Output pin from PWM channel 3.
PWM4: This is an Output pin from PWM channel 4.
PWM5: This is an Output pin from PWM channel 5.
PWM6: This is an Output pin from PWM channel 6.

PWM Registers:
PWM Interrupt Register (PWMIR): This is a read/write register. The PWMIR can be written to clear interrupts
and read to identify which of the possible interrupt sources are pending. The PWM Interrupt
Register consists of eleven bits (see Appendix Table 249), seven for the match interrupts and four reserved for
future use. If an interrupt is generated, the corresponding bit in the PWMIR is high; otherwise the bit
is low. Writing a logic one to a PWMIR bit resets the corresponding interrupt. Writing a zero has no effect.
PWM Timer Control Register (PWMTCR): This is a read/write register. The PWMTCR is used to control the
operation of the PWM Timer Counter, which can be disabled or reset through it. For the detailed function of
each bit, see Appendix Table 250.
PWM Timer Counter (PWMTC): This is a read/write register. The 32-bit PWMTC is incremented every PWMPR+1
cycles of PCLK, that is, each time the PWM Prescale Counter reaches its terminal count. The PWMTC is controlled
through the PWMTCR. Unless it is reset before reaching its upper
limit, the PWMTC will count up through the value 0xFFFF FFFF and then wrap back to the value 0x0000 0000.
This event does not cause an interrupt, but a Match register can be used to detect an overflow if needed.
PWM Prescale Register (PWMPR): This is a read/write register. The PWMTC is incremented every
PWMPR+1 cycles of PCLK. The 32-bit PWM Prescale Register specifies the maximum value for the PWM
Prescale Counter.
PWM Prescale Counter (PWMPC): This is a read/write register. The 32-bit PWMPC is a counter which is
incremented to the value stored in PWMPR. When the value in PWMPR is reached, the PWMTC is incremented. The
PWMPC is observable and controllable through the bus interface. The PWM Prescale Counter controls division of
PCLK by some constant value before it is applied to the PWM Timer Counter. This allows control of the
relationship of the resolution of the timer versus the maximum time before the timer overflows. The PWM
Prescale Counter is incremented on every PCLK. When it reaches the value stored in the PWM Prescale Register,
the PWM Timer Counter is incremented and the PWM Prescale Counter is reset on the next PCLK. This causes
the PWMTC to increment on every PCLK when PWMPR = 0, every 2 PCLKs when PWMPR = 1, etc.
PWM Match Control Register (PWMMCR): This is a read/write register. The PWMMCR controls what operations
are performed when one of the PWM Match Registers matches the PWM Timer Counter, in particular whether an
interrupt is generated and whether the PWMTC is reset. For the detailed function of each bit, see Appendix
Table 251.
PWM Match Registers (PWMMR0 to PWMMR6): These are read/write registers. Each of PWMMR0 to PWMMR6 can
be enabled through PWMMCR to reset the PWMTC, stop both the PWMTC and PWMPC, and/or generate an
interrupt when it matches the PWMTC. In addition, a match between PWMMR0 and the PWMTC sets all PWM
outputs that are in single-edge mode, and sets PWM1 if it is in double-edge mode. The 32-bit PWM Match
register values are continuously compared to the PWM Timer Counter value. When the two values are equal,
actions can be triggered automatically. The action possibilities are to generate an interrupt, reset the PWM
Timer Counter, or stop the timer. Actions are controlled by the settings in the PWMMCR register.
PWM Control Register (PWMPCR): This is a read/write register. The PWM Control Register enables the PWM
outputs and selects the type of each PWM channel as either single edge or double edge controlled. For the
detailed function of each bit, see Appendix Table 252.
PWM Latch Enable Register (PWMLER): This is a read/write register. It enables the use of new PWM match values.
The PWM Latch Enable Register is used to control the update of the PWM Match registers when they are used for
PWM generation. When software writes to the location of a PWM Match register while the Timer is in PWM
mode, the value is held in a shadow register. When a PWM Match 0 event occurs (normally also resetting the
timer in PWM mode), the contents of shadow registers will be transferred to the actual Match registers if the
corresponding bit in the Latch Enable Register has been set. At that point, the new values will take effect and
determine the course of the next PWM cycle. Once the transfer of new values has taken place, all bits of the LER
are automatically cleared. Until the corresponding bit in the PWMLER is set and a PWM Match 0 event occurs,
any value written to the PWM Match registers has no effect on PWM operation.
For example, if PWM2 is configured for double edge operation and is currently running, a
typical sequence of events for changing the timing would be:
• Write a new value to the PWM Match1 register.
• Write a new value to the PWM Match2 register.
• Write to the PWMLER, setting bits 1 and 2 at the same time.
• The altered values will become effective at the next reset of the timer (when a PWM
Match 0 event occurs).
The order of writing the two PWM Match registers is not important, since neither value will be used until after the
write to PWMLER. This ensures that both values go into effect at the same time, if that is required. A single value
may be altered in the same way if needed. For the detailed function of each bit, see Appendix Table 253.
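
The shadow register and latch behaviour described above can be modelled in a few lines of C. The sketch below is a host-side model of the mechanism, not register-level driver code: pwm_model and its functions are hypothetical names, and the model covers only the PWM mode case in which writes are held in shadow registers.

```c
#include <assert.h>
#include <stdint.h>

#define PWM_MATCH_COUNT 7u  /* PWM Match registers 0 to 6 */

/* Host-side model of the match registers, their shadows and the LER. */
typedef struct {
    uint32_t match[PWM_MATCH_COUNT];   /* values currently in effect       */
    uint32_t shadow[PWM_MATCH_COUNT];  /* values last written by software  */
    uint8_t  ler;                      /* latch enable bits, bit n for MRn */
} pwm_model;

/* A software write to a Match register only reaches the shadow. */
void pwm_write_match(pwm_model *p, unsigned n, uint32_t value)
{
    assert(n < PWM_MATCH_COUNT);
    p->shadow[n] = value;
}

/* A software write to the Latch Enable Register sets the given bits. */
void pwm_set_latch(pwm_model *p, uint8_t bits)
{
    p->ler |= bits;
}

/* PWM Match 0 event: copy every latched shadow, then clear the LER. */
void pwm_match0_event(pwm_model *p)
{
    for (unsigned n = 0; n < PWM_MATCH_COUNT; n++) {
        if (p->ler & (1u << n)) {
            p->match[n] = p->shadow[n];
        }
    }
    p->ler = 0;
}
```

Following the PWM2 example: after writing new values to Match1 and Match2, a Match 0 event leaves the active values unchanged because no latch bit is set. Once bits 1 and 2 of the LER are set, the next Match 0 event makes both new values active together and clears the LER.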

The Real Time Clock (RTC):


The Real Time Clock (RTC) is a set of counters for measuring time when system power is on, and optionally
when it is off. It uses little power in Power-down mode. On the LPC2148, the RTC can be clocked by a separate
32.768 kHz oscillator or by a programmable prescale divider based on the APB clock. Also, the RTC is powered
by its own power supply pin, VBAT, which can be connected to a battery or to the same 3.3 V supply used by the
rest of the device.
Features
• Measures the passage of time to maintain a calendar and clock.
• Ultra Low Power design to support battery powered systems.
• Provides Seconds, Minutes, Hours, Day of Month, Month, Year, Day of Week, and Day of Year.
• Dedicated 32 kHz oscillator or programmable prescaler from APB clock.
• Dedicated power supply pin can be connected to a battery or to the main 3.3 V.
Block diagram of RTC

[Figure: RTC block diagram. Blocks shown: RTC oscillator, reference clock divider (prescaler), MUX, clock
generator, time counters, comparators, alarm registers, counter increment interrupt enable, alarm mask
register, interrupt generator.]

RTC interrupts:
Interrupt generation is controlled through the Interrupt Location Register (ILR), Counter Increment
Interrupt Register (CIIR), the alarm registers, and the Alarm Mask Register (AMR). Interrupts are generated only
by the transition into the interrupt state. The ILR separately enables CIIR and AMR interrupts. Each bit in CIIR
corresponds to one of the time counters. If CIIR is enabled for a particular counter, then an interrupt is
generated every time that counter is incremented. The alarm registers allow the user to specify a date and time for an
interrupt to be generated. The AMR provides a mechanism to mask alarm compares. If all non-masked alarm
registers match the value in their corresponding time counter, then an interrupt is generated. The RTC interrupt
can bring the microcontroller out of power-down mode if the RTC is operating from its own oscillator on the
RTCX1-2 pins. When the RTC interrupt is enabled for wakeup and its selected event occurs, the wakeup cycle of
the oscillator associated with the XTAL1/2 pins is started. For details on the RTC based wakeup process

Registers of RTC:
The RTC includes a number of registers. The address space is split into four sections by functionality.
The first eight addresses are the Miscellaneous Register Group. The second set of eight locations are the Time
Counter Group. The third set of eight locations contain the Alarm Register Group. The remaining registers control
the Reference Clock Divider. The complete list of RTC registers is given in Appendix Table 260. Detailed
descriptions of the registers follow.

Miscellaneous register group:

Interrupt Location Register (ILR):

The Interrupt Location Register is a 2-bit register that specifies which blocks are generating an interrupt (see
Appendix Table 262). Writing a one to the appropriate bit clears the corresponding interrupt. Writing a zero has
no effect. This allows the programmer to read this register and write back the same value to clear only the
interrupt that is detected by the read.
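
The read-then-write-back idiom works because a one clears a flag and a zero leaves it alone, so a flag that becomes set after the read is not lost. The sketch below models this on a host. The rtc_model type and function names are hypothetical, bit 0 as the counter increment flag follows the ILR[0] reference in this chapter, and bit 1 as the alarm flag is an assumption to be checked against Appendix Table 262.

```c
#include <assert.h>
#include <stdint.h>

#define ILR_RTCCIF (1u << 0)  /* counter increment interrupt flag    */
#define ILR_RTCALF (1u << 1)  /* alarm interrupt flag (assumed bit)  */

/* Host-side model of the 2-bit Interrupt Location Register. */
typedef struct {
    uint8_t ilr;
} rtc_model;

/* Model of a software write to the ILR: ones clear, zeros do nothing. */
void rtc_write_ilr(rtc_model *rtc, uint8_t value)
{
    assert((value & ~(ILR_RTCCIF | ILR_RTCALF)) == 0);
    rtc->ilr &= (uint8_t)~value;
}

/* Read the pending flags and write the same value back, so that only
 * the interrupts seen by the read are cleared. */
uint8_t rtc_acknowledge(rtc_model *rtc)
{
    uint8_t pending = rtc->ilr;
    rtc_write_ilr(rtc, pending);
    return pending;
}
```

If the alarm flag becomes set between the read and the write-back of a pending counter increment flag, the write-back clears only the counter increment flag and the alarm flag stays pending for the next pass of the handler.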

Clock Tick Counter (CTC) :


The Clock Tick Counter is read only. It can be reset to zero through the Clock Control Register (CCR). The CTC
consists of the bits of the clock divider counter. If the RTC is driven by the external 32.768 kHz oscillator,
subsequent read operations of the CTC may yield an incorrect result. The CTC is implemented as a 15-bit
ripple counter, so not all 15 bits change simultaneously. The LSB changes first, then the next, and so forth.
Since the 32.768 kHz oscillator is asynchronous to the CPU clock, it is possible for a CTC read to occur during the
time when the CTC bits are changing, resulting in an incorrect large difference between back-to-back reads. If
the RTC is driven by the PCLK, the CPU and the RTC are synchronous because both of their clocks are driven
from the PLL output. Therefore, incorrect consecutive reads cannot occur.

Clock Control Register (CCR) R/W


The Clock Control Register is a 5-bit register that controls the operation of the clock divide circuit. Each bit of the
register is described in Appendix Table 264.

Counter Increment Interrupt Register (CIIR) R/W


The Counter Increment Interrupt Register (CIIR) gives the ability to generate an interrupt every time a counter is
incremented. This interrupt remains valid until cleared by writing a one to bit zero of the Interrupt Location
Register (ILR[0]).

Alarm Mask Register (AMR) R/W


The Alarm Mask Register (AMR) allows the user to mask any of the alarm registers. Appendix Table 266
shows the relationship between the bits in the AMR and the alarms. For the alarm function, every non-masked
alarm register must match the corresponding time counter for an interrupt to be generated. The interrupt is
generated only when the counter comparison first changes from no match to match. The interrupt is removed
when a one is written to the appropriate bit of the Interrupt Location Register (ILR). If all mask bits are
set, then the alarm is disabled.

Consolidated Time Registers (CTIME0 to CTIME2): RO


The values of the Time Counters can optionally be read in a consolidated format which allows the programmer to
read all time counters with only three read operations. The various registers are packed into 32-bit values (See
Appendix Table 267, Table 268, and Table 269). The least significant bit of each register is read back at bit 0, 8,
16, or 24. The Consolidated Time Registers are read only. To write new values to the Time Counters, the Time
Counter addresses should be used.

Time counter group:


The time value consists of eight counters (see Appendix Table 270 and Table 271). These counters can be
read or written at the locations shown in Appendix Table 271. The size of each counter in bits is given in parentheses.
1. Seconds Counter (SEC, 6 bits): Seconds value in the range of 0 to 59
2. Minutes Register (MIN, 6 bits): Minutes value in the range of 0 to 59
3. Hours Register (HOUR, 5 bits): Hours value in the range of 0 to 23
4. Day of Month Register (DOM, 5 bits): Day of month value in the range of 1 to 28, 29, 30,
or 31 (depending on the month and whether it is a leap year)
5. Day of Week Register (DOW, 3 bits): Day of week value in the range of 0 to 6
6. Day of Year Register (DOY, 9 bits): Day of year value in the range of 1 to 365 (366 for leap years)
7. Months Register (MONTH, 4 bits): Month value in the range of 1 to 12
8. Years Register (YEAR, 12 bits): Year value in the range of 0 to 4095

Alarm register group:


The alarm registers are shown in Appendix Table 272. The values in these registers are compared with the
time counters. If all the unmasked alarm registers match their corresponding time counters then an interrupt is
generated. The interrupt is cleared when a one is written to bit one of the Interrupt Location Register (ILR[1]).
Each alarm register is read/write and has the same size as the time counter it is compared with.
1. ALSEC (R/W, 6 bits): Alarm value for Seconds.
2. ALMIN (R/W, 6 bits): Alarm value for Minutes.
3. ALHOUR (R/W, 5 bits): Alarm value for Hours.
4. ALDOM (R/W, 5 bits): Alarm value for Day of Month.
5. ALDOW (R/W, 3 bits): Alarm value for Day of Week.
6. ALDOY (R/W, 9 bits): Alarm value for Day of Year.
7. ALMON (R/W, 4 bits): Alarm value for Months.
8. ALYEAR (R/W, 12 bits): Alarm value for Year.

Reference clock divider (prescaler):


The reference clock divider ( prescaler) allows generation of a 32.768 kHz reference clock from any peripheral
clock frequency greater than or equal to 65.536 kHz (2 × 32.768 kHz). This permits the RTC to always run at the
proper rate regardless of the peripheral clock rate. Basically, the Prescaler divides the peripheral clock (PCLK) by
a value which contains both an integer portion and a fractional portion. The result is not a continuous output at a
constant frequency: some clock periods will be one PCLK longer than others. However, the overall result can
always be 32,768 counts per second. The reference clock divider consists of a 13-bit integer counter and a 15-bit
fractional counter. The reasons for these counter sizes are as follows:
Prescaler value, integer portion (PREINT) counter:
This is a read/write 13-bit counter. For the frequencies expected to be supported by the LPC2148, a 13-bit
integer counter is required. This can be calculated as 160 MHz divided by 32,768, minus 1, which gives 4881 with a
remainder of 26,624. Thirteen bits are needed to hold the value 4881, but the counter actually supports frequencies up to
268.4 MHz (32,768 × 8192). Bits 0 to 12 contain the integer portion of the RTC prescaler value. Bit 13
is reserved: the user should not write 1 to this bit, and the value read from it is not defined.
The integer portion of the prescale value is calculated as:
PREINT = int (PCLK / 32768) − 1
Prescaler value, fractional portion (PREFRAC) counter:
This is a read/write 15-bit counter. The remainder value could be as large as 32,767, which requires 15 bits. Bits 0
to 14 contain the fractional portion of the RTC prescaler value. Bit 15 is reserved: the user should not write
1 to this bit, and the value read from it is not defined.
The fractional portion of the prescale value is calculated as:
PREFRAC = PCLK − ((PREINT + 1) × 32768)
Example 1:
Let the PCLK frequency be 65.537 kHz. Then:
PREINT = int (PCLK / 32768) − 1
= int (65,537 / 32768) − 1
= 1
PREFRAC = PCLK − ([PREINT + 1] × 32768)
= 65,537 − ([1 + 1] × 32768)
= 1
Example 2:
Let the PCLK frequency be 10 MHz. Then:
PREINT = int (PCLK / 32768) − 1
= int (10,000,000 / 32768) − 1
= 304
PREFRAC = PCLK − ([PREINT + 1] × 32768)
= 10,000,000 − ([304 + 1] × 32768)
= 5,760
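The two prescaler formulas can be checked with a few lines of C. This is only a minimal sketch: the helper names rtc_preint and rtc_prefrac are made up for illustration and are not part of the LPC214x header, and PCLK is assumed to be passed in hertz.

```c
#include <stdint.h>

/* Integer portion of the RTC prescaler: int(PCLK / 32768) - 1.
   Hypothetical helper; pclk_hz is assumed to be at least 65536. */
static uint32_t rtc_preint(uint32_t pclk_hz)
{
    return (pclk_hz / 32768u) - 1u;
}

/* Fractional portion: PCLK - ((PREINT + 1) * 32768). */
static uint32_t rtc_prefrac(uint32_t pclk_hz)
{
    return pclk_hz - ((rtc_preint(pclk_hz) + 1u) * 32768u);
}
```

On the target, the two results would be written to the PREINT and PREFRAC registers before the RTC is enabled. For PCLK = 10 MHz the helpers return 304 and 5,760.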
Analog to Digital Converter (ADC):

Analog to digital converters are among the most widely used devices for data acquisition. Physical quantities
such as pressure, humidity, temperature and velocity must be converted into digital data before they can be processed
by digital computers. First these physical quantities are converted into analog signals (either voltage or current)
with the help of transducers or sensors. Then these analog signals are converted into digital data using analog to
digital converters (ADC), so that microcontrollers or processors can read the digital data and process it. An ADC
has n-bit resolution, where n can be 8, 10, 12, 16 or 24 bits. A higher resolution ADC provides a smaller step
size, where the step size is the smallest change in the analog input that the ADC can resolve into digital data.
The LPC2148 contains two analog to digital converters. Each is a single 10-bit successive
approximation analog to digital converter. ADC0 has six channels and ADC1 has eight channels. Therefore, the
total number of available ADC inputs on the LPC2148 is 14.

Features
• 10 bit successive approximation analog to digital converter.
• Measurement range of 0 V to VREF (2.0 V ≤ VREF ≤ VDDA).
• Each converter capable of performing more than 400,000 10-bit samples per second.
• Every analog input has a dedicated result register to reduce interrupt overhead.
• Burst conversion mode for single or multiple inputs.
• Optional conversion on transition on input pin or timer match signal.
• Global Start command for both converters
Pin description
Analog Inputs
ADC0: AD0.7, AD0.6, AD0.4, AD0.3, AD0.2, AD0.1.
ADC1: AD1.7, AD1.6, AD1.5, AD1.4, AD1.3, AD1.2, AD1.1, AD1.0.
The ADC cell can measure the voltage on any of these input signals. Note that these analog inputs are always
connected to their pins, even if the Pin function Select register assigns them to port pins. A simple self-test of the
ADC can be done by driving these pins as port outputs.

Note: if the ADC is used, signal levels on analog input pins must not be above the level of V3A at any time.
Otherwise, A/D converter readings will be invalid. If the A/D converter is not used in an application then the
pins associated with A/D inputs can be used as 5 V tolerant digital IO pins.

Warning: while the ADC pins are specified as 5 V tolerant, the analog multiplexing in the ADC block is not.
More than 3.3 V (VDDA) should not be applied to any pin that is selected as an ADC input, or the ADC reading
will be incorrect. If for example AD0.0 and AD0.1 are used as the ADC0 inputs and voltage on AD0.0 = 4.5 V
while AD0.1 = 2.5 V, an excessive voltage on the AD0.0 can cause an incorrect reading of the AD0.1, although
the AD0.1 input voltage is within the right range.
Voltage Reference (VREF): Pin no: 63
This pin provides a voltage reference level for the A/D converter(s). Here the LPC2148 uses 3.3 V as VREF.
Analog Power and Ground (VDDA, VSSA): Pin nos: 07 & 59
These should be nominally the same voltages as VDD and VSS, but should be isolated to minimize noise and
error.
ADC Registers:
The A/D Converter registers are shown in Table 279.
The LPC2148 contains six kinds of registers which control the function of the ADC. These registers are:
A/D Control Register (ADCR): This is a read/write register. Control data must be written into this register to
select the operating mode before A/D conversion can occur.
A/D Global Data Register (ADGDR): This is a read/write register. It contains the ADC's DONE bit and the
result of the most recent A/D conversion.
A/D Status Register (ADSTAT): This is a read-only register. It contains DONE and OVERRUN flags for all of
the A/D channels, as well as the A/D interrupt flag.
A/D Global Start Register (ADGSR): This is a write-only register. This address can be written (in the AD0 address
range) to start conversions in both A/D converters simultaneously.
A/D Interrupt Enable Register (ADINTEN): This is a read/write register. It contains enable bits that allow the
DONE flag of each A/D channel to be included or excluded from contributing to the generation of an A/D
interrupt.
A/D Data Registers (ADDR): These are read-only registers. Each channel has a separate ADDR, as shown
below.
A/D Channel 1 Data Register (ADDR1). This register contains the result of the most recent conversion completed
on channel 1.
A/D Channel 2 Data Register (ADDR2). This register contains the result of the most recent conversion completed
on channel 2.
A/D Channel 3 Data Register (ADDR3). This register contains the result of the most recent conversion completed
on channel 3.
A/D Channel 4 Data Register (ADDR4). This register contains the result of the most recent conversion completed
on channel 4.
A/D Channel 5 Data Register (ADDR5). This register contains the result of the most recent conversion completed
on channel 5.
A/D Channel 6 Data Register (ADDR6). This register contains the result of the most recent conversion completed
on channel 6.
A/D Channel 7 Data Register (ADDR7). This register contains the result of the most recent conversion completed
on channel 7.
A/D Control Register (AD0CR and AD1CR): This is a 32-bit register. The user must write a control word to this
register to select the mode of operation before the conversion starts. The function of each bit of this register is as
follows:
Bits:  31:28 | 27   | 26:24 | 23:22 | 21  | 20 | 19:17 | 16    | 15:8   | 7:0
Field: -     | EDGE | START | -     | PDN | -  | CLKS  | BURST | CLKDIV | SEL

SEL (Bits 7:0):


Selects which of the AD0.7:0/AD1.7:0 pins is (are) to be sampled and converted. For AD0, bit 0 selects pin
AD0.0, and bit 7 selects pin AD0.7. In software controlled mode, only one of these bits should be 1. For example,
writing 0x04 to this field selects channel AD0.2, which is port pin P0.29 once PINSEL1 has assigned that pin to
the ADC. In hardware scan mode, any value containing 1 to 8 ones may be used. All zeroes is equivalent to 0x01.
CLKDIV (Bits 15:8)
The APB clock (PCLK) is divided by (this value plus one) to produce the clock for the A/D converter, which
should be less than or equal to 4.5 MHz. Typically, Software should program the smallest value in this field that
yields a clock of 4.5 MHz or slightly less, but in certain cases (such as a high-impedance analog source) a slower
clock may be desirable.
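The rule for choosing CLKDIV can be written as a small calculation. The sketch below is illustrative only: adc_clkdiv is a hypothetical helper, not something defined by the LPC214x header, and it assumes PCLK is known in hertz.

```c
#include <stdint.h>

#define ADC_MAX_CLOCK_HZ 4500000u   /* upper limit for the A/D clock (4.5 MHz) */

/* Smallest CLKDIV value (0..255) for which PCLK / (CLKDIV + 1) <= 4.5 MHz.
   Hypothetical helper for illustration. */
static uint32_t adc_clkdiv(uint32_t pclk_hz)
{
    /* Round the required divisor up so the A/D clock never exceeds the limit. */
    uint32_t divisor = (pclk_hz + ADC_MAX_CLOCK_HZ - 1u) / ADC_MAX_CLOCK_HZ;

    if (divisor == 0u)
        divisor = 1u;       /* PCLK of 0 would otherwise underflow below */
    if (divisor > 256u)
        divisor = 256u;     /* CLKDIV is an 8-bit field */
    return divisor - 1u;
}
```

For PCLK = 60 MHz this gives CLKDIV = 13 (0x0D), which corresponds to an A/D clock of about 4.29 MHz.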
BURST (Bit 16):
1 The AD converter does repeated conversions at the rate selected by the CLKS
field, scanning (if necessary) through the pins selected by 1s in the SEL field.
The first Conversion after the start corresponds to the least-significant 1 in the
SEL field, then higher numbered 1-bits (pins) if applicable. Repeated
conversions can be terminated by clearing this bit, but the conversion that’s in
progress when this bit is cleared will be completed.
Remark: START bits must be 000 when BURST = 1 or conversions will not start.
0 Conversions are software controlled and require 11 clocks.

CLKS (Bits 19:17):


This field selects the number of clocks used for each conversion in burst mode, and the number of bits of accuracy
of the result in the RESULT bits of ADDR, between 11 clocks (10 bits) for 000 and 4 clocks (3 bits) for 111.

CLKS  Clocks / result bits
000   11 clocks / 10 bits
001   10 clocks / 9 bits
010   9 clocks / 8 bits
011   8 clocks / 7 bits
100   7 clocks / 6 bits
101   6 clocks / 5 bits
110   5 clocks / 4 bits
111   4 clocks / 3 bits
Bit 20:
Reserved, user software should not write ones to reserved bits. The value read from a reserved bit is not defined.
PDN (Bit 21):
1 The A/D converter is operational.
0 The A/D converter is in power-down mode.
Bits 23:22:
Reserved, user software should not write ones to reserved bits. The value read from a reserved bit is not defined.
START (Bits 26:24):
When the BURST bit is 0, these bits control whether and when an A/D conversion is started:
000 No start (this value should be used when clearing PDN to 0).
001 Start conversion now.
010 Start conversion when the edge selected by bit 27 occurs on
P0.16/EINT0/MAT0.2/CAP0.2 pin.
011 Start conversion when the edge selected by bit 27 occurs on
P0.22/CAP0.0/MAT0.0 pin.
100 Start conversion when the edge selected by bit 27 occurs on MAT0.1.
101 Start conversion when the edge selected by bit 27 occurs on MAT0.3.
110 Start conversion when the edge selected by bit 27 occurs on MAT1.0.
111 Start conversion when the edge selected by bit 27 occurs on MAT1.1.
EDGE (Bit 27):
This bit is significant only when the START field contains 010-111. In these cases:
1 Start conversion on a falling edge on the selected CAP/MAT signal.
0 Start conversion on a rising edge on the selected CAP/MAT signal.
Bits 31:28.
Reserved, user software should not write ones to reserved bits. The value read from a reserved bit is not defined.
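The field positions described above can be collected into a set of macros so that a control word is assembled from named fields rather than written as a bare hexadecimal constant. This is a sketch under stated assumptions: the macro names and the function adcr_software_mode are invented for illustration and do not come from the LPC214x header.

```c
#include <stdint.h>

/* Field positions of the A/D Control Register (illustrative names). */
#define ADCR_SEL(ch)     (1u << (ch))                       /* bits 7:0, one channel  */
#define ADCR_CLKDIV(d)   (((uint32_t)(d) & 0xFFu) << 8)     /* bits 15:8              */
#define ADCR_BURST       (1u << 16)                         /* bit 16                 */
#define ADCR_CLKS(c)     (((uint32_t)(c) & 0x7u) << 17)     /* bits 19:17             */
#define ADCR_PDN         (1u << 21)                         /* bit 21, 1 = operational */
#define ADCR_START(s)    (((uint32_t)(s) & 0x7u) << 24)     /* bits 26:24             */
#define ADCR_EDGE        (1u << 27)                         /* bit 27                 */

/* Control word for a software-controlled conversion on one channel:
   BURST = 0, converter powered up, START left at 000. */
static uint32_t adcr_software_mode(unsigned channel, unsigned clkdiv)
{
    return ADCR_SEL(channel & 7u) | ADCR_CLKDIV(clkdiv) | ADCR_PDN;
}
```

With channel 2 and CLKDIV = 13 the function returns 0x00200D04. OR-ing in ADCR_START(1) gives 0x01200D04, which starts the conversion immediately.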

A/D Global Start Register (ADGSR):


Software can write this register to simultaneously initiate conversions on both A/D controllers.

Bits 15:0
Reserved, user software should not write ones to reserved bits. The value read from a Reserved bit is not defined.
BURST (Bit 16):
1 The AD converters do repeated conversions at the rate selected by their
CLKS fields, scanning (if necessary) through the pins selected by 1s in their
SEL field. The first conversion after the start corresponds to the
least-significant 1 in the SEL field, then higher numbered 1-bits (pins) if
applicable. Repeated conversions can be terminated by clearing this bit, but
the conversion that’s in progress when this bit is cleared will be completed.
Remark: START bits must be 000 when BURST = 1 or conversions will not start.
0 Conversions are software controlled and require 11 clocks.
Bits 23:17:
Reserved, user software should not write ones to reserved bits. The value read from a reserved bit is not defined.
START (26:24):
When the BURST bit is 0, these bits control whether and when an A/D conversion is started:
000 No start (this value should be used when clearing PDN to 0).
001 Start conversion now.
010 Start conversion when the edge selected by bit 27 occurs on
P0.16/EINT0/MAT0.2/CAP0.2 pin.
011 Start conversion when the edge selected by bit 27 occurs on
P0.22/CAP0.0/MAT0.0 pin.
100 Start conversion when the edge selected by bit 27 occurs on MAT0.1.
101 Start conversion when the edge selected by bit 27 occurs on MAT0.3.
110 Start conversion when the edge selected by bit 27 occurs on MAT1.0.
111 Start conversion when the edge selected by bit 27 occurs on MAT1.1.
EDGE (Bit 27):
This bit is significant only when the START field contains 010-111. In these cases:
1 Start conversion on a falling edge on the selected CAP/MAT signal.
0 Start conversion on a rising edge on the selected CAP/MAT signal.
Bits 31:28
Reserved, user software should not write ones to reserved bits. The value read from a reserved bit is not defined

A/D Status Register (ADSTAT, ADC0: AD0STAT and ADC1: AD1STAT):

The A/D Status register allows checking the status of all A/D channels simultaneously. The DONE and
OVERRUN flags appearing in the ADDRn register for each A/D channel are mirrored in ADSTAT. The interrupt
flag (the logical OR of all DONE flags) is also found in ADSTAT.

Bits:  31:17 | 16    | 15:8              | 7:0
Field: -     | ADINT | OVERRUN7:OVERRUN0 | DONE7:DONE0

DONE0 (Bit0): This bit mirrors the DONE status flag from the result register for A/D channel 0.
DONE1 (Bit1): This bit mirrors the DONE status flag from the result register for A/D channel 1.
DONE2 (Bit2): This bit mirrors the DONE status flag from the result register for A/D channel 2.
DONE3 (Bit3): This bit mirrors the DONE status flag from the result register for A/D channel 3.
DONE4 (Bit4): This bit mirrors the DONE status flag from the result register for A/D channel 4.
DONE5 (Bit5): This bit mirrors the DONE status flag from the result register for A/D channel 5.
DONE6 (Bit6): This bit mirrors the DONE status flag from the result register for A/D channel 6.
DONE7 (Bit7): This bit mirrors the DONE status flag from the result register for A/D channel 7.
OVERRUN0 (Bit8): This bit mirrors the OVERRUN status flag from the result register for A/D channel 0.
OVERRUN1 (Bit9): This bit mirrors the OVERRUN status flag from the result register for A/D channel 1.
OVERRUN2 (Bit10): This bit mirrors the OVERRUN status flag from the result register for A/D channel 2.
OVERRUN3 (Bit11): This bit mirrors the OVERRUN status flag from the result register for A/D channel 3.
OVERRUN4 (Bit12): This bit mirrors the OVERRUN status flag from the result register for A/D channel 4.
OVERRUN5 (Bit13): This bit mirrors the OVERRUN status flag from the result register for A/D channel 5.
OVERRUN6 (Bit14): This bit mirrors the OVERRUN status flag from the result register for A/D channel 6.
OVERRUN7 (Bit15): This bit mirrors the OVERRUN status flag from the result register for A/D channel 7.
ADINT (Bit16): This bit is the A/D interrupt flag. It is one when any of the individual A/D channel Done flags is
asserted and enabled to contribute to the A/D interrupt via the ADINTEN register.
Bits 31:17: Reserved, user software should not write ones to reserved bits. The value read from a reserved bit is
not defined.
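Because the status flags sit at fixed bit positions, a value read from ADSTAT can be decoded with simple shifts and masks. The helpers below are a minimal sketch with invented names; they work on a plain 32-bit value so that they can be tried without hardware.

```c
#include <stdint.h>

/* DONEn flag: bits 7:0 of ADSTAT (channel 0 at bit 0). */
static int adstat_done(uint32_t adstat, unsigned channel)
{
    return (int)((adstat >> (channel & 7u)) & 1u);
}

/* OVERRUNn flag: bits 15:8 of ADSTAT (channel 0 at bit 8). */
static int adstat_overrun(uint32_t adstat, unsigned channel)
{
    return (int)((adstat >> (8u + (channel & 7u))) & 1u);
}

/* ADINT, the A/D interrupt flag: bit 16 of ADSTAT. */
static int adstat_adint(uint32_t adstat)
{
    return (int)((adstat >> 16) & 1u);
}
```

On the target the argument would be the value read from AD0STAT or AD1STAT.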

A/D Interrupt Enable Register (ADINTEN, ADC0: AD0INTEN and ADC1: AD1INTEN):
This register allows control over which A/D channels generate an interrupt when a conversion is complete. For
example, it may be desirable to use some A/D channels to monitor sensors by continuously performing
conversions on them. The most recent results are read by the application program whenever they are needed. In
this case, an interrupt is not desirable at the end of each conversion for some A/D channels.
Bits:  31:9 | 8        | 7:0
Field: -    | ADGINTEN | ADINTEN7:ADINTEN0

ADINTEN0 (Bit0):
0 Completion of a conversion on ADC channel 0 will not generate an interrupt.
1 Completion of a conversion on ADC channel 0 will generate an interrupt.
ADINTEN1 (Bit1):
0 Completion of a conversion on ADC channel 1 will not generate an interrupt.
1 Completion of a conversion on ADC channel 1 will generate an interrupt.
ADINTEN2 (Bit2):
0 Completion of a conversion on ADC channel 2 will not generate an interrupt.
1 Completion of a conversion on ADC channel 2 will generate an interrupt.
ADINTEN3 (Bit3):
0 Completion of a conversion on ADC channel 3 will not generate an interrupt.
1 Completion of a conversion on ADC channel 3 will generate an interrupt.
ADINTEN4 (Bit4):
0 Completion of a conversion on ADC channel 4 will not generate an interrupt.
1 Completion of a conversion on ADC channel 4 will generate an interrupt.
ADINTEN5 (Bit5):
0 Completion of a conversion on ADC channel 5 will not generate an interrupt.
1 Completion of a conversion on ADC channel 5 will generate an interrupt.
ADINTEN6 (Bit6):
0 Completion of a conversion on ADC channel 6 will not generate an interrupt.
1 Completion of a conversion on ADC channel 6 will generate an interrupt.
ADINTEN7 (Bit7):
0 Completion of a conversion on ADC channel 7 will not generate an interrupt.
1 Completion of a conversion on ADC channel 7 will generate an interrupt.
ADGINTEN (Bit8):
0 Only the individual ADC channels enabled by ADINTEN7:0 will generate
interrupts.
1 Only the global DONE flag in ADDR is enabled to generate an interrupt.
Bits 31:9:
Reserved, user software should not write ones to reserved bits. The value read from a reserved bit is not defined

A/D Data Registers (ADDR0 to ADDR7, ADC0: AD0DR0 to AD0DR7 and ADC1: AD1DR0 to
AD1DR7):

The A/D Data Registers hold the result when an A/D conversion is complete, and also include the flags that
indicate when a conversion has been completed and when a conversion overrun has occurred.

Bits:  31   | 30      | 29:16 | 15:6   | 5:0
Field: DONE | OVERRUN | -     | RESULT | -
Bits 5:0:
Reserved, user software should not write ones to reserved bits. The value read from a reserved bit is not defined.
RESULT (Bits15:6):
When DONE is 1, this field contains a binary fraction representing the voltage on the AIN pin, divided by the
voltage on the VREF pin (V/VREF). Zero in the field indicates that the voltage on the AIN pin was less than,
equal to, or close to that on VSSA, while 0x3FF indicates that the voltage on AIN was close to, equal to, or greater
than that on VREF.
Bits 29:16:
Reserved, user software should not write ones to reserved bits. The value read from a reserved bit is not defined.
OVERRUN (Bit 30):
This bit is 1 in burst mode if the results of one or more conversions was (were) lost and overwritten before the
conversion that produced the result in the RESULT bits. This bit is cleared by reading this register.
DONE (Bit 31):
This bit is set to 1 when an A/D conversion completes. It is cleared when this register is read.
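The layout of the data register can be turned into a few decoding helpers. This is a sketch only: the function names are hypothetical, the reference voltage is passed in millivolts, and the voltage is computed as RESULT × VREF / 1024 using integer arithmetic.

```c
#include <stdint.h>

#define ADDR_DONE     (1u << 31)   /* conversion complete */
#define ADDR_OVERRUN  (1u << 30)   /* earlier result(s) lost in burst mode */

/* Returns 1 when the DONE flag is set in a value read from an A/D data register. */
static int addr_done(uint32_t reg)
{
    return (reg & ADDR_DONE) != 0u;
}

/* Extracts the 10-bit RESULT field (bits 15:6). */
static uint32_t addr_result(uint32_t reg)
{
    return (reg >> 6) & 0x3FFu;
}

/* Converts a 10-bit result to millivolts for a given reference voltage. */
static uint32_t adc_millivolts(uint32_t result, uint32_t vref_mv)
{
    return (result * vref_mv) / 1024u;
}
```

Since reading the register clears DONE and OVERRUN, the register should be read once into a variable and all fields decoded from that copy.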

Programming ADC:

This program converts the voltage at P0.29 (AD0.2) into the equivalent 10-bit digital value, which is read from the
AD0GDR register. The digital value and the corresponding voltage are displayed on the LCD.

[Figure: LPC2148 with a 10k potentiometer supplied from 3.3 V and connected through a 100 Ω resistor to P0.29 (ADC0).]

#include <LPC214x.H> /* LPC214x definitions */
#include <stdio.h>
#include "lcd.h" /* LCD routines; delay() is assumed to be provided here as well */

////////// Init ADC0 /////////////////
void Init_ADC(void)
{
    PINSEL1 = (PINSEL1 & ~(3 << 26)) | (1 << 26); // Configure Port pin 0.29 to function as AD0.2
}

////////// READ ADC0 CH:2 /////////////////
unsigned int Read_ADC(void)
{
    unsigned int i = 0;
    AD0CR = 0x00200D04;              // SEL = AD0.2, CLKDIV = 13, PDN = 1 (converter operational)
    AD0CR |= 0x01000000;             // START = 001: start the A/D conversion now
    do
    {
        i = AD0GDR;                  // Read the A/D Global Data Register
    } while ((i & 0x80000000) == 0); // Wait for DONE (bit 31)
    return (i >> 6) & 0x03FF;        // Bits 15:6 hold the 10-bit A/D value
}

////////// DISPLAY ADC VALUE /////////////////
void Display_ADC(void)
{
    unsigned int adc_value = 0;
    char buf[8];                     // Room for "1023" or "3.3" plus the terminating NUL
    float voltage = 0.0;
    adc_value = Read_ADC();
    sprintf(buf, "%3u", adc_value);
    lcd_putstring16(0,"ADC VAL = 000 ");
    lcd_putstring16(1,"VOLTAGE = 0.0 v ");
    lcd_gotoxy(0,10);
    lcd_putstring(buf);
    voltage = (adc_value * 3.3) / 1024; // Scale the result to the 3.3 V reference
    lcd_gotoxy(1,10);
    sprintf(buf, "%1.1f", voltage);
    lcd_putstring(buf);
}

////////// MAIN /////////////////
int main (void)
{
    init_lcd();
    Init_ADC();
    delay(60000);
    while(1)
    {
        Display_ADC();
        delay(50000);
    }
}

Sensor interfacing:
Sensors or transducers convert physical quantities such as temperature, light intensity, velocity, pressure and
speed into electrical signals. Depending on the transducer, the output is generated in the form of voltage,
current, resistance or capacitance. For example, a thermistor converts temperature into resistance, which is in turn
converted into voltage. Simple and widely used linear sensors such as the LM35 from National
Semiconductor Corp. are also available.

The LM35 series sensors are precision integrated circuit temperature sensors whose output voltage is
linearly proportional to the Celsius temperature. The LM35 requires no external calibration since it is
internally calibrated. It outputs 10 mV for each degree Celsius of temperature.
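[Figure: LM35 (pins 1, 2, 3) supplied from 5 V and connected to the ADC1 input of the LPC2148, with a 75 Ω resistor and a 1 µF capacitor; Vref = 3.3 V.]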

As the LM35 produces 10 mV for every degree change in temperature and the LPC2148 uses 3.3 V as Vref, the ADC
input range of 0 to 3.3 V corresponds to temperatures up to 330 °C. The voltage at P0.28 (AD0.1, labelled ADC1
in the figure) is converted into the equivalent digital data, which is read from the AD0GDR register. With a 10-bit
result and Vref = 3.3 V, one step is 3.3 V / 1024 ≈ 3.22 mV, so the result is approximately (input in mV) × 1024 / 3300
and each step corresponds to about 0.32 °C. The digital data is displayed on the LCD. Refer to the table:

Temp (°C) | LM35 output at ADC1 (mV) | ADC result (10-bit binary, decimal)
0         | 0                        | 0000000000 (0)
1         | 10                       | 0000000011 (3)
2         | 20                       | 0000000110 (6)
3         | 30                       | 0000001001 (9)
5         | 50                       | 0000001111 (15)
10        | 100                      | 0000011111 (31)
20        | 200                      | 0000111110 (62)
30        | 300                      | 0001011101 (93)
40        | 400                      | 0001111100 (124)
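Converting a 10-bit result back to a temperature takes two steps: result to millivolts, then millivolts to degrees at 10 mV per degree. The sketch below uses integer arithmetic and returns tenths of a degree, which avoids floating point on the microcontroller. The function name is hypothetical and the reference voltage is passed in millivolts.

```c
#include <stdint.h>

/* Temperature in tenths of a degree Celsius for an LM35 read by a 10-bit ADC.
   The LM35 gives 10 mV per degree, so 1 mV equals 0.1 degree and the value in
   millivolts is already the temperature in tenths of a degree. */
static uint32_t lm35_tenths_celsius(uint32_t adc_result, uint32_t vref_mv)
{
    uint32_t millivolts = ((adc_result & 0x3FFu) * vref_mv) / 1024u;
    return millivolts;
}
```

With Vref = 3300 mV, a result of 124 gives 399, that is 39.9 °C. The value can be printed as degrees and tenths with the integer division and remainder by 10.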
Example program:

#include <LPC214x.H> /* LPC214x definitions */
#include <stdio.h>
#include "lcd.h" /* LCD routines; delay() is assumed to be provided here as well */

////////// Init ADC0 /////////////////
void Init_ADC(void)
{
    PINSEL1 = (PINSEL1 & ~(3 << 24)) | (1 << 24); // Configure Port pin 0.28 to function as AD0.1
}

////////// READ ADC0 CH:1 /////////////////
unsigned int Read_ADC(void)
{
    unsigned int i = 0;

    AD0CR = 0x00200D02;              // SEL = AD0.1, CLKDIV = 13, PDN = 1 (converter operational)
    AD0CR |= 0x01000000;             // START = 001: start the A/D conversion now
    do
    {
        i = AD0GDR;                  // Read the A/D Global Data Register
    } while ((i & 0x80000000) == 0); // Wait for DONE (bit 31)
    return (i >> 6) & 0x03FF;        // Bits 15:6 hold the 10-bit A/D value
}

////////// DISPLAY ADC VALUE /////////////////
void Display_ADC(void)
{
    unsigned int adc_value = 0;
    char buf[8];                     // Room for "1023" or "329.7" plus the terminating NUL
    float temperature = 0.0;
    adc_value = Read_ADC();
    sprintf(buf, "%3u", adc_value);
    lcd_putstring16(0,"ADC VAL = 000 ");
    lcd_putstring16(1,"TEMP = 000.0 dgC");
    lcd_gotoxy(0,10);
    lcd_putstring(buf);
    temperature = (adc_value * 330.0) / 1024; // 3.3 V full scale at 10 mV per degree Celsius
    lcd_gotoxy(1,7);
    sprintf(buf, "%5.1f", temperature);
    lcd_putstring(buf);
}

////////// MAIN /////////////////
int main (void)
{
    init_lcd();
    Init_ADC();
    lcd_putstring16(0,"** ADC - TEMP **");
    lcd_putstring16(1,"** DEMO **");
    delay(60000);
    delay(60000);

    while(1)
    {
        Display_ADC();
        delay(50000);
    }
}

Digital to Analog Converter (DAC):


The DAC enables the LPC2148 to generate a variable analog output. The maximum DAC output voltage is the
VREF voltage.
Features:
• 10 bit digital to analog converter
• Resistor string architecture
• Buffered output
• Power-down mode
• Selectable speed vs. power

In the LPC2148 the digital input is converted to an analog output voltage at the DAC output pin (P0.25). The
output voltage is a function of the binary number written to bit 6 (D0) through bit 15 (D9) of the DAC
Register (DACR) and of the reference voltage Vref (3.3 V), as follows:

$$V_{out} = V_{ref}\left[\frac{D_9}{2}+\frac{D_8}{4}+\frac{D_7}{8}+\frac{D_6}{16}+\frac{D_5}{32}+\frac{D_4}{64}+\frac{D_3}{128}+\frac{D_2}{256}+\frac{D_1}{512}+\frac{D_0}{1024}\right]$$
Where D0 is the LSB and D9 is MSB
Example:
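Find the analog voltage for the 10-bit digital value 520 loaded into bits 15:6 of DACR.
Solution:
First the digital value is converted to binary: 520 = 1000001000, so D9 = 1 and D3 = 1 while all other bits are 0.
Vref = 3.3 V. Now the analog output voltage is calculated as

$$V_{out} = V_{ref}\left[\frac{D_9}{2}+\frac{D_8}{4}+\frac{D_7}{8}+\frac{D_6}{16}+\frac{D_5}{32}+\frac{D_4}{64}+\frac{D_3}{128}+\frac{D_2}{256}+\frac{D_1}{512}+\frac{D_0}{1024}\right]$$

$$V_{out} = 3.3\left[\frac{1}{2}+\frac{0}{4}+\frac{0}{8}+\frac{0}{16}+\frac{0}{32}+\frac{0}{64}+\frac{1}{128}+\frac{0}{256}+\frac{0}{512}+\frac{0}{1024}\right]$$

$$V_{out} = 3.3\,[0.5 + 0.0078] = 1.676\ \text{V}$$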
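The weighted sum of bits is the same as VALUE × Vref / 1024, which is easy to check in code. A minimal sketch, assuming the reference voltage is given in millivolts; the function name is made up for illustration.

```c
#include <stdint.h>

/* DAC output in millivolts for a 10-bit value: VALUE / 1024 * Vref.
   Integer arithmetic, so the result is truncated to whole millivolts. */
static uint32_t dac_output_mv(uint32_t value, uint32_t vref_mv)
{
    return ((value & 0x3FFu) * vref_mv) / 1024u;
}
```

For a value of 520 (binary 1000001000) and Vref = 3300 mV the function returns 1675, that is about 1.676 V before truncation.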
Pin description
Table 286 gives a brief summary of each of DAC related pins.
Analog Output (AOUT):
This is the analog output pin. When bits 15:6 of DACR are written with a new value, a voltage equal to
(10-bit digital VALUE)/1024 × VREF is generated on this pin after the selected settling time.
Voltage Reference (VREF):
This pin provides a voltage reference level for the D/A converter.
Analog Power and Ground (VDDA, VSSA):
These should be nominally the same voltages as V3 and VSSD, but should be isolated to minimize noise and error.

DAC Register (DACR):


This read/write register includes the digital value to be converted to analog, and a bit that decides the settling time
and power. Some bits are reserved for future, higher-resolution D/A converters.

Bits:  31:17 | 16   | 15:6          | 5:0
Field: -     | BIAS | VALUE (V9:V0) | -

VALUE is the 10-bit digital value equivalent to the analog voltage.


Bits 5:0:
These are reserved bits, user should not write ones to reserved bits. The values read from these bits are not
defined.
VALUE (Bits15:6):
This field is written with the 10-bit digital value to be converted. After the
selected settling time, the analog voltage on the AOUT pin (with respect to VSSA) is VALUE/1024 × VREF.
BIAS (Bit16):
The maximum current output and settling time for DAC is decided by this bit.
If this bit value is 0 then the settling time of the DAC is 1 μs max, and the maximum current is 700 μA.
If this bit value is 1 then the settling time of the DAC is 2.5 μs and the maximum current is 350 μA.
Bits 31:17:
These are reserved bits, user should not write ones to reserved bits.

Note: Before loading a digital value into DACR, shift it left by six (6)
bits so that it occupies bits 6 to 15 of DACR.

Generation of Square Wave:

[Figure: DAC output of the LPC2148 connected as the analog output to a CRO.]

A square wave can be generated by loading the values 0 and 1023 alternately into DACR of the LPC2148, with the
required amount of delay between the two values.
Example Prog:

#include <LPC214x.H> /* LPC214x definitions */

void delay(unsigned int count) // Crude software delay loop
{
    volatile unsigned int i;
    for (i = 0; i < count; i++);
}

void Init_DAC(void)
{
    PINSEL1 = (PINSEL1 & ~(3 << 18)) | (1 << 19); // Configure Port pin 0.25 to function as DAC
    DACR = 0;
}

void Write_DAC(unsigned int dacval)
{
    DACR = (dacval & 0x3FF) << 6; // Place the 10-bit value in bits 15:6
}

////////// MAIN /////////////////
int main (void)
{
    Init_DAC();
    while(1)
    {
        Write_DAC(0);
        delay(100);      // change this value to change Frequency
        Write_DAC(1023); // change this value to change Amplitude
        delay(100);      // change this value to change Frequency
    }
}

Generation of Saw Tooth wave:

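[Figure: DAC output of the LPC2148 connected as the analog output to a CRO.]

A sawtooth wave can be generated by incrementing the DACR value from 0 to 1023 and then
returning to 0, continuously. In this case the maximum value of the digital number is 1023. Hence the maximum
output voltage available at the DAC output pin is

$$V_{out} = V_{ref}\left[\frac{D_9}{2}+\frac{D_8}{4}+\frac{D_7}{8}+\frac{D_6}{16}+\frac{D_5}{32}+\frac{D_4}{64}+\frac{D_3}{128}+\frac{D_2}{256}+\frac{D_1}{512}+\frac{D_0}{1024}\right]$$

$$V_{out} = 3.3\left[\frac{1}{2}+\frac{1}{4}+\frac{1}{8}+\frac{1}{16}+\frac{1}{32}+\frac{1}{64}+\frac{1}{128}+\frac{1}{256}+\frac{1}{512}+\frac{1}{1024}\right] = 3.3 \times \frac{1023}{1024}$$

$$V_{out} \approx 3.297\ \text{V}$$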

Example Prog:

#include <LPC214x.H> /* LPC214x definitions */

void Init_DAC(void)
{
    PINSEL1 = (PINSEL1 & ~(3 << 18)) | (1 << 19); // Configure Port pin 0.25 to function as DAC
    DACR = 0;
}

void Write_DAC(unsigned int dacval)
{
    DACR = (dacval & 0x3FF) << 6; // Place the 10-bit value in bits 15:6
}

////////// MAIN /////////////////
int main (void)
{
    unsigned int i;
    Init_DAC();
    while(1)
    {
        for(i = 0; i < 1024; i++) // Ramp from 0 to 1023, then start again from 0
            Write_DAC(i);
    }
}

Generation of Sine wave:

[Figure: DAC output of the LPC2148 connected as the analog output to a CRO.]

To generate a sine wave we first need a table whose values represent the magnitude of the sine of angles between
0 and 360°. The values of the sine function vary from -1.0 to +1.0 over this range, whereas the DAC can only
produce voltages between 0 and Vref. The table values are therefore integer numbers representing an offset voltage
magnitude for the sine of ϴ. Table 123 shows the angles, the sine values, the voltage magnitudes and the integer
values representing the voltage magnitude for each angle (in 15 degree increments). To generate Table 123 we took
3.3 V as the full-scale voltage of the DAC output. Full-scale output of the DAC is obtained when all the data inputs
of DACR are high. To centre the sine wave within the 0 to 3.3 V range we use the following equation:

Aout = 1.65 + (1.65 × Sinϴ)

The Aout values of the DAC for various angles are calculated and shown in Table 123.
Angle ϴ (degrees)   Sinϴ     Aout (voltage magnitude)   DACR value
                             1.65 + (1.65 x Sinϴ)       Aout x 310.3

0                    0.00    1.65                       512
15                   0.26    2.08                       645
30                   0.50    2.48                       770
45                   0.71    2.82                       875
60                   0.87    3.09                       959
75                   0.97    3.25                       1008
90                   1.00    3.30                       1024
105                  0.97    3.25                       1008
120                  0.87    3.09                       959
135                  0.71    2.82                       875
150                  0.50    2.48                       770
165                  0.26    2.08                       645
180                  0.00    1.65                       512
195                 -0.26    1.22                       379
210                 -0.50    0.83                       258
225                 -0.71    0.48                       149
240                 -0.87    0.21                       65
255                 -0.97    0.05                       16
270                 -1.00    0.00                       0
285                 -0.97    0.05                       16
300                 -0.87    0.21                       65
315                 -0.71    0.48                       149
330                 -0.50    0.83                       258
345                 -0.26    1.22                       379
360                  0.00    1.65                       512

Note: the computed value 1024 at 90° is one more than the largest 10-bit code, so 1023 is written to DACR instead.

To find the value to write into DACR, simply multiply the Aout value by 310.3. There are 1024 (2^10) steps and full scale is 3.3V, therefore steps/full scale = 1024/3.3 = 310.3 steps per volt, and the voltage per step is 1/310.3 = 0.003V approximately. The following example continuously sends digital values to DACR to produce a sine wave. Here the digital values are sent in 256 steps per cycle so that a smooth curve is obtained. The table can also be computed with the formula arry[i] = Int((Sin((2 * i * Pi) / 256) * 512) + 512), where i takes the values 0 to 255. Because the DAC is 10 bits wide, a result of 1024 must be limited to 1023.
Example Program:

#include <LPC214x.H> /* LPC214x definitions */

/* One period of the sine wave, limited to the 10-bit range 0..1023 */
const unsigned int tab[] = {512, 525, 537, 550, 562, 575, 587, 599, 612, 624, 636, 648, 661, 673, 684,
696, 708, 719, 731, 742, 753, 764, 775, 786, 796, 807, 817, 827, 837, 846, 856, 865,
874, 883, 891, 900, 908, 915, 923, 930, 938, 944, 951, 957, 963, 969, 975, 980, 985,
990, 994, 998, 1002, 1005, 1009, 1011, 1014, 1016, 1018, 1020, 1021, 1023, 1023,
1023, 1023, 1023, 1023, 1023, 1022, 1020, 1019, 1017, 1014, 1012, 1009, 1006, 1002,
998, 994, 990, 985, 980, 975, 970, 964, 958, 951, 945, 938, 931, 924, 916, 908, 900,
892, 883, 874, 865, 856, 847, 837, 828, 818, 807, 797, 786, 776, 765, 754, 743, 732, 720,
709, 697, 685, 673, 661, 649, 637, 625, 613, 600, 588, 575, 563, 550, 538, 525, 513,
500, 488, 475, 463, 450, 438, 425, 413, 401, 388, 376, 364, 352, 340, 329, 317, 305,
294, 283, 271, 260, 250, 239, 228, 218, 208, 198, 188, 178, 169, 160, 151, 142, 133,
125, 117, 109, 101, 94, 87, 80, 73, 67, 61, 55, 50, 44, 39, 35, 30, 26, 22, 19, 16, 13, 10,
8, 6, 4, 3, 1, 1, 0, 0, 1, 1, 2, 4, 5, 7, 10, 12, 15, 18, 22, 25, 29, 34, 38, 43, 49, 54, 60,
66, 72, 79, 86, 93, 100, 108, 115, 123, 132, 140, 149, 158, 167, 176, 186, 196, 206,
216, 226, 237, 248, 258, 269, 280, 292, 303, 315, 326, 338, 350, 362, 374, 386, 398,
411, 423, 435, 448, 460, 473, 485, 498 };

/* Number of entries, taken from the initializer so the loop covers every one */
#define TAB_LEN (sizeof(tab) / sizeof(tab[0]))

////////// Init DAC /////////////////
void Init_DAC(void)
{
    PINSEL1 = (PINSEL1 & ~(3 << 18)) | (1 << 19); // Configure port pin P0.25 to function as DAC output
    DACR = 0;
}

void Write_DAC(unsigned int dacval)
{
    if (dacval > 1023) dacval = 1023; // the DAC is 10 bits wide
    DACR = dacval << 6; // the value occupies DACR bits 15:6
}

////////// MAIN /////////////////
int main (void)
{
    unsigned int i;
    Init_DAC();
    while(1)
    {
        for(i=0;i<TAB_LEN;i++) // output one full period, then repeat
            Write_DAC(tab[i]);
    }
}
Chapter 9
Interfacing with Real Time Devices
Introduction:
This chapter explains the interfacing of some real time devices such as the LCD, seven segment display, stepper motor, DC motor, relay and HEX keypad with the LPC2148. We explain the basic working principle and construction of these devices along with the interfacing programs. All the programs are written in C.

Liquid Crystal Displays (LCD):


An LCD display is specifically manufactured to be used with microcontrollers, which means that it cannot be
activated by standard IC circuits. It is used for displaying different messages on a miniature liquid crystal
display.

The model described here is the one most frequently used in practice because of its low price and great capabilities. It is based on the HD44780 controller (Hitachi) and can display messages in two lines with 16 characters each. It displays all the letters of the alphabet, Greek letters, punctuation marks, mathematical symbols etc. In addition, it is possible to display symbols defined by the user. Other useful features include automatic message shift (left and right), cursor appearance, LED backlight etc.

LCD Pins
There are pins along one side of a small printed board. These are used for connecting to the microcontroller. There are 14 pins in total, marked with numbers (16 if the display has a backlight). Their functions are described in the table below:

Function               Pin Number   Name   Logic State   Description
Ground                 1            Vss    -             0V
Power supply           2            Vdd    -             +5V
Contrast               3            Vee    -             0 - Vdd
Control of operating   4            RS     0             D0 – D7 are interpreted as commands
                                           1             D0 – D7 are interpreted as data
                       5            R/W    0             Write data (from controller to LCD)
                                           1             Read data (from LCD to controller)
                       6            E      0             Access to LCD disabled
                                           1             Normal operating
                                           From 1 to 0   Data/commands are transferred to LCD
Data / commands        7            D0     0/1           Bit 0 LSB
                       8            D1     0/1           Bit 1
                       9            D2     0/1           Bit 2
                       10           D3     0/1           Bit 3
                       11           D4     0/1           Bit 4
                       12           D5     0/1           Bit 5
                       13           D6     0/1           Bit 6
                       14           D7     0/1           Bit 7 MSB

LCD screen:

An LCD screen consists of two lines each containing 16 characters. Each character consists of 5x7, 5x8 or 5x11
dot matrix. This book covers the most commonly used display, i.e. the 5x7 character display.

Display contrast depends on the power supply voltage and on whether messages are displayed in one or two lines. For this reason, a variable voltage of 0 to Vdd is applied to the pin marked Vee. A trimmer potentiometer is usually used for that purpose. Some LCD displays have a built-in backlight (blue or green LEDs). When the backlight is used, a current limiting resistor should be connected in series with one of the backlight power supply pins (as with any LED).

If no characters are displayed, or if all of them are dimmed when the display is on, the first thing to check is the potentiometer for contrast regulation. Is it properly adjusted? The same applies if the mode of operation has been changed (writing in one or two lines).
LCD Memory:

The LCD display contains three memory blocks:

• DDRAM Display Data RAM;
• CGRAM Character Generator RAM; and
• CGROM Character Generator ROM.
DDRAM Memory:
DDRAM memory is used for storing characters to be displayed. The size of this memory is sufficient for storing
80 characters. Some memory locations are directly connected to the characters on display.

It works quite simply: it is sufficient to configure the display so as to increment addresses automatically (shift
right) and set the starting address for the message that should be displayed (for example 00 hex).

After that, all characters sent through lines D0-D7 will be displayed in the message format we are used to- from
left to right. In this case, displaying starts from the first field of the first line since the address is 00 hex. If more
than 16 characters are sent, then all of them will be memorized, but only the first sixteen characters will be
visible. In order to display the rest of them, a shift command should be used. Virtually, everything looks as if the
LCD display is a “window” which moves left-right over memory locations containing different characters. This
is how the effect of message “moving” on the screen is made.

If the cursor is on, it appears at the location which is currently addressed. In other words, when a character is written at the cursor position, the cursor automatically moves to the next addressed location.
Since this is a kind of RAM memory, data can be written to and read from it, but its contents are irretrievably lost when the power goes off.

CGROM Memory:
CGROM memory contains the default character map with all characters that can be displayed on the screen.
Each character is assigned to one memory location.
The addresses of CGROM memory locations match the ASCII codes of the characters. If the program being executed encounters a command “send character P to port”, then the binary value 0101 0000 appears on the port. This value is the ASCII equivalent of the character P. It is then written to the LCD, which results in displaying the symbol from location 0101 0000 of CGROM. In other words, the character “P” is displayed. This applies to all letters of the alphabet (capitals and small), but not to numeric values.

In the character map, the addresses of all digits are offset by 48 relative to their values (the address of digit 0 is 48, of digit 1 is 49, of digit 2 is 50 etc.). Accordingly, in order to display a digit correctly, the decimal number 48 must be added to its value before it is sent to the LCD.
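
The offset of 48 can be applied in a small helper before each digit is sent to the display. The sketch below is illustrative only; the function names digit_to_ascii and byte_to_ascii are hypothetical and are not part of the LCD program later in this chapter.

```c
/* Convert a single decimal digit (0..9) to its character code. */
unsigned char digit_to_ascii(unsigned char digit)
{
    return (unsigned char)(digit + 48);   /* 48 is the ASCII code of '0' */
}

/* Write the three decimal digits of value (0..255) into out[0..2],
   most significant digit first, ready to be sent to the LCD. */
void byte_to_ascii(unsigned char value, unsigned char out[3])
{
    out[0] = digit_to_ascii((unsigned char)(value / 100));
    out[1] = digit_to_ascii((unsigned char)((value / 10) % 10));
    out[2] = digit_to_ascii((unsigned char)(value % 10));
}
```

Each element of out can then be passed to whatever routine writes one character to the LCD.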

CGRAM memory:

Apart from standard characters, the LCD display can also display symbols defined by the user. Each can be any symbol 5x8 pixels in size. A 64-byte RAM memory called CGRAM makes this possible.

Memory registers are 8 bits wide, but only the 5 lower bits are used. A logic one (1) in a register represents a dimmed dot, while 8 locations grouped together represent one character. It is best illustrated in the figure below:

Symbols are usually defined at the beginning of the program by simply writing zeros and ones to the registers of CGRAM memory so that they form the desired shapes. In order to display them it is sufficient to specify their address. Pay attention to the first column in the CGROM map of characters. It doesn't contain RAM memory addresses, but the symbols being discussed here. In this example, “display 0” means display “č”, “display 1” means display “ž” etc.
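
Since 8 CGRAM locations form one character and the memory holds 64 bytes, there is room for 8 user-defined characters. The sketch below shows how the "Set CGRAM address" command byte (binary 01 followed by a 6-bit address) could be formed for a given character and row. The function name cgram_address_command and the example pattern are hypothetical.

```c
/* Command byte that selects one row of a user-defined character.
   char_index: 0..7 (which custom character), row: 0..7 (top to bottom).
   The character occupies CGRAM addresses char_index*8 .. char_index*8+7. */
unsigned char cgram_address_command(unsigned char char_index, unsigned char row)
{
    return (unsigned char)(0x40 | ((char_index & 0x07) << 3) | (row & 0x07));
}

/* Example 5x8 pattern (an arrow pointing up); only bits 4..0 are used. */
const unsigned char arrow_up[8] = {
    0x04, /* ..#.. */
    0x0E, /* .###. */
    0x15, /* #.#.# */
    0x04, /* ..#.. */
    0x04, /* ..#.. */
    0x04, /* ..#.. */
    0x04, /* ..#.. */
    0x00  /* ..... */
};
```

To define the symbol, a program would send cgram_address_command(n, 0) as a command and then write the 8 pattern bytes as data, relying on the automatic address increment.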

LCD Basic Commands:


All data transferred to the LCD through the lines D0-D7 is interpreted either as a command or as data, depending on the logic state of the RS pin:

RS = 1 - Bits D0-D7 are addresses of the characters to be displayed. The LCD processor addresses one character from the character map and displays it. The DDRAM address specifies the location at which the character is to be displayed. This address is either defined before the character is transferred, or the address of the previously transferred character is automatically incremented.

RS = 0 - Bits D0 - D7 are commands which determine the display mode. The commands recognized by the LCD are given in the table below:

Command                     RS  RW  D7  D6  D5  D4  D3   D2   D1   D0   Execution Time
Clear display               0   0   0   0   0   0   0    0    0    1    1.64 ms
Cursor home                 0   0   0   0   0   0   0    0    1    x    1.64 ms
Entry mode set              0   0   0   0   0   0   0    1    I/D  S    40 µs
Display on/off control      0   0   0   0   0   0   1    D    U    B    40 µs
Cursor/Display Shift        0   0   0   0   0   1   D/C  R/L  x    x    40 µs
Function set                0   0   0   0   1   DL  N    F    x    x    40 µs
Set CGRAM address           0   0   0   1   CGRAM address               40 µs
Set DDRAM address           0   0   1   DDRAM address                   40 µs
Read “BUSY” flag (BF)       0   1   BF  DDRAM address                   -
Write to CGRAM or DDRAM     1   0   D7  D6  D5  D4  D3   D2   D1   D0   40 µs
Read from CGRAM or DDRAM    1   1   D7  D6  D5  D4  D3   D2   D1   D0   40 µs

I/D   1 = Increment (by 1)        R/L   1 = Shift right
      0 = Decrement (by 1)              0 = Shift left

S     1 = Display shift on        DL    1 = 8-bit interface
      0 = Display shift off             0 = 4-bit interface

D     1 = Display on              N     1 = Display in two lines
      0 = Display off                   0 = Display in one line

U     1 = Cursor on               F     1 = Character format 5x10 dots
      0 = Cursor off                    0 = Character format 5x7 dots

B     1 = Cursor blink on         D/C   1 = Display shift
      0 = Cursor blink off              0 = Cursor shift
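
The command bytes used later in this chapter can be derived directly from the bit positions in the table above. The following sketch builds three of them from their flags; the function names are hypothetical and the code only computes the byte, it does not talk to the LCD.

```c
/* Function set: 0 0 1 DL N F x x */
unsigned char lcd_function_set(int dl, int n, int f)
{
    return (unsigned char)(0x20 | (dl ? 0x10 : 0) | (n ? 0x08 : 0) | (f ? 0x04 : 0));
}

/* Display on/off control: 0 0 0 0 1 D U B */
unsigned char lcd_display_control(int d, int u, int b)
{
    return (unsigned char)(0x08 | (d ? 0x04 : 0) | (u ? 0x02 : 0) | (b ? 0x01 : 0));
}

/* Entry mode set: 0 0 0 0 0 1 I/D S */
unsigned char lcd_entry_mode(int id, int s)
{
    return (unsigned char)(0x04 | (id ? 0x02 : 0) | (s ? 0x01 : 0));
}
```

For example, an 8-bit interface with two lines and the 5x7 font (DL = 1, N = 1, F = 0) gives 0x38, and display on with the cursor off gives 0x0C, which are the values listed in the command code table that follows.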

LCD Command codes:

LCD Command (HEX) Description

01 Clear Screen
02 Return Home
04 Shift Cursor to left (decrement cursor)
05 Shift display right
06 Shift cursor to right (increment cursor)
07 Shift display left
08 Display off, Cursor off
0A Display off, Cursor on
0C Display on, Cursor off
0E Display on, Cursor on
0F Display on, Cursor on, Cursor blinking
10 Shift cursor position left
14 Shift cursor position right
18 Shift the display position left
1C Shift the display position right
80 Cursor to begin from 1st line
C0 Cursor to begin from 2nd line
38 2lines, 5x7 Dot matrix display
Table: LCD Command codes

Function of Busy flag:


Compared to the microcontroller, the LCD is an extremely slow component. Because of this, it was necessary to provide a signal which indicates, once a command has been executed, that the display is ready to receive new data. That signal, called the busy flag, can be read from line D7. When the BF bit is cleared (BF=0), the display is ready to receive new data.

LCD Connection
Depending on how many lines are used for connecting the LCD to the microcontroller, there are 8-bit and 4-bit LCD modes. The appropriate mode is selected at the beginning of the operation. This process is called “initialization”. 8-bit LCD mode uses lines D0-D7 to transfer data in the way explained above.
The main purpose of 4-bit LCD mode is to save valuable I/O pins of the microcontroller. Only the 4 higher bits (D4-D7) are used for communication, while the others may be left unconnected. Each byte is sent to the LCD in two steps: the four higher bits are sent first, then the four lower bits, both through the lines D4-D7. Initialization enables the LCD to link and interpret the received bits correctly. Data is rarely read from the LCD (it is mainly transferred from the microcontroller to the LCD), so it is often possible to save an extra I/O pin by simply connecting the R/W pin to ground. Such saving has its price. Messages will be displayed normally, but it will not be possible to read the busy flag, since it is not possible to read from the display at all.

Fortunately, there is a simple solution. After sending a character or a command it is important to give the LCD enough time to do its job. Since execution of the slowest command takes approximately 1.64 ms, it is sufficient to wait approximately 2 ms for the LCD.
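
The two-step transfer used in 4-bit mode amounts to splitting each byte into two nibbles. The sketch below shows only that split; the function name split_nibbles is hypothetical, and placing each nibble on D4-D7 and pulsing E is left to the port code of the particular board.

```c
/* Split a byte into the two 4-bit values sent in 4-bit LCD mode.
   *high is sent first, then *low; each is a value from 0 to 15. */
void split_nibbles(unsigned char value, unsigned char *high, unsigned char *low)
{
    *high = (unsigned char)((value >> 4) & 0x0F);
    *low  = (unsigned char)(value & 0x0F);
}
```

For the character “P” (0101 0000), the first transfer carries 0101 and the second carries 0000.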

LCD Initialization
The LCD is automatically cleared when powered up. This takes approximately 15 ms. After that, the display is ready for operation. The mode of operation is set by default. It means that:

1. Display is cleared
2. Mode
o DL = 1 Communication through 8-bit interface
o N = 0 Messages are displayed in one line
o F = 0 Character font 5 x 8 dots
3. Display/Cursor on/off
o D = 0 Display off
o U = 0 Cursor off
o B = 0 Cursor blink off
4. Character entry
o ID = 1 Displayed addresses are automatically incremented by 1
o S = 0 Display shift off

Automatic reset is in most cases performed without any problems. In most cases, but not always! If for any reason the power supply voltage does not reach its full value within 10 ms, the display will start to behave completely unpredictably. If the power supply is not able to meet this condition, or if completely safe operation is required, the initialization procedure is applied. Initialization, among other things, causes a new reset, enabling the display to operate normally.
Refer to the figure below for the procedure on 8-bit initialization:

Example program: This program displays the message “VASUNDHARA TECHNOLOGIES” on a 16x2 LCD display.

#include <LPC214x.H> /* LPC214x definitions */

/* Declarations that would normally be kept in lcd.h */
#ifndef _LCD_H
#define _LCD_H

#define TRUE 1
#define FALSE 0

#define LINE1 0x80 /* command: cursor to the start of the 1st line */
#define LINE2 0xC0 /* command: cursor to the start of the 2nd line */

#define CONTROL_REG 0x00
#define DATA_REG 0x01

void delay(unsigned int count);
void init_lcd(void);
void lcd_command_write(unsigned char command);
void lcd_data_write(unsigned char data);
void lcd_putstring(char *string);
void lcd_putstring16(unsigned char line, char *string);
void lcd_clear(void);
int lcd_gotoxy(unsigned char x, unsigned char y);
void lcd_putchar(unsigned char c);
#endif

/* Port 0 registers used for the LCD data and control lines */
#define LCD_DATA_DIR IO0DIR
#define LCD_DATA_SET IO0SET
#define LCD_DATA_CLR IO0CLR
#define LCD_CTRL_DIR IO0DIR
#define LCD_CTRL_SET IO0SET
#define LCD_CTRL_CLR IO0CLR
#define LCDEN (1 << 2) /* E on P0.2 */
#define LCDRS (1 << 3) /* RS on P0.3 */
#define LCD_DATA_MASK 0x007F8000 /* D0-D7 on P0.15-P0.22 */
int main (void)
{
init_lcd();
while(1)
{
lcd_putstring16(0,"# VASUNDHARA #");
lcd_putstring16(1,"# TECHNOLOGIES #");
delay(50000);
}
}
/************************************************************************************
Function Name : init_lcd()
Description :
Input :
Output : Void
************************************************************************************/
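void init_lcd( void )
{
    LCD_CTRL_DIR |= ( LCDEN | LCDRS ); /* control lines as outputs */
    LCD_CTRL_CLR = ( LCDEN | LCDRS ); /* writing 1s to IOCLR drives the pins low */
    LCD_DATA_DIR |= LCD_DATA_MASK; /* data lines as outputs */

    delay(1000); /* wait for the LCD to power up */
    lcd_command_write(0x38); /* 8-bit interface, two line, 5X7 dots. */
    lcd_command_write(0x38); /* repeated as part of the initialization procedure */
    lcd_command_write(0x38);
    lcd_command_write(0x10); /* shift cursor position left */
    lcd_command_write(0x0C); /* display on, cursor off */
    lcd_command_write(0x06); /* entry mode: increment cursor, no display shift */
    lcd_command_write(0x01); /* clear display */
    delay(1000);
}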
/************************************************************************************
Function Name : lcd_command_write()
Description :
Input :
Output : Void
************************************************************************************/
void lcd_command_write( unsigned char command )
{
    unsigned int temp=0;
    temp=(command << 15) & LCD_DATA_MASK; /* align the byte with P0.15-P0.22 */
    LCD_DATA_CLR = LCD_DATA_MASK; /* clear all data lines */
    LCD_DATA_SET = temp; /* set the lines that must be 1 */
    LCD_CTRL_CLR = LCDRS; /* RS = 0: command */
    LCD_CTRL_SET = LCDEN; /* E high */
    delay(5);
    LCD_CTRL_CLR = LCDEN; /* E from 1 to 0 transfers the byte */
    delay(5);
}

/************************************************************************************
Function Name : lcd_data_write()
Description :
Input :
Output : Void
************************************************************************************/
void lcd_data_write( unsigned char data )
{
    unsigned int temp=0;
    temp=(data << 15) & LCD_DATA_MASK; /* align the byte with P0.15-P0.22 */
    LCD_DATA_CLR = LCD_DATA_MASK; /* clear all data lines */
    LCD_DATA_SET = temp; /* set the lines that must be 1 */
    LCD_CTRL_SET = LCDEN | LCDRS; /* RS = 1: data, E high */
    delay(5);
    LCD_CTRL_CLR = LCDEN; /* E from 1 to 0 transfers the byte */
    delay(5);
}
/************************************************************************************
Function Name : lcd_clear()
Description :
Input :
Output : Void
************************************************************************************/
void lcd_clear( void)
{
lcd_command_write( 0x01 );
}
/************************************************************************************
Function Name : lcd_gotoxy()
Description :
Input :
Output : Void
************************************************************************************/
int lcd_gotoxy( unsigned char x, unsigned char y)
{
    unsigned char retval = TRUE;
    if( (x > 1) || (y > 15) ) /* x is the line (0 or 1), y is the column (0 to 15) */
    {
        retval = FALSE;
    }
    else
    {
        if( x == 0 ) lcd_command_write( 0x80 + y );
        else if( x==1 ) lcd_command_write( 0xC0 + y );
    }
    return retval;
}

/************************************************************************************
Function Name : lcd_putchar()
Description :
Input :
Output : Void
************************************************************************************/
void lcd_putchar( unsigned char c )
{
lcd_data_write( c );
}
/************************************************************************************
Function Name : lcd_putstring()
Description :
Input :
Output : Void
************************************************************************************/
void lcd_putstring( char *string )
{
while(*string != '\0')
{
lcd_putchar( *string );
string++;
}
}
/************************************************************************************
Function Name : lcd_putstring16()
Description :
Input :
Output : Void
************************************************************************************/
void lcd_putstring16( unsigned char line, char *string )
{
unsigned char len = 16;
lcd_gotoxy( line, 0 );
while(*string != '\0' && len--)
{
lcd_putchar( *string );
string++;
}
}
/****************************************************************************
Function Name : delay()
Description :
Input :
Output : void
********************************************************************************/
void delay(unsigned int count)
{
    volatile unsigned int i; /* volatile keeps the empty loop from being optimized away */
    unsigned int j;
    for(j=0;j<count;j++)
    {
        for(i=0;i<120;i++);
    }
}

Interfacing Seven Segment Display:

Basically, an LED display is nothing more than several LEDs molded in the same plastic case. There are many types of displays, composed of several dozen built-in diodes, which can display different symbols.

The most commonly used is the so-called 7-segment display. It is composed of 8 LEDs: 7 segments arranged as a rectangle for displaying symbols, and an additional segment for displaying the decimal point. In order to simplify connection, either the anodes or the cathodes of all diodes are connected to a common pin, so that there are common anode displays and common cathode displays, respectively. Segments are marked with the letters A to G, plus dp, as shown in the figure. When connecting, each diode is treated separately, which means that each must have its own current limiting resistor.
Displays connected to the microcontroller usually occupy a large number of valuable I/O pins, which can be a big problem, especially if multi-digit numbers need to be displayed. The problem is more than obvious if, for example, two 6-digit numbers need to be displayed (12 digits with 8 segments each means that 96 output pins are needed in this case). The solution to this problem is called MULTIPLEXING. It creates an optical illusion based on the same principle as film projection. Only one digit is active at a time, but the digits are switched so quickly that all digits of a number appear to be active simultaneously.

Here is an explanation of the figure above. First a byte representing units is applied to a microcontroller port and transistor T1 is activated at the same time. After a while, transistor T1 is turned off, a byte representing tens is applied to the port and transistor T2 is activated. This process is repeated cyclically at high speed for all digits and corresponding transistors.

The fact that the microcontroller is just a kind of miniature computer designed to understand only the language of zeros and ones is fully expressed when displaying any digit. Namely, the microcontroller doesn't know what units, tens or hundreds are, nor what the ten digits we are used to look like. Therefore, each number to be displayed must be prepared in the following way:

First of all, a multi-digit number must be split into units, tens etc. in a particular subroutine. Then each of these digits must be stored in a separate byte. Digits get their familiar format by performing “masking”. In other words, the binary value of each digit is replaced by a different combination of bits in a simple subroutine. For example, with segments that light on a logic one, the digit 8 (0000 1000) is replaced by the binary number 0111 1111 in order to activate all LEDs displaying the digit 8. The only diode remaining inactive in this case is reserved for the decimal point. If a microcontroller port is connected to the display in such a way that bit 0 activates segment “a”, bit 1 activates segment “b”, bit 2 segment “c” etc., then the “mask” for each digit follows from the table below, in which a 0 marks a segment that is lit.

Digit to display   Display segments
                   dp  a   b   c   d   e   f   g
0                  1   0   0   0   0   0   0   1
1                  1   1   0   0   1   1   1   1
2                  1   0   0   1   0   0   1   0
3                  1   0   0   0   0   1   1   0
4                  1   1   0   0   1   1   0   0
5                  1   0   1   0   0   1   0   0
6                  1   0   1   0   0   0   0   0
7                  1   0   0   0   1   1   1   1
8                  1   0   0   0   0   0   0   0
9                  1   0   0   0   0   1   0   0

In addition to the digits from 0 to 9, some letters of the alphabet - A, C, E, J, F, U, H, L, b, c, d, o, r, t - can also be displayed by performing appropriate masking.
In the event that common cathode displays are used, all ones in the table should be replaced by zeros and vice versa. Additionally, NPN transistors should be used as drivers.
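
The two preparation steps described above, splitting a number into digits and masking each digit, can be sketched as follows. The function names are hypothetical. The masks assume that the table columns map to port bits with dp as bit 7 and g as bit 0, and that a 0 lights a segment; this bit order is an assumption made for illustration and differs from the board-specific dig[] table used in the example programs below.

```c
/* Masks taken row by row from the table: dp a b c d e f g (0 = segment lit). */
static const unsigned char seg_mask_table[10] = {
    0x81, 0xCF, 0x92, 0x86, 0xCC, 0xA4, 0xA0, 0x8F, 0x80, 0x84
};

/* Return the segment mask for one digit. For a common cathode display
   every bit is inverted, as described in the text. Digits above 9
   return a blank display. */
unsigned char seg_mask(unsigned char digit, int common_cathode)
{
    unsigned char mask = (digit <= 9) ? seg_mask_table[digit] : 0xFF;
    return common_cathode ? (unsigned char)~mask : mask;
}

/* Split a number (0..9999) into four decimal digits.
   digits[0] = units, digits[1] = tens, digits[2] = hundreds,
   digits[3] = thousands. */
void split_digits(unsigned int value, unsigned char digits[4])
{
    int i;
    for (i = 0; i < 4; i++) {
        digits[i] = (unsigned char)(value % 10);
        value /= 10;
    }
}
```

A multiplexing loop would call split_digits once per update and then output seg_mask(digits[n], ...) while enabling only the transistor of digit n.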

Example Program 1: This program counts up and down between 00 and 99 when two keys (UP and DOWN) are pressed, using a 2-digit seven segment display connected to the LPC2148.

#include <LPC214x.H> /* LPC214x definitions */

#define SEG7_CTRL_DIR IO0DIR
#define SEG7_CTRL_SET IO0SET
#define SEG7_CTRL_CLR IO0CLR
#define LED_DATA_CLR IO0CLR
#define LED_DATA_SET IO0SET

#define DIG1 (1 << 10) //Digit 1 selection bit P0.10
#define DIG2 (1 << 11) //Digit 2 selection bit P0.11

#define KEY_CTRL_DIR IO1DIR
#define KEY_CTRL_SET IO1SET
#define KEY_CTRL_CLR IO1CLR
#define KEY_CTRL_PIN IO1PIN

#define INC (1 << 16) //KEY1 P1.16
#define DEC (1 << 20) //KEY5 P1.20

#define LED_DATA_MASK 0x007F8000 //Segment data lines P0.15 to P0.22

void seg7_data_write( unsigned char data );

////////// digits are created using bit patterns corresponding to the segments ////////////////////
unsigned char dig[] = {0x88,0xeb,0x4c,0x49,0x2b,0x19,0x18,0xcb,0x8,0x9,0xa,0x38,0x98,0x68,0x1c,0x1e};

///////////////////////////////// MAIN ///////////////////////////////////////


int main (void)
{
    unsigned char count=0;
    unsigned char qot=0;
    unsigned char rem=0;
    volatile unsigned short j=0;
    unsigned short i=0;
    KEY_CTRL_DIR &= ~(INC | DEC); // Set INC and DEC lines as Inputs
    SEG7_CTRL_DIR |= ( DIG1 | DIG2 | LED_DATA_MASK ); //Set digit control and segment data lines as Outputs
    SEG7_CTRL_CLR = ( DIG1 | DIG2 ); //Clear Digit control lines

    while(1)
    {
        if (!(KEY_CTRL_PIN & INC)) //INC key pressed (reads 0 when pressed)
        {
            if (count < 99) count++;
        }
        else if (!(KEY_CTRL_PIN & DEC)) //DEC key pressed (reads 0 when pressed)
        {
            if (count > 0) count--;
        }
        qot = count / 10; // tens digit
        rem = count % 10; // units digit
        for (i=0; i < 200; i++) //change to inc/dec speed of count
        {
            seg7_data_write(dig[qot]); //display the quotient (tens), enabled by DIG2
            SEG7_CTRL_SET = DIG2;
            for (j=0;j<500;j++); //change to inc/dec brightness of display
            SEG7_CTRL_CLR = DIG2;

            seg7_data_write(dig[rem]); //display the remainder (units), enabled by DIG1
            SEG7_CTRL_SET = DIG1;
            for (j=0;j<500;j++); //change to inc/dec brightness of display
            SEG7_CTRL_CLR = DIG1;
        }
    }
}
/****************************************************************************************
Function Name : seg7_data_write()
Description : Function to write data on the cathode segment lines
Input :
Output : Void
****************************************************************************************/
void seg7_data_write( unsigned char data )
{
    unsigned int temp=0;
    temp=(data << 15) & LED_DATA_MASK; // align the pattern with P0.15 to P0.22
    LED_DATA_CLR = LED_DATA_MASK; // clear all segment lines
    LED_DATA_SET = temp; // set the lines that must be 1
}

Example Program 2: This program counts up and down between 0000 and 9999 when two keys (UP and DOWN) are pressed, using a 4-digit seven segment display connected to the LPC2148.

#include <LPC214x.H> /* LPC214x definitions */

#define SEG7_CTRL_DIR IO0DIR
#define SEG7_CTRL_SET IO0SET
#define SEG7_CTRL_CLR IO0CLR
#define LED_DATA_CLR IO0CLR
#define LED_DATA_SET IO0SET

#define DIG1 (1 << 10) //Digit 1 selection bit P0.10
#define DIG2 (1 << 11) //Digit 2 selection bit P0.11
#define DIG3 (1 << 12) //Digit 3 selection bit P0.12
#define DIG4 (1 << 13) //Digit 4 selection bit P0.13

#define KEY_CTRL_DIR IO1DIR
#define KEY_CTRL_SET IO1SET
#define KEY_CTRL_CLR IO1CLR
#define KEY_CTRL_PIN IO1PIN
#define INC (1 << 16) //KEY1 P1.16
#define DEC (1 << 20) //KEY5 P1.20

#define LED_DATA_MASK 0x007F8000 //Segment data lines P0.15 to P0.22

void seg7_data_write( unsigned char data );

////////// digits are created using bit patterns corresponding to the segments ////////////////////
unsigned char dig[] = {0x88,0xeb,0x4c,0x49,0x2b,0x19,0x18,0xcb,0x8,0x9,0xa,0x38,0x98,0x68,0x1c,0x1e};

///////////////////////////////// MAIN ///////////////////////////////////////


int main (void)
{
    unsigned short count=0; // must hold values up to 9999
    unsigned short rem1=0;
    unsigned short rem2=0;
    unsigned char qot=0;
    unsigned char qot1=0;
    unsigned char qot2=0;
    unsigned char rem=0;
    volatile unsigned short j=0;
    unsigned short i=0;
    KEY_CTRL_DIR &= ~(INC | DEC); // Set INC and DEC lines as Inputs
    SEG7_CTRL_DIR |= ( DIG1 | DIG2 | DIG3 | DIG4 | LED_DATA_MASK ); //Set digit control and segment data lines as Outputs
    SEG7_CTRL_CLR = ( DIG1 | DIG2 | DIG3 | DIG4 ); //Clear Digit control lines

    while(1)
    {
        if (!(KEY_CTRL_PIN & INC)) //INC key pressed
        {
            if (count < 9999) count++;
        }
        else if (!(KEY_CTRL_PIN & DEC)) //DEC key pressed
        {
            if (count > 0) count--;
        }
        qot = count / 1000; // thousands
        rem1 = count % 1000;
        qot1 = rem1 / 100; // hundreds
        rem2 = rem1 % 100;
        qot2 = rem2 / 10; // tens
        rem = rem2 % 10; // units

        for (i=0; i < 200; i++) //change to inc/dec speed of count
        {
            seg7_data_write(dig[qot]); //display quotient on digit4
            SEG7_CTRL_SET = DIG4;
            for (j=0;j<500;j++); //change to inc/dec brightness of display
            SEG7_CTRL_CLR = DIG4;

            seg7_data_write(dig[qot1]); //display quotient1 on digit3
            SEG7_CTRL_SET = DIG3;
            for (j=0;j<500;j++); //change to inc/dec brightness of display
            SEG7_CTRL_CLR = DIG3;

            seg7_data_write(dig[qot2]); //display quotient2 on digit2
            SEG7_CTRL_SET = DIG2;
            for (j=0;j<500;j++); //change to inc/dec brightness of display
            SEG7_CTRL_CLR = DIG2;

            seg7_data_write(dig[rem]); //display remainder on digit1
            SEG7_CTRL_SET = DIG1;
            for (j=0;j<500;j++); //change to inc/dec brightness of display
            SEG7_CTRL_CLR = DIG1;
        }
    }
}
/****************************************************************************************
Function Name : seg7_data_write()
Description : Function to write data on the cathode segment lines
Input :
Output : Void
****************************************************************************************/
void seg7_data_write( unsigned char data )
{
    unsigned int temp=0;
    temp=(data << 15) & LED_DATA_MASK; // align the pattern with P0.15 to P0.22
    LED_DATA_CLR = LED_DATA_MASK; // clear all segment lines
    LED_DATA_SET = temp; // set the lines that must be 1
}

HEX Key Pad Interfacing:

The hex keypad is a peripheral that is organized in rows and columns. It has 16 keys arranged in a 4 by 4 grid, labeled with the hexadecimal digits 0 to F. An example of this can be seen in Figure 1, below. Internally, the structure of the hex keypad is very simple. Wires run in vertical columns (we call them C0 to C3) and in horizontal rows (called R0 to R3). These 8 wires are available externally, and will be connected to the lower 8 bits of the port. Each key on the keypad is essentially a switch that connects a row wire to a column wire. When a key is pressed, it makes an electrical connection between the row and column. The internal structure of the hex keypad is shown in the figure below.

[Figure: Hex keypad. Switches S1 to S16 arranged in a 4 by 4 matrix on lines ROW1 to ROW4, with a 10k resistor array pulling the lines up to 3.3V]


At this point, you may be wondering exactly where the signals on the hex keypad come from. The keys
just create a short between a row and column wire when pressed.
Reading Values from the Hex Keypad
It is tempting to view the hex keypad as a peripheral which just tells us which key was pressed, and all we
have to do is read the value via the GPIO port. This is the wrong view to take. The hex keypad is just a way for a
user to interact with the Microcontroller board. As described in the previous section, all the keypad does is make
electrical connections between rows and columns – it is up to your program to determine from that which key was
pressed. The hex keypad is connected to the Microcontroller through the GPIO parallel ports. We need to
remember a few things about the GPIO ports in order to read and interpret hex keypad input properly.
Recall that each pin can be configured individually as input or output. Furthermore, the port direction can
be reconfigured by your program, so that the inputs and outputs can be changed while your program is running.
Finally, remember that each pin in the HEX keypad is connected to a pull-up resistor, so any input coming from
the hex keypad will read a 1 by default (i.e. when a key is not being pressed). These facts, coupled with our
knowledge of how the row and column wires of the hex keypad are wired up to the Microcontroller board, will
allow us to determine which key has been pressed. If we treat all the hex keypad wires as inputs, we will always
read in a 0xFF, since there is nothing driving those wires – they are unconnected. Even when a key is pressed, the
effect is of connecting one input port to another, so the pull-up resistors will always output a 1.
The basic concept is that, since all the hex keypad wires are connected to GPIO ports, we need to use some of those wires to output values, and some to read in values. If we output values onto the columns, say, then when we read from the rows, if a key is pressed there will be a short between a row and a column and we will read in whatever value we have set the column to output. Rows in which no key is pressed will be unconnected, and thus read in as a 1.
Consequently, in order to be able to differentiate between unconnected inputs and inputs for which a key has been
pressed so that they are reading the value put onto a column, we need to always output 0. Thus, we should write 0
to all of the column wires, and then read in from the row wires. If no key is pressed, the row wires will all be
unconnected, so that we will read a 1 value on pins 0 to 3. However, if a key is pressed, one of the row wires will
be shorted with a column wire, and will thus have whatever value is on that wire (i.e. 0). Thus, we can identify the
row of which key was pressed by reading the row values and finding which is 0.
This could also work if we treated the rows as outputs and the columns as inputs. In that case, we would output 0
to the rows, and read in from the columns, and whichever column wire was 0 would indicate the column in which
the pressed key resides. So, being able to identify the row or column of a pressed key helps, but still does not tell us
which key was pressed. The trick is that if you know both the row and the column, you can determine which key
was pressed from their intersection. This means that we must take advantage of the ability to change the direction
of the individual pins within our program.
First, we set one half of the lower 8 bits as inputs (bits 0 to 3, say) and the other half as outputs (bits 4 to 7), and get one
value (the row, in this case). Then we set them the other way around, get the other value, and determine
which key was pressed.
So, for example, if we read in that row wire R2 is 0 and column wire C3 is 0, we know that the B key on the hex
keypad was pressed. There are a number of different ways you can write code
that will figure out which key was pressed based on the row and column numbers.
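One simple way is a lookup table indexed by row and column. The sketch below assumes the key layout shown in the comment block of the example program (row 1 holds 0 to 3, row 4 holds C to F) and zero-based indices, so R2 and C3 select the B key. The name `keypad_decode` is hypothetical and is not part of any library.

```c
/* Key legends, assuming the 4x4 layout 0-3 / 4-7 / 8-B / C-F (row-major). */
static const char KEY_LAYOUT[4][4] = {
    {'0', '1', '2', '3'},
    {'4', '5', '6', '7'},
    {'8', '9', 'A', 'B'},
    {'C', 'D', 'E', 'F'}
};

/* Hypothetical helper: map zero-based row and column indices to a key legend.
 * Returns '\0' when either index is out of range. */
char keypad_decode(unsigned int row, unsigned int col)
{
    if (row >= 4 || col >= 4)
        return '\0';
    return KEY_LAYOUT[row][col];
}
```

If the board wires the keypad differently, only the table needs to change.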
Thus, to summarize, the following steps should be followed in order to determine which key on
the hex keypad has been pressed.
1. Configure the GPIO pins connected to row wires R0 to R3 as inputs and the pins connected to column wires
C0 to C3 as outputs. (That is, pins 0–3 are inputs, and 4–7 are outputs.)
2. Set the MSB of the column nibble (column bit 3) to 0 by sending 0x7 (0111). Read the input (row bits 0–3);
a row bit that reads 0 indicates the row of the pressed key. If no row bit is 0, go to the next column.
3. Set the next column bit (column bit 2) to 0 by sending 0xB (1011). Read the input (row bits 0–3);
a row bit that reads 0 indicates the row of the pressed key. If no row bit is 0, go to the next column.
4. Set the next column bit (column bit 1) to 0 by sending 0xD (1101). Read the input (row bits 0–3);
a row bit that reads 0 indicates the row of the pressed key. If no row bit is 0, go to the next column.
5. Set the LSB of the column nibble (column bit 0) to 0 by sending 0xE (1110). Read the input (row bits 0–3);
a row bit that reads 0 indicates the row of the pressed key. If no row bit is 0, repeat the scan from step 2.
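The scanning steps can be sketched in C that runs on a PC. This is an illustration only: `read_rows_fn`, `keypad_scan` and the simulated keypad are hypothetical names, and the callback stands in for the port writes and reads that the real board would perform.

```c
/* Hypothetical callback: drive the 4-bit column pattern, return the 4-bit row value. */
typedef unsigned int (*read_rows_fn)(unsigned int col_pattern);

/* Scan the keypad once. Returns row * 4 + col (0..15) of the first key found,
 * or -1 when no key is pressed. Column bit 3 is driven low first. */
int keypad_scan(read_rows_fn read_rows)
{
    static const unsigned int patterns[4] = {0x7, 0xB, 0xD, 0xE};
    unsigned int step, row;

    for (step = 0; step < 4; step++) {
        unsigned int rows = read_rows(patterns[step]) & 0xF;
        unsigned int col = 3 - step;          /* pattern 0x7 pulls column 3 low */
        for (row = 0; row < 4; row++) {
            if (!(rows & (1u << row)))        /* a 0 marks the pressed row */
                return (int)(row * 4 + col);
        }
    }
    return -1;
}

/* Simulated keypad for testing on a PC: one key held at (sim_row, sim_col). */
static int sim_row = -1, sim_col = -1;

unsigned int sim_read_rows(unsigned int col_pattern)
{
    unsigned int rows = 0xF;                  /* pull-ups: idle rows read 1 */
    if (sim_row >= 0 && !(col_pattern & (1u << sim_col)))
        rows &= ~(1u << sim_row);             /* pressed key shorts its row to the low column */
    return rows;
}
```

Passing the pin access in as a callback keeps the scan logic testable without hardware.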
Example Program:
#include <LPC214x.H> /* LPC214x definitions */
// Matrix Keypad Scanning Routine
//
// COL1 COL2 COL3 COL4
// 0 1 2 3 ROW 1
// 4 5 6 7 ROW 2
// 8 9 A B ROW 3
// C D E F ROW 4

#define SEG7_CTRL_DIR IO0DIR
#define SEG7_CTRL_SET IO0SET
#define SEG7_CTRL_CLR IO0CLR

#define COL1 (1 << 16) // Column 1 is connected to P1.16
#define COL2 (1 << 17) // Column 2 is connected to P1.17
#define COL3 (1 << 18) // Column 3 is connected to P1.18
#define COL4 (1 << 19) // Column 4 is connected to P1.19

#define ROW1 (1 << 20) // Row 1 is connected to P1.20
#define ROW2 (1 << 21) // Row 2 is connected to P1.21
#define ROW3 (1 << 22) // Row 3 is connected to P1.22
#define ROW4 (1 << 23) // Row 4 is connected to P1.23

#define COLMASK (COL1 | COL2 | COL3 | COL4)
#define ROWMASK (ROW1 | ROW2 | ROW3 | ROW4)

#define KEY_CTRL_DIR IO1DIR
#define KEY_CTRL_SET IO1SET
#define KEY_CTRL_CLR IO1CLR
#define KEY_CTRL_PIN IO1PIN

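/////////////// COLUMN WRITE /////////////////////
// Drive the four column lines (P1.16-P1.19) with the low nibble of data.
void col_write( unsigned char data )
{
    unsigned int temp=0;

    temp=((unsigned int)data << 16) & COLMASK;
    // IOxSET and IOxCLR act only on bits written as 1, so a plain write is
    // enough; this also avoids reading the write-only IOxCLR register.
    KEY_CTRL_CLR = COLMASK;
    KEY_CTRL_SET = temp;
}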
///////////////////////////////// MAIN ///////////////////////////////////////
int main (void)
{
    unsigned char key, i;
    unsigned char rval[] = {0x7,0xB,0xD,0xE,0x0}; // column patterns: one column low at a time
    unsigned char keyPadMatrix[] =
    {
        'C','8','4','0',
        'D','9','5','1',
        'E','A','6','2',
        'F','B','7','3'
    };

    init_lcd(); // the LCD routines are assumed to come from the board's LCD driver

    KEY_CTRL_DIR |= COLMASK; //Set COLs as Outputs
    KEY_CTRL_DIR &= ~(ROWMASK); // Set ROW lines as Inputs
    while (1)
    {
        key = 0;
        for( i = 0; i < 4; i++ )
        {
            // drive one COL output low at a time
            col_write(rval[i]);
            // read rows - break when key press detected
            if (!(KEY_CTRL_PIN & ROW1))
                break;
            key++;
            if (!(KEY_CTRL_PIN & ROW2))
                break;
            key++;
            if (!(KEY_CTRL_PIN & ROW3))
                break;
            key++;
            if (!(KEY_CTRL_PIN & ROW4))
                break;
            key++;
        }
        if (key == 0x10) // all 16 positions scanned, no key pressed
            lcd_putstring16(1,"Key Pressed = ");
        else
        {
            lcd_gotoxy(1,14);
            lcd_putchar(keyPadMatrix[key]);
        }
    }
}

Stepper Motor Interfacing:

A motor is a device that translates electrical energy into mechanical motion.
Types of motor are:
1. Stepper Motor
2. DC Motor
3. AC Motor
A stepper motor is a special type of electric motor that moves in increments, or steps, rather than
turning smoothly as a conventional motor does. Typical increments are 0.9 or 1.8 degrees, with 400 or 200
increments thus representing a full circle. The speed of the motor is determined by the time delay between each
incremental movement.
Two types of stepper motor are:
1. Permanent Magnet (PM)
2. Variable Reluctance (VR)
Key properties of a stepper motor:
• The motor moves each time a pulse is received
• Movement (direction and amount) can be controlled easily
• The motor can be forced to hold its position against an opposing force

Construction
• A permanent-magnet rotor, also called the shaft
• A stator that surrounds the shaft
• Usually four stator windings, paired with a center tap, which move the rotor
Single-Coil Excitation - Each successive coil is energized in turn.
Two-Coil Excitation - Each successive pair of adjacent coils is energized in turn.
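The two excitation schemes can be written down as 4-bit coil patterns. The sketch below assumes that bits 0 to 3 correspond to four coils in order around the stator; that bit order, and the helper names, are illustrative assumptions and may not match a particular motor's wiring.

```c
/* One bit per coil; the bit order around the stator is an assumption. */
static const unsigned char SINGLE_COIL[4] = {0x1, 0x2, 0x4, 0x8};
static const unsigned char TWO_COIL[4]    = {0x3, 0x6, 0xC, 0x9};

/* Hypothetical helper: advance a step index (0..3) forward or backward. */
unsigned int next_step_index(unsigned int index, int forward)
{
    return forward ? ((index + 1) & 3) : ((index + 3) & 3);
}

/* Count how many coils a pattern energizes. */
unsigned int coils_on(unsigned char pattern)
{
    unsigned int n = 0;
    while (pattern) {
        n += pattern & 1u;
        pattern >>= 1;
    }
    return n;
}
```

Stepping through either table in the forward direction turns the rotor one way; stepping backward reverses it.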


Step Angle
• The arc through which the motor turns with ONE step change of the windings
• Varies with the model of stepper motor (depending on the number of teeth on the stator and rotor)
• Normally given in degrees
• Step angle = 360 / No. of steps per revolution
• Commonly available numbers of steps per revolution are 500, 200, 180, 144, 72, 48, 24
Revolutions per Minute (RPM)
RPM = (60 × Steps per Second) / Steps per Revolution
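The step angle and RPM relations can be checked numerically. The following is a small sketch of the two formulas, plus the rearranged form that gives the delay between steps for a target speed; the function names are hypothetical.

```c
/* Step angle in degrees for a motor with the given steps per revolution. */
double step_angle_deg(unsigned int steps_per_rev)
{
    return 360.0 / (double)steps_per_rev;
}

/* Shaft speed in RPM from the stepping rate. */
double stepper_rpm(double steps_per_second, unsigned int steps_per_rev)
{
    return 60.0 * steps_per_second / (double)steps_per_rev;
}

/* Delay between steps, in milliseconds, needed to reach a target RPM. */
double step_delay_ms(double rpm, unsigned int steps_per_rev)
{
    return 60000.0 / (rpm * (double)steps_per_rev);
}
```

For example, a 200-step motor has a 1.8 degree step angle, and stepping it 400 times per second gives 120 RPM.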

1. The top electromagnet (1) is turned on, attracting the nearest teeth of a gear-shaped iron rotor. With the
teeth aligned to electromagnet 1, they will be slightly offset from electromagnet 2.
2. The top electromagnet (1) is turned off, and the right electromagnet (2) is energized, pulling the nearest
teeth slightly to the right. This results in a rotation of 3.6° (1.8°) in this example.
3. The bottom electromagnet (3) is energized; another 3.6° (1.8°) rotation occurs.
4. The left electromagnet (4) is enabled, rotating again by 3.6° (1.8°). When the top electromagnet (1) is again
enabled, the teeth in the sprocket will have rotated by one tooth position; since there are 25 (50) teeth, it will
take 100 (200) steps to make a full rotation in this example.
Example program:

#include <LPC214x.H> /* LPC214x definitions */


#define MOTOR_CTRL_DIR IO1DIR
#define MOTOR_CTRL_SET IO1SET
#define MOTOR_CTRL_CLR IO1CLR
#define MOTOR_MASK 0x0F000000 // P1.24-P1.27
#define KEY_CTRL_DIR IO1DIR
#define KEY_CTRL_SET IO1SET
#define KEY_CTRL_CLR IO1CLR
#define KEY_CTRL_PIN IO1PIN
#define START (1 << 16) //KEY1 =P1.16
#define STOP (1 << 20) //KEY5 =P1.20
#define CLK (1 << 17) //KEY2 =P1.17
#define ACLK (1 << 21) //KEY6 =P1.21
#define INC (1 << 18) //KEY3=P1.18
#define DEC (1 << 22) //KEY7 =P1.22

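/////////////// MOTOR WRITE /////////////////////
// Drive the four motor phase lines (P1.24-P1.27) with the low nibble of data.
void motor_write( unsigned char data )
{
    unsigned int temp=0;
    temp=((unsigned int)data << 24) & MOTOR_MASK;
    MOTOR_CTRL_CLR = MOTOR_MASK; // plain write: IOxCLR is write-only
    MOTOR_CTRL_SET = temp;
}
/////////////// DELAY /////////////////////
// main() calls delay() but the listing does not define it; this uncalibrated
// busy-wait loop is an assumed stand-in.
void delay( unsigned int count )
{
    volatile unsigned int i;
    for (i = 0; i < count; i++);
}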
////////////////////// MAIN /////////////////////////////////////
int main (void)
{
unsigned char stpval=1;
unsigned char run=1;
unsigned char dir=0;
unsigned int del=100;

PINSEL2 = 0x0; // to ensure RTCK/P1.23 behaves like GPIO, no accidental JTAG


MOTOR_CTRL_DIR |= MOTOR_MASK;
KEY_CTRL_DIR &= ~(START | STOP | CLK | ACLK | INC | DEC); // Set these lines as Inputs
motor_write(stpval);
while(1)
{
if (!(KEY_CTRL_PIN & START)) //START key pressed
run = 1;
if (!(KEY_CTRL_PIN & STOP)) //STOP key pressed
run = 0;
if (!(KEY_CTRL_PIN & CLK)) //CLK key pressed
dir = 0;
if (!(KEY_CTRL_PIN & ACLK)) //ACLK key pressed
dir = 1;
if (!(KEY_CTRL_PIN & INC)) //INC key pressed
if (del != 100) del = del - 2;
if (!(KEY_CTRL_PIN & DEC)) //DEC key pressed
if (del != 10000) del = del + 2;
if (run == 1)
{
if (dir == 0)
{
if (stpval == 8) //rotate step value bitwise left
stpval = 1;
else
stpval <<= 1;
}
else
{
if (stpval == 1) //rotate step value bitwise right
stpval = 8;
else
stpval >>= 1;
}
motor_write(stpval);
delay(del);
}
}
}

DC Motor Interfacing:

Unlike a stepper motor, the DC (direct current) motor rotates continuously. It has two terminals, positive and
negative. Connecting a DC power supply to these terminals rotates the motor in one direction, and reversing the
polarity of the power supply reverses the direction of rotation.

The speed of the DC motor is measured in revolutions per minute (RPM). The speed of the DC motor increases
with an increase in the supply voltage. However, the supply voltage must not exceed the rated voltage.
UNIDIRECTIONAL CONTROL:

The figure shows the DC motor rotating clockwise (CW) and counterclockwise (CCW). It also
shows how the DC motor changes its direction of rotation when the polarity of the power supply is reversed.

BIDIRECTIONAL CONTROL
By using switches to change the polarity of the power supply, we can control the direction of rotation of the DC
motor. This is illustrated in the figure below.
An H-bridge is the simplest method:
– It uses switches (relays will do)
The table below shows some of the switch configurations.
SW3     SW2     SW1     SW0     MOTOR ROTATION
OPEN    OPEN    OPEN    OPEN    OFF (FIG A)
CLOSED  OPEN    OPEN    CLOSED  CLOCKWISE (FIG B)
OPEN    CLOSED  CLOSED  OPEN    ANTICLOCKWISE (FIG C)
CLOSED  CLOSED  CLOSED  CLOSED  INVALID (FIG D)
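The switch table can be expressed as a small decision function. This is a sketch only: the table lists just some configurations, so treating every unlisted combination as invalid is an assumption made here, and the names are hypothetical.

```c
typedef enum { MOTOR_OFF, MOTOR_CW, MOTOR_CCW, MOTOR_INVALID } motor_state;

/* Each argument is 1 for a CLOSED switch and 0 for an OPEN switch. */
motor_state hbridge_state(int sw3, int sw2, int sw1, int sw0)
{
    if (!sw3 && !sw2 && !sw1 && !sw0) return MOTOR_OFF;
    if ( sw3 && !sw2 && !sw1 &&  sw0) return MOTOR_CW;
    if (!sw3 &&  sw2 &&  sw1 && !sw0) return MOTOR_CCW;
    return MOTOR_INVALID;   /* all four closed, or a combination not in the table */
}
```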

Pulse Width Modulation (PWM):

The speed of the DC motor also depends on the load. At no load the speed is highest, and as the load increases the
speed decreases. The speed of a DC motor can be held constant for a given load by using the Pulse
Width Modulation (PWM) technique. Changing the width of the pulse applied to the DC motor varies the power
applied, so the motor speed can be increased or decreased: the wider the pulse, the faster the speed; the
narrower the pulse, the slower the speed. The LPC2148 has a built-in PWM unit. For microcontrollers without a PWM
circuit, different duty cycles are generated in software.
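To illustrate how a duty cycle is produced in software, the sketch below models one PWM period as a run of ticks, with the output on for the first `duty` ticks. The function names and the tick-counting model are assumptions for illustration, not the LPC2148 PWM peripheral.

```c
/* Output level (1 = on, 0 = off) for one tick of a software PWM.
 * tick counts 0..period-1; duty is the number of on-ticks per period. */
int pwm_level(unsigned int tick, unsigned int duty)
{
    return tick < duty ? 1 : 0;
}

/* Duty cycle in percent delivered over one full period. */
unsigned int pwm_duty_percent(unsigned int duty, unsigned int period)
{
    unsigned int tick, on = 0;
    for (tick = 0; tick < period; tick++)
        on += (unsigned int)pwm_level(tick, duty);
    return (on * 100u) / period;
}
```

A wider pulse (larger `duty`) gives a higher percentage, which corresponds to more average power at the motor.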

Example 1. Program to control the direction of the DC motor according to the status of bit P0.7.
Assume port pins P1.16, P1.20, P1.17 and P1.21 are connected to keys to start, stop, speed increase and speed
decrease.

#include <LPC214x.H> /* LPC214x definitions */


#define MOTOR_CTRL_DIR IO1DIR
#define MOTOR_CTRL_SET IO1SET
#define MOTOR_CTRL_CLR IO1CLR
#define MOTOR_EN_DIR IO0DIR
#define MOTOR_EN_SET IO0SET
#define MOTOR_EN_CLR IO0CLR
#define MOTOR_MASK 0x0F000000 // P1.24-P1.27 motor pins
#define MOTOR_ENABLE (1 << 7) //MTC1 =P0.7 controls the motor direction
#define KEY_CTRL_DIR IO1DIR
#define KEY_CTRL_SET IO1SET
#define KEY_CTRL_CLR IO1CLR
#define KEY_CTRL_PIN IO1PIN
#define START (1 << 16) //KEY1=P1.16
#define STOP (1 << 20) //KEY5=P1.20
#define INC (1 << 17) //KEY2=P1.17
#define DEC (1 << 21) //KEY6=P1.21

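/////////////// MOTOR WRITE /////////////////////
// Drive the four motor control lines (P1.24-P1.27) with the low nibble of data.
void motor_write( unsigned char data )
{
    unsigned int temp=0;
    temp=((unsigned int)data << 24) & MOTOR_MASK;
    MOTOR_CTRL_CLR = MOTOR_MASK; // plain write: IOxCLR is write-only
    MOTOR_CTRL_SET = temp;
}
/////////////// DELAY /////////////////////
// main() calls delay() but the listing does not define it; this uncalibrated
// busy-wait loop is an assumed stand-in.
void delay( unsigned int count )
{
    volatile unsigned int i;
    for (i = 0; i < count; i++);
}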
////////////////////// MAIN /////////////////////////////////////
int main (void)
{
    unsigned int del=0;
    volatile unsigned int i=0; // volatile so the off-time loop is not optimized away
    PINSEL2 = 0x0; // to ensure RTCK/P1.23 behaves like GPIO, no accidental JTAG
    MOTOR_CTRL_DIR |= MOTOR_MASK;
    MOTOR_EN_DIR |= MOTOR_ENABLE;
    MOTOR_EN_CLR = MOTOR_ENABLE; // switch OFF both Motors
    KEY_CTRL_DIR &= ~(START | STOP | INC | DEC); // Set these lines as Inputs
    delay(30000);
    while(1)
    {
        if (!(KEY_CTRL_PIN & START)) //START key pressed
        {
            motor_write(0x05); // Motor CCW : for CW write 0x0A
            MOTOR_EN_SET = MOTOR_ENABLE; // switch ON both Motors
        }
        if (!(KEY_CTRL_PIN & STOP)) //STOP key pressed
        {
            motor_write(0x0F); // data on both control lines will Brake the motor
            MOTOR_EN_CLR = MOTOR_ENABLE; // switch OFF both Motors
        }
        if (!(KEY_CTRL_PIN & INC)) //INC key pressed
        {
            if (del != 0)
            {
                del = del - 1;
                delay(10);
            }
        }
        if (!(KEY_CTRL_PIN & DEC)) //DEC key pressed
        {
            if (del != 10000)
            {
                del = del + 1;
                delay(10);
            }
        }
        if (del == 0)
            MOTOR_EN_SET = MOTOR_ENABLE; // switch ON both Motors
        else
        {
            MOTOR_EN_CLR = MOTOR_ENABLE; // switch OFF both Motors
            for (i=0; i < del; i++); // off-time of the software PWM
            MOTOR_EN_SET = MOTOR_ENABLE; // switch ON both Motors
        }
    }
}
APPENDIX A

Alphabetical list of ARM instructions

Every ARM instruction is listed on the following pages. Each instruction description shows:
• The instruction encoding
• The instruction syntax
• The version of the ARM architecture where the instruction is valid
• Any exceptions that apply
• An example in pseudo-code of how the instruction operates
• Notes on usage and special cases.
General Notes
These notes explain the types of information and abbreviations used on the instruction pages.
Syntax abbreviations
The following abbreviations are used in the instruction pages:
(immed n) This is an immediate value, where n is the number of bits. For example, an 8-bit immediate value is
represented by: (immed 8 )
(offset n) This is an offset value, where n is the number of bits. For example, an 8-bit offset value is represented
by:(offset 8)
The same construction is used for signed offsets. For example, an 8-bit signed offset is represented by: (signed
offset 8)
Encoding diagram and assembler syntax
For the conventions used, see Assembler syntax descriptions.
Architecture versions
This gives details of architecture versions where the instruction is valid. For details, see Architecture versions and
variants.
Exceptions
This gives details of which exceptions can occur during the execution of the instruction. Prefetch Abort is not
listed in general, both because it can occur for any instruction and because if an abort occurred during instruction
fetch, the instruction bit pattern is not known. (Prefetch Abort is however listed for BKPT, since it can generate a
Prefetch Abort exception without these considerations applying.)
Operation
This gives a pseudo-code description of what the instruction does. For details of conventions used in this
pseudo-code, see Pseudo-code descriptions of instructions.
Information on usage
Usage sections are included where appropriate to supply suggestions and other information about how to use the
instruction effectively.

ADC Add with Carry

Operation (cc): Rd ← Rn + (op1) + CPSR(C)
(cc)(S): CPSR ← ALU(Flags)

Syntax ADC (cc) (S) Rd, Rn, (op1)

Description The ADC (Add with Carry) instruction adds the value of (op1) and the Carry flag to the value of
Rn and stores the result in Rd. The condition code flags are optionally updated based on the
result.

Usage ADC is used to synthesize multi-word addition. If register pairs R0, R1 and R2, R3 hold 64-bit
values (where R0 and R2 hold the least significant words) the following instructions leave the
64-bit sum in R4, R5:

ADDS R4, R0, R2
ADC R5, R1, R3
If the second instruction is changed from:
ADC R5, R1, R3
to:
ADCS R5, R1, R3
the resulting values of the flags indicate:
N The 64-bit addition produced a negative result.
C An unsigned overflow occurred.
V A signed overflow occurred.
Z The most significant 32 bits are all zero.
The following instruction produces a single-bit Rotate Left with Extend operation (33-bit rotate
through the Carry flag) on R0:
ADCS R0, R0, R0
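The ADDS/ADC pair can be modeled in C to show how the carry links the two 32-bit additions. This is a sketch of the arithmetic only; the function name is hypothetical, and the register names in the comments follow the example above.

```c
#include <stdint.h>

/* Model of ADDS/ADC on 32-bit halves: low words first, carry feeds the high words. */
void add64_by_halves(uint32_t r0, uint32_t r1, uint32_t r2, uint32_t r3,
                     uint32_t *r4, uint32_t *r5)
{
    uint32_t lo = r0 + r2;                   /* ADDS R4, R0, R2 */
    uint32_t carry = (lo < r0) ? 1u : 0u;    /* C flag: unsigned overflow of the low add */
    *r4 = lo;
    *r5 = r1 + r3 + carry;                   /* ADC  R5, R1, R3 */
}
```

Adding 0x00000001FFFFFFFF and 0x0000000200000001 this way yields a low word of 0 and a high word of 4.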

ADD Add

Operation (cc): Rd ← Rn + (op1)
(cc)(S): CPSR ← ALU(Flags)
Syntax ADD (cc) (S) Rd, Rn, (op1)
Description Adds the value of (op1) to the value of register Rn, and stores the result in the destination
register Rd. The condition code flags are optionally updated based on the result.
Usage The ADD instruction is used to add two values together to produce a third.
To increment a register value in Rx use:
ADD Rx, Rx, #1
Constant multiplication of Rx by 2^n + 1 into Rd can be performed with:
ADD Rd, Rx, Rx, LSL #n
To form a PC-relative address use:
ADD Rs, PC, #offset
where the (offset) must be the difference between the required address and the address held in
the PC, where the PC is the address of the ADD instruction itself plus 8 bytes.
Condition Codes
The N and Z flags are set according to the result of the addition, and the C and V flags are set
according to whether the addition generated a carry (unsigned overflow) and a signed overflow,
respectively.
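The shift-and-add form of constant multiplication can be checked in C. The sketch below mirrors ADD Rd, Rx, Rx, LSL #n; the function name is hypothetical.

```c
#include <stdint.h>

/* Model of ADD Rd, Rx, Rx, LSL #n: computes x * (2^n + 1) modulo 2^32. */
uint32_t mul_pow2_plus1(uint32_t x, unsigned int n)
{
    return x + (x << n);    /* valid for n in 0..31 */
}
```

With n = 3 this multiplies by 9, so an input of 7 gives 63.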

AND Bitwise AND

Operation (cc): Rd ← Rn AND (op1)
(cc)(S): CPSR ← ALU(Flags)
Syntax AND (cc) (S) Rd, Rn, (op1)
Description The AND instruction performs a bitwise AND of the value of register Rn with the value of
(op1), and stores the result in the destination register Rd. The condition code flags are optionally
updated based on the result.
Usage AND is most useful for extracting a field from a register, by ANDing the register with a mask
value that has 1s in the field to be extracted, and 0s elsewhere.
Condition Codes
The N and Z flags are set according to the result of the operation, and the C flag is set to the
carry output generated by (op1) (see 5.1 on page 59). The V flag is unaffected.
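Field extraction with a mask looks like this in C. The helper name and its parameters are hypothetical; the shift after the AND simply moves the extracted field down to bit 0.

```c
#include <stdint.h>

/* Extract the field of `width` bits starting at bit `lsb` (width 1..31). */
uint32_t extract_field(uint32_t value, unsigned int lsb, unsigned int width)
{
    uint32_t mask = ((1u << width) - 1u) << lsb;  /* 1s in the field, 0s elsewhere */
    return (value & mask) >> lsb;
}
```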

B, BL Branch, Branch and Link

Operation (cc)(L): LR ← address of the instruction following the branch
(cc): PC ← PC + (offset)
Syntax B (L) (cc) (offset)

Description The B (Branch) and BL (Branch and Link) instructions cause a branch to a target address, and
provide both conditional and unconditional changes to program flow. The BL (Branch and
Link) instruction stores a return address in the link register (LR or R14).
The (offset) specifies the target address of the branch. The address of the next instruction is
calculated by adding the offset to the program counter (PC) which contains the address of the
branch instruction plus 8.
The branch instructions can specify a branch of approximately ±32MB.
Usage The BL instruction is used to perform a subroutine call. The return from subroutine is achieved
by copying the LR to the PC. Typically, this is done by one of the following methods:
• Executing a MOV PC, R14 instruction.
• Storing a group of registers and R14 to the stack on subroutine entry, using an instruction of
the form:
STMFD R13!,{(registers),R14}
and then restoring the register values and returning with an instruction of the form:
LDMFD R13!,{(registers),PC}
Condition Codes
The condition codes are not affected by this instruction.
Notes Branching backwards past location zero and forwards over the end of the 32-bit address
space is UNPREDICTABLE.

CMP Compare

Operation (cc): ALU(0) ← Rn - (op1)
(cc): CPSR ← ALU(Flags)
Syntax CMP(cc) Rn, (op1)
Description The CMP (Compare) instruction compares a register value with another arithmetic value. The
condition flags are updated, based on the result of subtracting (op1) from Rn, so that subsequent
instructions can be conditionally executed.
Condition Codes
The N and Z flags are set according to the result of the subtraction, and the C and V flags are set
according to whether the subtraction generated a borrow (unsigned underflow) and a signed
overflow, respectively.
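The flag settings for a comparison can be modeled in C. The sketch below uses the convention described in the notes for SUB, where the C flag holds NOT(borrow); the structure and function names are hypothetical.

```c
#include <stdint.h>

typedef struct { int n, z, c, v; } flags_t;

/* Flags produced by CMP Rn, op1 (the result Rn - op1 itself is discarded). */
flags_t cmp_flags(uint32_t rn, uint32_t op1)
{
    uint32_t result = rn - op1;
    flags_t f;
    f.n = (int)(result >> 31);                        /* sign bit of the result */
    f.z = (result == 0);
    f.c = (rn >= op1);                                /* C = NOT(borrow) */
    f.v = (int)(((rn ^ op1) & (rn ^ result)) >> 31);  /* signed overflow */
    return f;
}
```

Comparing equal values sets Z and C; comparing 3 with 5 sets N and clears C, which is what the LO (unsigned lower) condition tests.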

EOR Exclusive OR

Operation (cc): Rd ← Rn EOR (op1)
(cc)(S): CPSR ← ALU(Flags)
Syntax EOR(cc)(S) Rd, Rn, (op1)
Description The EOR (Exclusive OR) instruction performs a bitwise Exclusive-OR of the value of register
Rn with the value of (op1), and stores the result in the destination register Rd. The condition
code flags are optionally updated, based on the result.
Usage EOR can be used to invert selected bits in a register. For each bit, EOR with 1 inverts that bit,
and EOR with 0 leaves it unchanged.
Condition Codes
The N and Z flags are set according to the result of the operation, and the C flag is set to the
carry output bit generated by the shifter. The V flag is unaffected.
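Bit inversion with exclusive OR can be shown with a one-line C helper; the name is hypothetical.

```c
#include <stdint.h>

/* Invert the bits selected by mask; bits where mask is 0 are unchanged. */
uint32_t toggle_bits(uint32_t value, uint32_t mask)
{
    return value ^ mask;
}
```

Applying the same mask twice restores the original value.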

LDM Load Multiple

Operation if (cc)
    IA: addr ← Rn
    IB: addr ← Rn + 4
    DA: addr ← Rn - (#(registers) * 4) + 4
    DB: addr ← Rn - (#(registers) * 4)
    for each register Ri in (registers)
        IB: addr ← addr + 4
        DB: addr ← addr - 4
        Ri ← M(addr)
        IA: addr ← addr + 4
        DA: addr ← addr - 4
    (!): Rn ← addr
Syntax LDM(cc)(mode) Rn(!), {(registers)}
Description The LDM (Load Multiple) instruction is useful for block loads, stack operations and procedure
exit sequences. It loads a subset, or possibly all, of the general purpose registers from sequential
memory locations. The general purpose registers loaded can include the PC. If they do, the word
loaded for the PC is treated as an address and a branch occurs to that address. The register Rn
points to the memory location to load the values from. Each of the registers listed in (registers) is
loaded in turn, reading each value from the next memory address as directed by (mode), one of:
IB Increment Before
DB Decrement Before
IA Increment After
DA Decrement After
The base register writeback option ((!)) causes the base register to be modified to hold the
address of the final value loaded. The registers are loaded in sequence, the lowest-numbered
register from the lowest memory address, through to the highest-numbered register from the
highest memory address.
If the PC (R15) is specified in the register list, the instruction causes a branch to the address
loaded into the PC.
Exceptions Data Abort
Condition Codes
The condition codes are not affected by this instruction.
Notes If the base register Rn is specified in (registers), and base register writeback is specified ((!)),
the final value of Rn is UNPREDICTABLE.
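The four addressing modes can be compared by computing the lowest address each one accesses. The sketch below uses the start addresses from the Operation listing; the writeback values follow the ARM architecture definition, in which Rn moves up or down by four bytes per register transferred. The type and function names are hypothetical.

```c
#include <stdint.h>

typedef enum { MODE_IA, MODE_IB, MODE_DA, MODE_DB } ldm_mode;

/* Lowest address accessed by a transfer of `count` registers. */
uint32_t ldm_start_address(ldm_mode mode, uint32_t rn, unsigned int count)
{
    switch (mode) {
    case MODE_IA: return rn;
    case MODE_IB: return rn + 4u;
    case MODE_DA: return rn - 4u * count + 4u;
    default:      return rn - 4u * count;      /* MODE_DB */
    }
}

/* Value written back to Rn when (!) is specified. */
uint32_t ldm_writeback(ldm_mode mode, uint32_t rn, unsigned int count)
{
    if (mode == MODE_IA || mode == MODE_IB)
        return rn + 4u * count;
    return rn - 4u * count;
}
```

For three registers and Rn = 0x1000, IA starts at 0x1000 and DB starts at 0xFF4, which is why STMDB (STMFD) followed by LDMIA (LDMFD) behaves as a push and pop pair.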

LDR Load Register

Operation (cc): Rd ← M((op2))
Syntax LDR(cc) Rd, (op2)
Description The LDR (Load Register) instruction loads a word from the memory address calculated by
(op2) and writes it to register Rd. If the PC is specified as register Rd, the instruction loads a
data word which it treats as an address, then branches to that address.
Exceptions Data Abort

Usage Using the PC as the base register allows PC-relative addressing, which facilitates position
independent code. Combined with a suitable addressing mode, LDR allows 32-bit memory data
to be loaded into a general-purpose register where its value can be manipulated. If the
destination register is the PC, this instruction loads a 32-bit address from memory and branches
to that address. To synthesize a Branch with Link, precede the LDR instruction with MOV LR,
PC.
Condition Codes
The condition codes are not affected by this instruction.
Notes If (op2) specifies an address that is not word-aligned, the instruction attempts to load a byte. The
result is UNPREDICTABLE and the LDRB instruction should be used. If (op2) specifies base
register write back (!), and the same register is specified for Rd and Rn, the results are
UNPREDICTABLE.
If the PC (R15) is specified for Rd, the value must be word aligned otherwise the result is
UNPREDICTABLE.

LDRB Load Register Byte

Operation (cc): Rd(7:0) ← M((op2))
(cc): Rd(31:8) ← 0
Syntax LDR(cc)B Rd, (op2)
Description The LDRB (Load Register Byte) instruction loads a byte from the memory address calculated
by (op2), zero-extends the byte to a 32-bit word, and writes the word to register Rd.
Exceptions Data Abort
Usage LDRB allows 8-bit memory data to be loaded into a general-purpose register where it can be
manipulated. Using the PC as the base register allows PC-relative addressing, to facilitate
position independent code.
Condition Codes
The condition codes are not affected by this instruction.
Notes If the PC (R15) is specified for Rd, the result is UNPREDICTABLE.
If (op2) specifies base register write back (!), and the same register is specified for Rd and Rn,
the results are UNPREDICTABLE.

MOV Move

Operation (cc): Rd ← (op1)
(cc)(S): CPSR ← ALU(Flags)
Syntax MOV(cc)(S) Rd, (op1)
Description The MOV (Move) instruction moves the value of (op1) to the destination register Rd. The
condition code flags are optionally updated, based on the result.
Usage MOV is used to:
• Move a value from one register to another.
• Put a constant value into a register.
• Perform a shift without any other arithmetic or logical operation. A left shift by n can be used
to multiply by 2^n.
• When the PC is the destination of the instruction, a branch occurs. The instruction:
MOV PC, LR
can therefore be used to return from a subroutine (see instructions B, and BL).
Condition Codes
The N and Z flags are set according to the value moved (post-shift if a shift is specified), and the
C flag is set to the carry output bit generated by the shifter.
The V flag is unaffected.

MVN Move Negative

Operation (cc): Rd ← NOT (op1)
(cc)(S): CPSR ← ALU(Flags)
Syntax MVN (cc) (S) Rd, (op1)
Description The MVN (Move Negative) instruction moves the logical one’s complement of the value of
(op1) to the destination register Rd. The condition code flags are optionally updated based on
the result.
Usage MVN is used to:
• Write a negative value into a register.
• Form a bit mask.
• Take the one’s complement of a value.
Condition Codes
The N and Z flags are set according to the result of the operation, and the C flag is set to the
carry output bit generated by the shifter (see 5.1 on page 59). The V flag is unaffected.

ORR Bitwise OR

Operation (cc): Rd ← Rn OR (op1)
(cc)(S): CPSR ← ALU(Flags)
Syntax ORR(cc)(S) Rd, Rn, (op1)
Description The ORR (Logical OR) instruction performs a bitwise (inclusive) OR of the value of register Rn
with the value of (op1), and stores the result in the destination register Rd. The condition code
flags are optionally updated, based on the result.
Usage ORR can be used to set selected bits in a register. For each bit, OR with 1 sets the bit, and OR
with 0 leaves it unchanged.
Condition Codes
The N and Z flags are set according to the result of the operation, and the C flag is set to the
carry output bit generated by the shifter (see 5.1 on page 59). The V flag is unaffected.
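Setting selected bits with OR can be shown the same way in C; the helper name is hypothetical.

```c
#include <stdint.h>

/* Set the bits selected by mask; bits where mask is 0 are unchanged. */
uint32_t set_bits(uint32_t value, uint32_t mask)
{
    return value | mask;
}
```

This is the operation behind statements such as `KEY_CTRL_DIR |= COLMASK` in the GPIO examples, which set direction bits without disturbing the others.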

SBC Subtract with Carry

Operation (cc): Rd ← Rn - (op1) - NOT(CPSR(C))
(cc)(S): CPSR ← ALU(Flags)
Syntax SBC(cc)(S) Rd, Rn, (op1)
Description The SBC (Subtract with Carry) instruction is used to synthesize multi-word subtraction. SBC
subtracts the value of (op1) and the value of NOT(Carry flag) from the value of register Rn, and
stores the result in the destination register Rd. The condition code flags are optionally updated,
based on the result.
Usage If register pairs R0,R1 and R2,R3 hold 64-bit values (R0 and R2 hold the least significant
words), the following instructions leave the 64-bit difference in R4,R5:
SUBS R4,R0,R2
SBC R5,R1,R3
Condition Codes
The N and Z flags are set according to the result of the subtraction, and the C and V flags are set
according to whether the subtraction generated a borrow (unsigned underflow) and a signed
overflow, respectively.
Notes If (S) is specified, the C flag is set to:
1 if no borrow occurs
0 if a borrow does occur
In other words, the C flag is used as a NOT(borrow) flag. This inversion of the borrow condition
is usually compensated for by subsequent instructions. For example:
• The SBC and RSC instructions use the C flag as a NOT(borrow) operand, performing a normal
subtraction if C == 1 and subtracting one more than usual if C == 0.
• The HS (unsigned higher or same) and LO (unsigned lower) conditions are equivalent to CS
(carry set) and CC (carry clear) respectively.
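The SUBS/SBC pair can be modeled in C to show the NOT(borrow) convention at work. This is a sketch of the arithmetic only; the function name is hypothetical, and the register names in the comments follow the usage example above.

```c
#include <stdint.h>

/* Model of SUBS/SBC on 32-bit halves. The C flag holds NOT(borrow). */
void sub64_by_halves(uint32_t r0, uint32_t r1, uint32_t r2, uint32_t r3,
                     uint32_t *r4, uint32_t *r5)
{
    uint32_t c = (r0 >= r2) ? 1u : 0u;   /* SUBS R4, R0, R2 sets C = NOT(borrow) */
    *r4 = r0 - r2;
    *r5 = r1 - r3 - (1u - c);            /* SBC  R5, R1, R3 subtracts NOT(C) */
}
```

Subtracting 0x0000000200000001 from 0x0000000500000000 this way borrows from the high word and leaves 0x00000002FFFFFFFF.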

STM Store Multiple

Operation if (cc)
    IA: addr ← Rn
    IB: addr ← Rn + 4
    DA: addr ← Rn - (#(registers) * 4) + 4
    DB: addr ← Rn - (#(registers) * 4)
    for each register Ri in (registers)
        IB: addr ← addr + 4
        DB: addr ← addr - 4
        M(addr) ← Ri
        IA: addr ← addr + 4
        DA: addr ← addr - 4
    (!): Rn ← addr
Syntax STM (cc)(mode) Rn(!), {(registers)}
Description The STM (Store Multiple) instruction stores a subset (or possibly all) of the general-purpose
registers to sequential memory locations. The register Rn specifies the base register used to
store the registers. Each register given in (registers) is stored in turn, storing each register in the
next memory address as directed by (mode), which can be one of:
IB Increment Before
DB Decrement Before
IA Increment After
DA Decrement After
If the base register writeback option ((!)) is specified, the base register (Rn) is modified with the
new base address. (registers) is a list of registers, separated by commas and specifies the set of
registers to be stored. The registers are stored in sequence, the lowest-numbered register to the
lowest memory address, through to the highest-numbered register to the highest memory
address. If R15 (PC) is specified in (registers), the value stored is UNKNOWN.
Exceptions Data Abort
Usage STM is useful as a block store instruction (combined with LDM it allows efficient block copy)
and for stack operations. A single STM used in the sequence of a procedure can push the return
address and general-purpose register values on to the stack, updating the stack pointer in the
process.
Condition Codes
The condition codes are not affected by this instruction.
Notes If R15 (PC) is given as the base register (Rn), the result is UNPREDICTABLE.
If Rn is specified in (registers) and base register writeback ((!)) is specified:
• If Rn is the lowest-numbered register specified in (registers), the original value of Rn is stored.
• Otherwise, the stored value of Rn is UNPREDICTABLE.
The value of Rn should be word aligned.

STR Store Register

Operation (cc): M((op2)) ← Rd
Syntax STR(cc) Rd, (op2)
Description The STR (Store Register) instruction stores a word from register Rd to the memory address
calculated by (op2).
Exceptions Data Abort
Usage Combined with a suitable addressing mode, STR stores 32-bit data from a general-purpose
register into memory. Using the PC as the base register allows PC-relative addressing, which
facilitates position-independent code.
Condition Codes
The condition codes are not affected by this instruction.
Notes Using the PC as the source register (Rd) will cause an UNKNOWN value to be written. If (op2)
specifies base register write back (!), and the same register is specified for Rd and Rn, the results
are UNPREDICTABLE. The address calculated by (op2) must be word-aligned. The result of a
store to a nonword-aligned address is UNPREDICTABLE.

STRB Store Register Byte

Operation (cc): M((op2)) ← Rd(7:0)
Syntax STR(cc)B Rd, (op2)
Description The STRB (Store Register Byte) instruction stores a byte from the least significant byte of
register Rd to the memory address calculated by (op2).
Exceptions Data Abort
Usage Combined with a suitable addressing mode, STRB writes the least significant byte of a
general-purpose register to memory. Using the PC as the base register allows PC-relative
addressing, which facilitates position-independent code.
Condition Codes
The condition codes are not affected by this instruction.
Notes Specifying the PC as the source register (Rd) is UNPREDICTABLE.
If (op2) specifies base register write back (!), and the same register is specified for Rd and Rn,
the results are UNPREDICTABLE.

SUB Subtract

Operation (cc): Rd ← Rn - (op1)
(cc)(S): CPSR ← ALU(Flags)
Syntax SUB(cc)(S) Rd, Rn, (op1)
Description Subtracts the value of (op1) from the value of register Rn, and stores the result in the destination
register Rd. The condition code flags are optionally updated, based on the result.
Usage SUB is used to subtract one value from another to produce a third. To decrement a register value
(in Rx) use:
SUBS Rx, Rx, #1
SUBS is useful as a loop counter decrement, as the loop branch can test the flags for the
appropriate termination condition, without the need for a separate compare instruction such as:
CMP Rx, #0
The SUBS instruction both decrements the loop counter in Rx and checks whether it has reached zero.
Condition Codes
The N and Z flags are set according to the result of the subtraction, and the C and V flags are set
according to whether the subtraction generated a borrow (unsigned underflow) and a signed
overflow, respectively.
Notes If (S) is specified, the C flag is set to:
1 if no borrow occurs
0 if a borrow does occur
In other words, the C flag is used as a NOT(borrow) flag. This inversion of the borrow condition
is usually compensated for by subsequent instructions. For example:
• The SBC and RSC instructions use the C flag as a NOT(borrow) operand, performing a normal
subtraction if C == 1 and subtracting one more than usual if C == 0.
• The HS (unsigned higher or same) and LO (unsigned lower) conditions are equivalent to CS
(carry set) and CC (carry clear) respectively.

SWI Software Interrupt

Operation (cc): R14_svc ← address of the instruction following the SWI
(cc): SPSR_svc ← CPSR
(cc): CPSR(mode) ← Supervisor
(cc): CPSR(I) ← 1 (Disable Interrupts)
(cc): PC ← 0x00000008
Syntax SWI (cc) (value)
Description Causes a SWI exception.
Exceptions Software interrupt
Usage The SWI instruction is used as an operating system service call. The method used to select
which operating system service is required is specified by the operating system, and the SWI
exception handler for the operating system determines and provides the requested service.
Two typical methods are:
• (value) specifies which service is required, and any parameters needed by the selected service
are passed in general-purpose registers.
• (value) is ignored, general-purpose register R0 is used to select which service is wanted, and
any parameters needed by the selected service are passed in other general-purpose registers.
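
With the first method the handler has to recover (value) from the SWI instruction itself. In ARM state, (value) occupies the low 24 bits of the instruction word and bits 27 to 24 are all ones. The following C sketch assumes the handler has already fetched that instruction word (in ARM state it is commonly read from the address held in the link register minus 4). The function names are hypothetical and are not part of any operating system interface.

```c
#include <stdint.h>

/* Returns 1 when the word is an ARM-state SWI: bits 27..24 are 1111. */
int is_arm_swi(uint32_t instr)
{
    return ((instr >> 24) & 0xFu) == 0xFu;
}

/* Returns the 24-bit (value) field, bits 23..0 of the instruction. */
uint32_t swi_value(uint32_t instr)
{
    return instr & 0x00FFFFFFu;
}

/* Hypothetical service selection covering both methods from the text:
   use_value != 0 -> the service number comes from (value);
   use_value == 0 -> (value) is ignored and R0 selects the service. */
uint32_t swi_service(uint32_t instr, uint32_t r0, int use_value)
{
    return use_value ? swi_value(instr) : r0;
}
```

For the instruction word 0xEF000011 (SWI 0x11, executed always), swi_value returns 0x11.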
Condition Codes
The flags may be affected by the operation of the software interrupt, and it is not possible to say
how they will be affected. The state of the condition code flags after a software interrupt is
UNKNOWN.

SWP Swap

Operation (cc): ALU(0) ← M(Rn)
(cc): M(Rn) ← Rm
(cc): Rd ← ALU(0)
Syntax SWP(cc) Rd, Rm, [Rn]
Description Swaps a word between registers and memory. SWP loads a word from the memory address
given by the value of register Rn. The value of register Rm is then stored to the memory address
given by the value of Rn, and the original loaded value is written to register Rd. If the same
register is specified for Rd and Rm, this instruction swaps the value of the register and the value
at the memory address.
Exceptions Data Abort
Usage The SWP instruction can be used to implement semaphores. For sample code, see Semaphore
instructions.
Condition Codes
The condition codes are not affected by this instruction.
Notes If the address contained in Rn is not word-aligned, the effect is UNPREDICTABLE. If the PC is
specified as the destination (Rd), address (Rn) or the value (Rm), the result is
UNPREDICTABLE. If the same register is specified as (Rn) and (Rm), or (Rn) and (Rd), the
result is UNPREDICTABLE.
If a data abort is signaled on either the load access or the store access, the loaded value is not
written to (Rd). If a data abort is signaled on the load access, the store access does not occur.
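
The way a swap implements a semaphore can be sketched in C with the C11 atomic_exchange operation, which has the same load-then-store-as-one-step shape as SWP. This is an illustration only: the names swp_lock, swp_try_lock and swp_unlock are hypothetical, and no assumption is made that a particular compiler translates atomic_exchange into a SWP instruction.

```c
#include <stdatomic.h>

#define LOCK_FREE  0u
#define LOCK_TAKEN 1u

typedef atomic_uint swp_lock;   /* hypothetical type name */

/* One atomic exchange, the C-level counterpart of SWP Rd, Rm, [Rn]:
   store LOCK_TAKEN (the Rm role) and receive the previous contents
   of the memory word (the Rd role). */
int swp_try_lock(swp_lock *lock)
{
    unsigned previous = atomic_exchange(lock, LOCK_TAKEN);
    return previous == LOCK_FREE;   /* 1 = the semaphore was free and is now ours */
}

/* Release the semaphore by writing the free value back. */
void swp_unlock(swp_lock *lock)
{
    atomic_store(lock, LOCK_FREE);
}
```

A caller that receives 0 from swp_try_lock has found the semaphore already taken. Memory still holds LOCK_TAKEN in that case, so the attempt did no harm and can simply be repeated later.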

SWPB Swap Byte

Operation (cc): ALU(0) ← M(Rn)
(cc): M(Rn) ← Rm(7:0)
(cc): Rd(7:0) ← ALU(0)
Syntax SWP(cc)B Rd, Rm, [Rn]
Description Swaps a byte between registers and memory. SWPB loads a byte from the memory address
given by the value of register Rn. The value of the least significant byte of register Rm is stored
to the memory address given by Rn, the original loaded value is zero-extended to a 32-bit word,
and the word is written to register Rd. If the same register is specified for Rd and Rm, this
instruction swaps the value of the least significant byte of the register and the byte value at the
memory address.
Exceptions Data Abort
Usage The SWPB instruction can be used to implement semaphores, in a similar manner to that shown
for SWP instructions in Semaphore instructions.
Condition Codes
The condition codes are not affected by this instruction.
Notes If the PC is specified for Rd, Rn, or Rm, the result is UNPREDICTABLE.
If the same register is specified as Rn and Rm, or Rn and Rd, the result is UNPREDICTABLE.
If a data abort is signaled on either the load access or the store access, the loaded value is not
written to (Rd). If a data abort is signaled on the load access, the store access does not occur.
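
The data movement of SWPB, including the zero-extension of the loaded byte and the truncation of Rm to its least significant byte, can be modelled in a few lines of C. The sketch is single-threaded and therefore does not reproduce the atomic nature of the real instruction. The function name swpb_model is hypothetical.

```c
#include <stdint.h>

/* Single-threaded model of SWPB Rd, Rm, [Rn].
   mem points at the byte addressed by Rn; the return value is what
   the instruction would write to Rd. */
uint32_t swpb_model(uint8_t *mem, uint32_t rm)
{
    uint32_t loaded = *mem;           /* byte zero-extended to 32 bits */
    *mem = (uint8_t)(rm & 0xFFu);     /* only Rm(7:0) is stored */
    return loaded;
}
```

With memory holding 0xAB and Rm holding 0x12345678, the model leaves 0x78 in memory and returns 0x000000AB.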
APPENDIX B

ARM Instruction Summary

Table B.1 ARM instruction condition field codes

cc: Condition Codes

Generic                       Unsigned                  Signed
CS  Carry Set                 HI  Higher Than           GT  Greater Than
CC  Carry Clear               HS  Higher or Same        GE  Greater Than or Equal
EQ  Equal (Zero Set)          LO  Lower Than            LT  Less Than
NE  Not Equal (Zero Clear)    LS  Lower or Same         LE  Less Than or Equal
VS  Overflow Set                                        MI  Minus (Negative)
VC  Overflow Clear                                      PL  Plus (Positive)

Table B.2 ARM instruction operand 1 (op1) field for data access

op1: Data Access

Immediate                          #(value)            (op1) = IR(value)
Register                           Rm                  (op1) = Rm
Logical Shift Left Immediate       Rm, LSL #(value)    (op1) = Rm << IR(value)
Logical Shift Left Register        Rm, LSL Rs          (op1) = Rm << Rs(7:0)
Logical Shift Right Immediate      Rm, LSR #(value)    (op1) = Rm >> IR(value)
Logical Shift Right Register       Rm, LSR Rs          (op1) = Rm >> Rs(7:0)
Arithmetic Shift Right Immediate   Rm, ASR #(value)    (op1) = Rm +>> IR(value)
Arithmetic Shift Right Register    Rm, ASR Rs          (op1) = Rm +>> Rs(7:0)
Rotate Right Immediate             Rm, ROR #(value)    (op1) = Rm >>> IR(value)
Rotate Right Register              Rm, ROR Rs          (op1) = Rm >>> Rs(4:0)
Rotate Right with Extend           Rm, RRX             (op1) = CPSR(C) >>> Rm >>> CPSR(C)
Where << is a logical shift left, >> a logical shift right, +>> an arithmetic (sign-preserving) shift right, and >>> a rotate right.
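
The effect of each shift form of (op1) can be reproduced with ordinary C operators. The sketch below is a host-side model, not ARM code. The enumeration shift_kind and the function op1_shift are hypothetical names, and the shift amount is assumed to lie in the range 1 to 31 so that no C shift is performed by 32 or more bits.

```c
#include <stdint.h>

typedef enum { SH_LSL, SH_LSR, SH_ASR, SH_ROR, SH_RRX } shift_kind; /* hypothetical */

/* Computes (op1) for "Rm, <shift> #amount" with amount in 1..31.
   RRX ignores amount and shifts the carry flag into bit 31 instead. */
uint32_t op1_shift(uint32_t rm, shift_kind kind, unsigned amount, unsigned carry)
{
    switch (kind) {
    case SH_LSL:
        return rm << amount;                          /* zeros enter at bit 0 */
    case SH_LSR:
        return rm >> amount;                          /* zeros enter at bit 31 */
    case SH_ASR: {
        /* copies of the sign bit enter at bit 31 */
        uint32_t sign_fill = (rm & 0x80000000u) ? ~(0xFFFFFFFFu >> amount) : 0u;
        return (rm >> amount) | sign_fill;
    }
    case SH_ROR:
        return (rm >> amount) | (rm << (32u - amount)); /* low bits wrap to the top */
    case SH_RRX:
        return ((uint32_t)(carry & 1u) << 31) | (rm >> 1);
    }
    return rm;
}
```

For instance, an arithmetic shift right of 0x80000000 by 4 gives 0xF8000000, while a logical shift right of the same value by 4 gives 0x08000000.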
Table B.3 ARM instruction operand 2 (op2) field for memory access

op2: Memory Access

Immediate Offset               [Rn, #±(value)]                  (op2) = Rn + IR(value)
Register Offset                [Rn, Rm]                         (op2) = Rn + Rm
Scaled Register Offset         [Rn, Rm, (shift) #(value)]       (op2) = Rn + (Rm shift IR(value))
Immediate Pre-indexed          [Rn, #±(value)]!                 (op2) = Rn + IR(value)
                                                                Rn = (op2)
Register Pre-indexed           [Rn, Rm]!                        (op2) = Rn + Rm
                                                                Rn = (op2)
Scaled Register Pre-indexed    [Rn, Rm, (shift) #(value)]!      (op2) = Rn + (Rm shift IR(value))
                                                                Rn = (op2)
Immediate Post-indexed         [Rn], #±(value)                  (op2) = Rn
                                                                Rn = Rn + IR(value)
Register Post-indexed          [Rn], Rm                         (op2) = Rn
                                                                Rn = Rn + Rm
Scaled Register Post-indexed   [Rn], Rm, (shift) #(value)       (op2) = Rn
                                                                Rn = Rn + (Rm shift IR(value))
Where (shift) is one of LSL, LSR, ASR, ROR or RRX, and has the same effect as for (op1).
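
The difference between the offset, pre-indexed and post-indexed forms lies in which address is used for the access and whether Rn is written back. A small C model makes this visible. The names addr_mode and op2_address are hypothetical, and the offset argument stands for the already-computed signed offset (immediate, Rm, or shifted Rm).

```c
#include <stdint.h>

typedef enum { AM_OFFSET, AM_PRE_INDEXED, AM_POST_INDEXED } addr_mode; /* hypothetical */

/* Returns the address used for the memory access, (op2), and applies
   any write-back to *rn. */
uint32_t op2_address(uint32_t *rn, int32_t offset, addr_mode mode)
{
    uint32_t base = *rn;
    uint32_t indexed = base + (uint32_t)offset;   /* addition modulo 2^32 */

    switch (mode) {
    case AM_OFFSET:                /* [Rn, offset]  : Rn unchanged */
        return indexed;
    case AM_PRE_INDEXED:           /* [Rn, offset]! : Rn = (op2) */
        *rn = indexed;
        return indexed;
    case AM_POST_INDEXED:          /* [Rn], offset  : access at old Rn, then update */
        *rn = indexed;
        return base;
    }
    return base;
}
```

Starting from Rn = 0x1000 with an offset of 4, the offset form accesses 0x1004 and leaves Rn alone, the pre-indexed form accesses 0x1004 and sets Rn to 0x1004, and the post-indexed form accesses 0x1000 and then sets Rn to 0x1004.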

Table B.4 ARM instruction summary

ARM Instructions
Add with Carry        ADC(cc)(S) Rd, Rn, (op1)    (cc) : Rd = Rn + (op1) + CPSR(C)
Add                   ADD(cc)(S) Rd, Rn, (op1)    (cc) : Rd = Rn + (op1)
Bitwise AND           AND(cc)(S) Rd, Rn, (op1)    (cc) : Rd = Rn AND (op1)
Branch                B(cc) (offset)              (cc) : PC = PC + (offset)
Branch and Link       BL(cc) (offset)             (cc) : LR = address of the next instruction
                                                  (cc) : PC = PC + (offset)
Compare               CMP(cc) Rn, (op1)           (cc) : CPSR = ALU(Flags) of Rn - (op1)
Exclusive OR          EOR(cc)(S) Rd, Rn, (op1)    (cc) : Rd = Rn EOR (op1)
Load Register         LDR(cc) Rd, (op2)           (cc) : Rd = M((op2))
Load Register Byte    LDR(cc)B Rd, (op2)          (cc) : Rd(7:0) = M((op2))
                                                  (cc) : Rd(31:8) = 0
Move                  MOV(cc)(S) Rd, (op1)        (cc) : Rd = (op1)
Move Negative         MVN(cc)(S) Rd, (op1)        (cc) : Rd = NOT (op1)
Bitwise OR            ORR(cc)(S) Rd, Rn, (op1)    (cc) : Rd = Rn OR (op1)
Subtract with Carry   SBC(cc)(S) Rd, Rn, (op1)    (cc) : Rd = Rn - (op1) - NOT(CPSR(C))
Store Register        STR(cc) Rd, (op2)           (cc) : M((op2)) = Rd
Store Register Byte   STR(cc)B Rd, (op2)          (cc) : M((op2)) = Rd(7:0)
Subtract              SUB(cc)(S) Rd, Rn, (op1)    (cc) : Rd = Rn - (op1)
Software Interrupt    SWI(cc) (value)
Swap                  SWP(cc) Rd, Rm, [Rn]        (cc) : Rd = M(Rn)
                                                  (cc) : M(Rn) = Rm
Swap Byte             SWP(cc)B Rd, Rm, [Rn]       (cc) : Rd(7:0) = M(Rn)(7:0)
                                                  (cc) : M(Rn)(7:0) = Rm(7:0)
APPENDIX C
ARM7TDMI Instruction Encoding
The ARM7TDMI uses a fixed-length, 32-bit instruction encoding scheme for all ARM
instructions. The basic encoding for all ARM7TDMI instructions is shown below. Individual
instruction descriptions and encodings are shown in section 4 of this document.

Table C.1 ARM Instruction Decoding table

Table C.2 ARM instruction decoding table for the condition field

Condition                                      Binary   Hex
EQ (Equal)                                     0000     0
NE (Not equal)                                 0001     1
CS/HS (Carry set/Unsigned higher or same)      0010     2
CC/LO (Carry clear/Unsigned lower)             0011     3
MI (Minus/Negative)                            0100     4
PL (Plus/Positive or zero)                     0101     5
VS (Overflow)                                  0110     6
VC (No overflow)                               0111     7
HI (Unsigned higher)                           1000     8
LS (Unsigned lower or same)                    1001     9
GE (Signed greater than or equal)              1010     A
LT (Signed less than)                          1011     B
GT (Signed greater than)                       1100     C
LE (Signed less than or equal)                 1101     D
AL (Always, unconditional)                     1110     E
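
To show how the 4-bit condition field is used, the following C sketch decides whether an instruction would execute for a given set of flags. The flag tests attached to each code are the standard ARM definitions and are not listed in the table above. The function name condition_passed is hypothetical, and each flag argument is assumed to be 0 or 1.

```c
#include <stdint.h>

/* Returns 1 if an instruction carrying the given condition field would
   execute with the given N, Z, C and V flags (each 0 or 1), else 0. */
int condition_passed(unsigned cond, unsigned n, unsigned z, unsigned c, unsigned v)
{
    switch (cond & 0xFu) {
    case 0x0: return z == 1;              /* EQ */
    case 0x1: return z == 0;              /* NE */
    case 0x2: return c == 1;              /* CS/HS */
    case 0x3: return c == 0;              /* CC/LO */
    case 0x4: return n == 1;              /* MI */
    case 0x5: return n == 0;              /* PL */
    case 0x6: return v == 1;              /* VS */
    case 0x7: return v == 0;              /* VC */
    case 0x8: return c == 1 && z == 0;    /* HI */
    case 0x9: return c == 0 || z == 1;    /* LS */
    case 0xA: return n == v;              /* GE */
    case 0xB: return n != v;              /* LT */
    case 0xC: return z == 0 && n == v;    /* GT */
    case 0xD: return z == 1 || n != v;    /* LE */
    case 0xE: return 1;                   /* AL */
    default:  return 0;                   /* 1111 is not in the table; rejected in this sketch */
    }
}
```

After comparing 3 with 5 the flags are N = 1, Z = 0, C = 0, V = 0, so both LO (code 3) and LT (code B) pass while HS (code 2) and GE (code A) fail.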
Table C.3 ARM instruction decoding table for the operating mode

Mode                       Binary   Hex
User mode (_usr)           10000    0x10
FIQ mode (_fiq)            10001    0x11
IRQ mode (_irq)            10010    0x12
Supervisor mode (_svc)     10011    0x13
Abort mode (_abt)          10111    0x17
Undefined mode (_und)      11011    0x1B
System mode                11111    0x1F
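
The mode values above occupy the five least significant bits of the CPSR, so the current mode can be found by masking with 0x1F. A minimal C sketch follows. The function name cpsr_mode_name is hypothetical, and the returned strings are simply the usual register-bank suffixes.

```c
#include <stdint.h>

/* Maps the mode bits M[4:0] of a CPSR value to a short mode name. */
const char *cpsr_mode_name(uint32_t cpsr)
{
    switch (cpsr & 0x1Fu) {
    case 0x10: return "usr";      /* User */
    case 0x11: return "fiq";      /* FIQ */
    case 0x12: return "irq";      /* IRQ */
    case 0x13: return "svc";      /* Supervisor */
    case 0x17: return "abt";      /* Abort */
    case 0x1B: return "und";      /* Undefined */
    case 0x1F: return "sys";      /* System */
    default:   return "invalid";  /* any other bit pattern is not a listed mode */
    }
}
```

A CPSR value of 0x600000D3, for example, has mode bits 10011 and is therefore reported as Supervisor mode.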

Table C.4 ARM instruction decoding table for the shift operation

Shift action        Rs     Shift   Shift size
LSL #shift_size     N/A    00      0 to 31
LSL Rs              Rs     00      N/A
LSR #32             N/A    01      0
LSR #shift_size     N/A    01      1 to 31
LSR Rs              Rs     01      N/A
ASR #32             N/A    10      0
ASR #shift_size     N/A    10      1 to 31
ASR Rs              Rs     10      N/A
RRX                 N/A    11      0
ROR #shift_size     N/A    11      1 to 31
ROR Rs              Rs     11      N/A
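
The special meaning of a shift size of 0 in the immediate forms is easy to overlook, so the following C sketch decodes the two fields of the table for the immediate (non-Rs) case. The structure shift_info and the function decode_imm_shift are hypothetical names chosen for this example.

```c
#include <stdint.h>

/* Decoded form of an immediate shift (hypothetical structure). */
typedef struct {
    unsigned type;     /* 0 = LSL, 1 = LSR, 2 = ASR, 3 = ROR (or RRX) */
    unsigned amount;   /* effective shift amount in bits */
    int is_rrx;        /* 1 when the encoding means rotate right with extend */
} shift_info;

/* type is the 2-bit Shift field, size the 5-bit Shift size field. */
shift_info decode_imm_shift(unsigned type, unsigned size)
{
    shift_info s;
    s.type = type & 3u;
    s.amount = size & 31u;
    s.is_rrx = 0;

    if (s.amount == 0u) {
        if (s.type == 1u || s.type == 2u) {
            s.amount = 32u;          /* LSR #32 and ASR #32 are encoded as size 0 */
        } else if (s.type == 3u) {
            s.is_rrx = 1;            /* ROR with size 0 means RRX */
            s.amount = 1u;           /* RRX moves the value by one bit */
        }
        /* LSL with size 0 is a plain register operand: no shift at all */
    }
    return s;
}
```

Decoding Shift = 01 with a size of 0 therefore yields a logical shift right by 32, whereas Shift = 11 with a size of 0 yields RRX.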

Table C.5 THUMB instruction decoding table

Instruction format                         Bits 15 (left) to 0 (right)
Move shifted register                      0 0 0 Op Offset5 Rs Rd
Add/subtract                               0 0 0 1 1 I Op Rn/Offset3 Rs Rd
Move/compare/add                           0 0 1 Op Rd Offset8
ALU operations                             0 1 0 0 0 0 Op Rs Rd
Hi register operations/branch exchange     0 1 0 0 0 1 Op H1 H2 Rs/Hs Rd/Hd
PC-relative load                           0 1 0 0 1 Rd Word8
Load/store with register offset            0 1 0 1 L B 0 Ro Rb Rd
Load/store sign-extended byte/halfword     0 1 0 1 H S 1 Ro Rb Rd
Load/store with immediate offset           0 1 1 B L Offset5 Rb Rd
Load/store halfword                        1 0 0 0 L Offset5 Rb Rd
SP-relative load/store                     1 0 0 1 L Rd Word8
Load address                               1 0 1 0 SP Rd Word8
Add offset to stack pointer                1 0 1 1 0 0 0 0 S SWord7
Push/pop registers                         1 0 1 1 L 1 0 R Rlist
Multiple load/store                        1 1 0 0 L Rb Rlist
Conditional branch                         1 1 0 1 Cond Soffset8
Software Interrupt                         1 1 0 1 1 1 1 1 Value8
Unconditional branch                       1 1 1 0 0 Offset11
Long branch with link                      1 1 1 1 H Offset