Embedded.....Everywhere
Where are we ?
PiTechnologies
Track Agenda
Course outline
Agenda
Basic Concepts
What is ARM ?
Why ARM?
ARM Application
ARM ISA
Basic Concepts
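Design Rules for the RISC Philosophy
Instructions
A reduced set of simple instructions, each designed to execute in a single cycle
The programmer or compiler synthesizes complex operations (such as division and multiplication) from several simple instructions
Fixed-length instructions, so the pipeline can fetch the next instruction before decoding the current one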
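As a rough illustration of synthesizing a complex operation from simple ones, the sketch below builds unsigned division out of shifts, compares, and subtractions only. The function name `udiv32` and the 32-bit width are assumptions made for this example; it is not taken from any ARM library.

```python
MASK32 = 0xFFFFFFFF


def udiv32(dividend, divisor):
    """Unsigned 32-bit division using only shift, compare, and subtract steps."""
    dividend &= MASK32
    divisor &= MASK32
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    # Walk the dividend from the most significant bit to the least.
    for bit in range(31, -1, -1):
        # Shift the next dividend bit into the running remainder.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        # If the divisor fits, subtract it and set this quotient bit.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << bit
    return quotient, remainder
```

For example, `udiv32(100, 7)` returns the quotient and remainder as a pair. Each loop iteration maps onto a handful of simple single-cycle operations.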
Pipeline
Registers
General-purpose registers; each can hold either an address or data
Registers act as fast local memory for data-processing operations
Load-Store Architecture
The processor operates only on data held in registers
Load and store instructions transfer data between the register bank and external memory
Separating memory access from data processing saves memory-access cost, because a value loaded once can be used many times
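The load-store idea can be sketched with a toy machine in which only `ldr` and `str_` touch memory, while arithmetic works on registers. The class `LoadStoreMachine`, its method names, and the access counter are hypothetical and exist only for illustration.

```python
class LoadStoreMachine:
    """Toy model: arithmetic uses registers only; memory is reached via ldr/str_."""

    def __init__(self, memory):
        self.memory = list(memory)
        self.regs = [0] * 16
        self.memory_accesses = 0  # counts loads and stores

    def ldr(self, rd, address):
        # Load a word from memory into register rd.
        self.regs[rd] = self.memory[address]
        self.memory_accesses += 1

    def str_(self, rs, address):
        # Store register rs to memory.
        self.memory[address] = self.regs[rs]
        self.memory_accesses += 1

    def add(self, rd, rn, rm):
        self.regs[rd] = (self.regs[rn] + self.regs[rm]) & 0xFFFFFFFF

    def mul(self, rd, rn, rm):
        self.regs[rd] = (self.regs[rn] * self.regs[rm]) & 0xFFFFFFFF


# Compute x*x + x: x is loaded once, then reused three times from a register.
machine = LoadStoreMachine([6, 0])
machine.ldr(0, 0)        # r0 = x
machine.mul(1, 0, 0)     # r1 = x * x
machine.add(1, 1, 0)     # r1 = x * x + x
machine.str_(1, 1)       # write the result back
```

In this sketch the value `x` is used three times but costs a single load, which is the saving the slide describes.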
[Figure: CISC vs. RISC. CISC relies on more complex hardware; RISC relies on less complex hardware and places more of the burden on the compiler.]
What is ARM ?
Why ARM ?
ARM Applications
ARM processor family: Cortex-A series; ARM processor: Cortex-A8; silicon supplier: Samsung
ARM processor family: ARM11; silicon supplier: Freescale (i.MX353 applications processor)
ARM processor family: Cortex-A series; ARM processor: Cortex-A8; silicon supplier: Qualcomm (QSD 8650 @ 1 GHz)
ARM processor family: ARM9
Tegra Chip
Dual-core ARM Cortex-A9 CPU:
NVIDIA Tegra features the world's first dual-core CPU for mobile applications, together with support for symmetric multiprocessing (SMP), which enables:
Tegra-Powered Devices
Tablets
The Snapdragon application processor core is Qualcomm's own design.
It has features similar to the ARM Cortex-A8 core and implements the ARMv7 instruction set, but in theory offers much higher performance for multimedia-related SIMD operations.
All Snapdragon processors contain circuitry to decode high-definition (HD) video.
The Samsung Hummingbird is a system-on-a-chip (SoC) designed for mobile devices. It is built on a 45 nm process around an ARM Cortex-A8 core, paired with a PowerVR SGX540 GPU.
One advantage of the Hummingbird SoC is high performance at low power consumption.
The chip was first used in the Samsung Galaxy S, followed by devices including the Samsung Wave, the Samsung Galaxy Tab, and the Samsung GT-I9020T (Google Nexus S).
ARM ISA
ARM instruction set
32-bit instruction length
Thumb instruction set
16-bit instruction length
Compressed version of the ARM instruction set
Improves code density
Jazelle
Java bytecodes
Allows faster operation of Java ME (JME) mobile applications
Conclusion
ARM features
Low power consumption
High code density
Low-cost memory
Internal debugging capabilities
ARM instruction set
Variable-cycle execution
Inline barrel shifter
Thumb 16-bit instructions
Conditional execution
Enhanced instructions
Power consumption
Price
Memory characteristics:
Memory hierarchy
Memory types
Memory width
ARM Technology
It is clear that the barrel shifter and the ALU together can calculate a wide range of expressions and addresses
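A minimal sketch of how a shifted second operand feeds the ALU, modeled on an instruction of the form `ADD Rd, Rn, Rm, LSL #n`. The helper names (`lsl`, `lsr`, `asr`, `ror`, `add_shifted`) are hypothetical, and shift amounts are assumed to lie between 0 and 31.

```python
MASK32 = 0xFFFFFFFF


def lsl(value, amount):
    """Logical shift left, truncated to 32 bits."""
    return (value << amount) & MASK32


def lsr(value, amount):
    """Logical shift right; zeros enter from the top."""
    return (value & MASK32) >> amount


def asr(value, amount):
    """Arithmetic shift right; the sign bit is replicated."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32  # reinterpret as a negative number
    return (value >> amount) & MASK32


def ror(value, amount):
    """Rotate right within 32 bits."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def add_shifted(rn, rm, shift, amount):
    """Model of ADD Rd, Rn, Rm, <shift> #amount: shift Rm first, then add."""
    return (rn + shift(rm, amount)) & MASK32


# Expression: x * 5 computed as x + (x << 2).
times_five = add_shifted(7, 7, lsl, 2)
# Address: base + index * 4 for an array of 32-bit words.
element_address = add_shifted(0x1000, 3, lsl, 2)
```

The same pattern covers both arithmetic expressions and address calculations, which is why a single shift-then-ALU path is so useful.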
Registers
All registers are 32 bits wide and can hold either data or an address
There are up to 18 active registers: 16 data registers (r0 to r15), visible to the programmer, and 2 processor status registers
Depending on the context, r13 and r14 can also be used as general-purpose registers; they are banked during a processor mode change
However, it is dangerous to use r13 as a general-purpose register, because it is expected to hold a valid stack pointer
Registers (continued)
CPSR: Current Program Status Register
A 32-bit register that holds the processor's current status
Used to monitor and control internal operations
Some processors have a J bit in the flags field for Jazelle
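The CPSR layout can be sketched as a small decoder. The bit positions and mode encodings below follow the classic ARM architecture definition (they are not spelled out on the slide), and the function name `decode_cpsr` is hypothetical.

```python
# 5-bit mode field encodings from the classic ARM architecture.
MODE_NAMES = {
    0b10000: "usr",
    0b10001: "fiq",
    0b10010: "irq",
    0b10011: "svc",
    0b10111: "abt",
    0b11011: "und",
    0b11111: "sys",
}


def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into its flag, control, and mode fields."""

    def bit(position):
        return bool((cpsr >> position) & 1)

    return {
        "N": bit(31),  # negative
        "Z": bit(30),  # zero
        "C": bit(29),  # carry
        "V": bit(28),  # overflow
        "J": bit(24),  # Jazelle state (only on cores that implement it)
        "I": bit(7),   # IRQ disabled when set
        "F": bit(6),   # FIQ disabled when set
        "T": bit(5),   # Thumb state when set
        "mode": MODE_NAMES.get(cpsr & 0x1F, "unknown"),
    }


status = decode_cpsr(0x600000D3)
```

Here `0x600000D3` is just a sample value chosen for the example: Z and C set, both interrupts masked, ARM state, supervisor mode.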
Processor Modes
An ARM processor runs in one of 7 modes:
Processor mode    Abbreviation
User              usr
FIQ               fiq
IRQ               irq
Supervisor        svc
Abort             abt
Undefined         und
System            sys
Abort mode: entered after a failed attempt to access memory, for example a nonexistent memory location
FIQ (fast interrupt) and IRQ (interrupt) modes correspond to the two interrupt levels
Supervisor mode: the mode the processor is in at startup and after reset; the operating system performs its initialization in this mode and the kernel generally runs in it
System mode: a special version of user mode that has full read and write access to the CPSR
User mode: used for programs and applications
Undefined mode: entered for unknown instructions, including unsupported coprocessor instructions
Modes other than user mode are called privileged modes
FIQ, IRQ, Supervisor, Abort, and Undefined are called exception modes
The following events trigger a change of processor mode:
Software control (by the operating system)
External interrupts (IRQ or FIQ)
Processor exceptions (data abort, prefetch abort, undefined instruction)
Registers
The complete register file contains 37 registers
Of these, 20 registers are hidden from a program at different times
These banked (shaded) registers are available only when the processor is in a particular mode
Any processor mode except user mode can change mode by writing to the CPSR
When an exception such as an interrupt changes the mode, the CPSR is saved in the SPSR of the new mode
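The register counts above can be checked with a small model of banking. The banked sets follow the classic ARM register file (FIQ banks r8 to r14, the other exception modes bank r13 and r14, and each exception mode has its own SPSR). The class `RegisterFile`, the `reg_mode` naming, and tracking the mode separately from the CPSR value are simplifications made for this sketch.

```python
# Registers that each exception mode replaces with a private (banked) copy.
BANKED = {
    "fiq": ["r8", "r9", "r10", "r11", "r12", "r13", "r14", "spsr"],
    "irq": ["r13", "r14", "spsr"],
    "svc": ["r13", "r14", "spsr"],
    "abt": ["r13", "r14", "spsr"],
    "und": ["r13", "r14", "spsr"],
}


def physical_name(mode, reg):
    """Map a register name to the physical register used in the given mode."""
    if reg in BANKED.get(mode, ()):
        return f"{reg}_{mode}"
    return reg  # user and system modes share the unbanked set


class RegisterFile:
    def __init__(self):
        shared = [f"r{i}" for i in range(16)] + ["cpsr"]
        banked = [f"{reg}_{mode}" for mode, regs in BANKED.items() for reg in regs]
        self.values = {name: 0 for name in shared + banked}
        self.mode = "svc"  # mode after reset

    def read(self, reg):
        return self.values[physical_name(self.mode, reg)]

    def write(self, reg, value):
        self.values[physical_name(self.mode, reg)] = value & 0xFFFFFFFF

    def enter_exception(self, new_mode, return_address):
        """Switch mode, saving the old CPSR in the new mode's SPSR."""
        old_cpsr = self.values["cpsr"]
        self.mode = new_mode
        self.write("spsr", old_cpsr)
        self.write("r14", return_address)  # banked link register


regs = RegisterFile()
total_registers = len(regs.values)
banked_registers = sum(len(names) for names in BANKED.values())

regs.mode = "usr"
regs.write("r13", 0x8000)        # user stack pointer
regs.values["cpsr"] = 0x10       # sample user-mode status value
regs.enter_exception("irq", 0x1234)
```

After the simulated interrupt, IRQ mode sees its own r13 and r14, while the user-mode stack pointer is untouched and becomes visible again once the mode returns to user.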
Pipeline
The average rate of instruction execution per processor cycle is called instruction throughput
To speed up execution, RISC processors fetch the next instruction while executing the current one; this is known as the pipeline mechanism
Basic RISC pipeline stages are:
Fetch: loads the instruction to be executed from memory
Decode: identifies the instruction to be executed
Execute: processes the instruction and writes the result back to a register
[Figure: execution of a three-instruction sequence in the pipeline]
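The overlap between instructions can be sketched as a cycle-by-cycle schedule. The stage names follow the classic three-stage fetch, decode, execute pipeline; the function `pipeline_schedule` and the sample instruction names are hypothetical, and the model assumes no stalls.

```python
STAGES = ("fetch", "decode", "execute")


def pipeline_schedule(instructions):
    """Return, for each cycle, which instruction occupies each stage (or None)."""
    total_cycles = len(instructions) + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            # An instruction fetched in cycle c reaches stage `depth` in cycle c + depth.
            index = cycle - depth
            row[stage] = instructions[index] if 0 <= index < len(instructions) else None
        schedule.append(row)
    return schedule


timeline = pipeline_schedule(["ADD", "SUB", "CMP"])
for cycle, row in enumerate(timeline, start=1):
    print(cycle, row)
```

Once the pipeline is full (the third cycle here), one instruction completes per cycle even though each instruction takes three cycles from fetch to execute.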
Core Extensions
Core extensions are standard components placed next to the ARM core to provide extra functionality
They improve performance, manage resources, and provide flexibility in handling particular applications
Each ARM family has its own set of available extensions
There are three hardware extensions around the ARM core, depending on the ARM family:
Cache and tightly coupled memory (TCM)
Memory management
Coprocessor interface
UNDEF: occurs when the processor cannot decode an instruction
SWI: occurs when an SWI instruction is executed (used as the mechanism to invoke an OS routine)
PABT: occurs when the processor attempts to fetch an instruction from an address without the correct access permissions. The actual abort occurs in the decode stage.
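As a rough sketch, each exception can be mapped to its vector address and the mode the processor enters. The addresses are the standard low vector addresses from the ARM architecture; the entries RESET, DABT, IRQ, and FIQ are not listed on this slide and are included only to complete the table. The name `vector_for` is hypothetical.

```python
# exception name -> (vector address, mode entered)
VECTORS = {
    "RESET": (0x00, "svc"),
    "UNDEF": (0x04, "und"),
    "SWI": (0x08, "svc"),
    "PABT": (0x0C, "abt"),
    "DABT": (0x10, "abt"),
    # 0x14 is reserved
    "IRQ": (0x18, "irq"),
    "FIQ": (0x1C, "fiq"),
}


def vector_for(exception):
    """Look up where execution continues and which mode is entered."""
    return VECTORS[exception]


swi_address, swi_mode = vector_for("SWI")
```

An SWI therefore lands in supervisor mode, which is what makes it suitable as the entry point to operating system routines.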
Let's start.
ARM Families
ARM Instruction Set
ARM Families
ARM has designed a number of processors that are grouped into families according to the core they use.
The families are based on the ARM7, ARM9, ARM10, and ARM11 cores.
ARM7 Family
Von Neumann-style architecture
Three-stage pipeline
Executes the ARMv4T instruction set
Example: ARM7TDMI
Very popular core
Widely used in 32-bit embedded applications
Provides a very good performance-to-power ratio
ARM9 Family
Announced in 1997
Because of its five-stage pipeline, the ARM9 processor can run at higher clock frequencies than the ARM7 family. The extra stages improve the overall performance of the processor.
The memory system has been redesigned to follow the Harvard architecture.
The first processor in the ARM9 family was the ARM920T, which includes separate D + I caches and an MMU. This processor can be used by operating systems requiring virtual memory support.
The ARM940T includes smaller D + I caches and an MPU. The ARM940T is designed for applications that do not require a platform operating system.
Both the ARM920T and the ARM940T execute the architecture v4T instructions.
Two later variations, the ARM946E-S and the ARM966E-S, are based on the synthesizable ARM9E-S core. Both execute architecture v5TE instructions. They also support the optional embedded trace macrocell (ETM), which allows a developer to trace instruction and data execution in real time on the processor. This is important when debugging applications with time-critical segments.
The ARM946E-S includes TCM, cache, and an MPU. The sizes of the TCM and caches are configurable. This processor is designed for use in embedded control applications that require deterministic real-time response. In contrast, the ARM966E-S does not have the MPU and cache extensions but does have configurable TCMs.
The latest core in the ARM9 product line is the ARM926EJ-S synthesizable processor core, announced in 2000. It is designed for use in small portable Java-enabled devices such as 3G phones and personal digital assistants (PDAs).
The ARM926EJ-S is the first ARM processor core to include the Jazelle technology, which accelerates Java bytecode execution.
ARM10 Family
The ARM10, announced in 1999, was designed for performance. It extends the ARM9 pipeline to six stages. It also supports an optional vector floating-point (VFP) unit, which adds a seventh stage to the ARM10 pipeline. The VFP significantly increases floating-point performance and is compliant with the IEEE 754-1985 floating-point standard.
The ARM1020E is the first processor to use an ARM10E core. Like the ARM9E, it includes the enhanced E instructions. It has separate 32K D + I caches, an optional vector floating-point unit, and an MMU. The ARM1020E also has a dual 64-bit bus interface for increased performance.
The ARM1026EJ-S is very similar to the ARM926EJ-S but has both an MPU and an MMU. This processor has the performance of the ARM10 with the flexibility of an ARM926EJ-S.
Specialized Processors
StrongARM was originally co-developed by Digital Semiconductor and is now exclusively
Questions?
Mahmoud S.Khalifa
msmahmoud@PiTechnologies.net
Web: www.PiTechnologies.net
Facebook Page : PiTechnologies. Page
Twitter:PiTechnologiess