Arm Processor
ARM Architecture
Based on RISC architecture with
enhancements to meet requirements of
embedded applications
o A large uniform register file
o Load/store architecture
o Uniform and fixed length instructions
o 32 bit processor
o Good speed/power consumption ratio
o High code density
Load-store architecture
Instructions process (add, subtract and so on) only values
that are held in registers and always place the
result in a register
The only operations that touch memory are
o load instructions, which copy memory values into
registers
o store instructions, which copy register values into
memory
ARM instructions fall into one of the
following categories
1. Data processing instructions (use and
change only register values)
2. Data transfer instructions (load and store
instructions)
3. Control flow instructions: branch instructions,
branch and link instructions (subroutine calls)
and supervisor calls (similar to a software interrupt)
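The split between data processing and data transfer can be illustrated with a toy model. This is only a sketch in Python, not ARM code: the register dictionary, the memory contents and the helper names (ldr, add, store) are all hypothetical and chosen for illustration.

```python
# Toy model of a load-store machine: arithmetic touches registers only,
# and memory is reached only through explicit load and store steps.
regs = {f"r{i}": 0 for i in range(16)}
memory = {0x100: 7, 0x104: 5, 0x108: 0}  # hypothetical memory contents


def ldr(rd, addr):
    """Data transfer: copy a memory word into a register."""
    regs[rd] = memory[addr]


def store(rs, addr):
    """Data transfer: copy a register value into memory."""
    memory[addr] = regs[rs]


def add(rd, rn, rm):
    """Data processing: uses and changes register values only."""
    regs[rd] = (regs[rn] + regs[rm]) & 0xFFFFFFFF


# Adding two values that live in memory takes two loads, one add, one store.
ldr("r0", 0x100)
ldr("r1", 0x104)
add("r2", "r0", "r1")
store("r2", 0x108)
```

Note that add never sees an address: the operands have to be brought into registers first.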
ENHANCEMENTS TO BASIC RISC
FEATURES
Control over the ALU and shifter in every data
processing operation to maximize their
usage
Auto-increment and auto-decrement
addressing modes to optimize program loops
Load and store multiple instructions to
maximize data throughput
Conditional execution of instructions to
maximize execution throughput
ARM Architecture Versions
Version 1
o 26 bit addressing, no multiply or coprocessor
Version 2
o Includes 32-bit result multiply and coprocessor support
Version 3
o 32 bit addressing
Version 4
o Adds signed and unsigned halfword and signed byte load and store
instructions
Version 4T
o 16-bit Thumb compressed instruction set is introduced
Version 5T
o Superset of 4T adding new instructions
Version 5TE
o Adds signal processing extensions
Examples
o ARM6: v3
o ARM7: v3, ARM7TDMI: v4T
o StrongARM: v4
o ARM9E-S: v5TE
ARM9TDMI
o T –Thumb 16 bit compressed instruction set
o D – on chip Debug request
o M – enhanced Multiplier(yields 64 bit result)
o I – Embedded ICE hardware to give on-chip
breakpoint and watchpoint support
Overview: Core Data Path
Data items are placed in register file
o No data processing instructions directly manipulate
data in memory
Instructions typically use two source registers
and single result or destination register
A Barrel shifter on the data path can pre-process
data before it enters ALU
Increment/Decrement logic can update register
content for sequential access independent of
ALU
Basic ARM Organisation
Registers
General purpose registers hold either data or
address
All registers are of 32 bits
At any time 16 data registers and up to 2 status
registers are visible (user mode has no SPSR)
Data registers: r0 to r15
o Three registers r13, r14 and r15 perform special
functions
o r13: stack pointer
o r14: link register (where return address is stored
whenever a subroutine is called)
o r15: program counter
Depending on context, r13 and r14 can
also be used as general purpose registers (GPRs)
Any instruction that can use r0 can equally
be used with any other GPR (r1-r13)
In addition, there are two status registers
o CPSR: Current Program Status Register
o SPSR: Saved Program Status Register
Register r15 (PC)
• When the processor is executing in ARM
state
o All instructions are 32 bit wide
o All instructions are word aligned
o PC value is stored in bits[31:2] with bits [1:0]
undefined
Program Status Registers
Bits [31:28]: N Z C V
Bits [27:8]: undefined
Bits [7:5]: I F T
Bits [4:0]: mode
Fields: f (flags) = bits [31:24], s (status) = bits [23:16],
x (extension) = bits [15:8], c (control) = bits [7:0]
• Condition code flags
– N = Negative result from ALU
– Z = Zero result from ALU
– C = ALU operation Carried out
– V = ALU operation oVerflowed
• Interrupt disable bits
– I = 1: Disables the IRQ
– F = 1: Disables the FIQ
• T bit
– Architecture xT only
– T = 0: Processor in ARM state
– T = 1: Processor in Thumb state
• Mode bits
– Specify the processor mode
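The bit positions above can be turned into a small decoder. The following Python sketch is illustrative only: the function name decode_cpsr is hypothetical, and the five-bit mode encodings in MODES are the standard ARM values, which the slides themselves do not list.

```python
# Mode encodings are an assumption added for illustration (standard ARM
# values for bits [4:0]); they do not appear in the slides.
MODES = {
    0b10000: "user",
    0b10001: "fiq",
    0b10010: "irq",
    0b10011: "supervisor",
    0b10111: "abort",
    0b11011: "undefined",
    0b11111: "system",
}


def decode_cpsr(cpsr):
    """Split a 32-bit status register value into its named fields."""
    return {
        "N": (cpsr >> 31) & 1,  # negative
        "Z": (cpsr >> 30) & 1,  # zero
        "C": (cpsr >> 29) & 1,  # carry
        "V": (cpsr >> 28) & 1,  # overflow
        "I": (cpsr >> 7) & 1,   # 1 = IRQ disabled
        "F": (cpsr >> 6) & 1,   # 1 = FIQ disabled
        "T": (cpsr >> 5) & 1,   # 0 = ARM state, 1 = Thumb state
        "mode": MODES.get(cpsr & 0x1F, "unknown"),
    }


fields = decode_cpsr(0x600000D3)
```

For the sample value 0x600000D3, the decoder reports Z and C set, both interrupts disabled, ARM state and supervisor mode.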
Processor modes
Processor modes determine
o Which registers are active and
o Access rights to CPSR register itself
Each processor mode is either
o Privileged: full read-write access to the CPSR
o Non-privileged: only read access to the
control field of the CPSR but read-write
access to the condition flags
ARM has seven modes
o Privileged: abort, fast interrupt request,
interrupt request, supervisor, system and
undefined
o Non-privileged: user
User mode is used for programs and
applications
Privileged modes
Abort: entered after a failed attempt to
access memory
Fast interrupt request (FIQ) and interrupt
request (IRQ): correspond to the two interrupt
levels available on ARM
Supervisor mode: state after reset and
generally the mode in which OS kernel
executes
System mode: special version of user
mode that allows full read-write access of
CPSR
Undefined: When processor encounters
an undefined instruction
Banked Registers
Register file contains 37 registers in all
o 20 registers are hidden from the program at
different times. These registers are called
banked registers
o Banked registers are available only when the
processor is in a particular mode
• Privileged modes (other than system mode) have
a set of associated banked registers that are a
subset of the 16 main registers
• Each banked register maps one-to-one onto a user mode register
Register Banking
[Figure: register banking. Shows the currently visible registers (r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr, spsr) for each mode (User, FIQ, IRQ, SVC, Undef, Abort), the user registers that are replaced by banked registers, and the registers that are banked out.]
SPSR
• Each privileged mode (except system
mode) has an associated SPSR
(Saved Program Status Register)
• The SPSR is used to save the state of the
CPSR (Current Program Status Register)
when the privileged mode is entered, so
that the user state can be fully
restored when the user process is
resumed
ARM Registers
System & User   FIQ         Supervisor  Abort       IRQ         Undefined
R0              R0          R0          R0          R0          R0
R1              R1          R1          R1          R1          R1
R2              R2          R2          R2          R2          R2
R3              R3          R3          R3          R3          R3
R4              R4          R4          R4          R4          R4
R5              R5          R5          R5          R5          R5
R6              R6          R6          R6          R6          R6
R7              R7          R7          R7          R7          R7
R8              R8_fiq      R8          R8          R8          R8
R9              R9_fiq      R9          R9          R9          R9
R10             R10_fiq     R10         R10         R10         R10
R11             R11_fiq     R11         R11         R11         R11
R12             R12_fiq     R12         R12         R12         R12
R13             R13_fiq     R13_svc     R13_abt     R13_irq     R13_und
R14             R14_fiq     R14_svc     R14_abt     R14_irq     R14_und
R15 (PC)        R15 (PC)    R15 (PC)    R15 (PC)    R15 (PC)    R15 (PC)
CPSR            CPSR        CPSR        CPSR        CPSR        CPSR
-               SPSR_fiq    SPSR_svc    SPSR_abt    SPSR_irq    SPSR_und
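The table can be read as a mapping from (mode, register number) to a physical register. The Python sketch below encodes that mapping and counts the distinct registers. The names BANKED, physical_register and all_registers are hypothetical, and the short mode names (usr, sys) are assumed to match the suffixes used in the table.

```python
# Register numbers that each mode replaces with its own banked copy.
BANKED = {
    "fiq": range(8, 15),  # r8-r14
    "svc": (13, 14),
    "abt": (13, 14),
    "irq": (13, 14),
    "und": (13, 14),
}
MODES = ("usr", "sys", "fiq", "svc", "abt", "irq", "und")


def physical_register(mode, n):
    """Name of the physical register that rN refers to in the given mode."""
    if n in BANKED.get(mode, ()):
        return f"R{n}_{mode}"
    return f"R{n}"  # shared with user mode (always the case for r15)


def all_registers():
    """Every distinct register in the file, status registers included."""
    names = {physical_register(m, n) for m in MODES for n in range(16)}
    names.add("CPSR")
    names |= {f"SPSR_{m}" for m in BANKED}  # one SPSR per banked mode
    return names


total = len(all_registers())
```

Counting this way gives 16 user registers, 15 banked copies, the CPSR and 5 SPSRs, which matches the figure of 37 registers with 20 of them hidden at any one time.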
Mode changing
• The mode changes when software writes directly to
the CPSR or when hardware responds to an
exception (any condition that
halts the normal sequential
execution of instructions) or an interrupt
• To return to user mode, a special return
instruction is used that instructs the core to
restore the original CPSR and banked
registers
ARM Memory organisation
Can be configured as little endian or big
endian
"Little Endian" means that the low-order byte of the
number is stored in memory at the lowest address, and
the high-order byte at the highest address. (The little end
comes first.) For example, a 4-byte long integer made up of Byte3 Byte2
Byte1 Byte0 (Byte3 most significant) will be arranged in memory as follows:
o Base Address+0 Byte0
o Base Address+1 Byte1
o Base Address+2 Byte2
o Base Address+3 Byte3
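The layout above can be checked with Python's standard struct module. This is a minimal sketch: the sample value 0x0A0B0C0D is arbitrary, and the big-endian line is included only for contrast.

```python
import struct

value = 0x0A0B0C0D  # Byte3 = 0x0A, Byte2 = 0x0B, Byte1 = 0x0C, Byte0 = 0x0D

# "<I" packs a 32-bit unsigned integer little-endian, ">I" big-endian.
little = struct.pack("<I", value)
big = struct.pack(">I", value)

# Index 0 corresponds to Base Address+0, index 3 to Base Address+3.
little_layout = [hex(b) for b in little]
big_layout = [hex(b) for b in big]
```

In the little-endian layout the byte at offset 0 is Byte0 (0x0D); in the big-endian layout it is Byte3 (0x0A).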
o MVN Rd, N
o Moves the bitwise NOT of the 32-bit source
value N into Rd
Using Barrel Shifter
Enables shifting a 32-bit operand in one of
the source registers left or right by a
specified number of positions within the cycle
time of the instruction
Basic barrel shifter operations
o Shift left, shift right, rotate right
Facilitates fast multiplication and division by
powers of two and increases code density
Example: MOV r7, r5, LSL #2 (r7 = r5 shifted left by 2, i.e. r5 * 4)
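The basic shifter operations can be modelled on 32-bit values as follows. This is a behavioural sketch in Python rather than a description of the hardware; the function names simply follow the ARM mnemonics LSL, LSR, ASR and ROR, and shift amounts are assumed to be in the range 0 to 31.

```python
MASK = 0xFFFFFFFF  # keep every result within 32 bits


def lsl(x, n):
    """Logical shift left: multiplies by 2**n, discarding overflow bits."""
    return (x << n) & MASK


def lsr(x, n):
    """Logical shift right: unsigned division by 2**n."""
    return (x & MASK) >> n


def asr(x, n):
    """Arithmetic shift right: the sign bit is copied into vacated bits."""
    x &= MASK
    if x & 0x80000000:
        x -= 1 << 32  # reinterpret as a negative two's complement value
    return (x >> n) & MASK


def ror(x, n):
    """Rotate right: bits shifted out at the bottom re-enter at the top."""
    n %= 32
    x &= MASK
    return ((x >> n) | (x << (32 - n))) & MASK


r5 = 5
r7 = lsl(r5, 2)  # models MOV r7, r5, LSL #2
```

With r5 = 5 the modelled MOV leaves 20 in r7, i.e. a multiplication by four done entirely by the shift.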
Arithmetic instructions
Implement 32-bit addition and subtraction
3-operand form
Examples
SUB r0, r1, r2
Subtract the value stored in r2 from that of
r1 and store the result in r0
SUBS r1, r1, #1
Subtract 1 from r1, store the result in r1
and update the condition flags, including Z and C
(the S suffix requests the flag update)
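How a flag-setting subtract (the S-suffixed form, SUBS) derives the condition flags can be sketched in Python. The function name subs is hypothetical. On ARM the C flag after a subtraction is set when no borrow occurs, which the sketch models as an unsigned comparison.

```python
MASK = 0xFFFFFFFF


def subs(rn, op2):
    """Model of SUBS Rd, Rn, op2: returns (result, flags)."""
    rn &= MASK
    op2 &= MASK
    result = (rn - op2) & MASK
    flags = {
        "N": result >> 31,                        # bit 31 of the result
        "Z": int(result == 0),                    # result is zero
        "C": int(rn >= op2),                      # 1 = no borrow occurred
        "V": ((rn ^ op2) & (rn ^ result)) >> 31,  # signed overflow
    }
    return result, flags


r1, flags = subs(1, 1)  # models SUBS r1, r1, #1 with r1 = 1
```

With r1 = 1 beforehand the result is 0 and the Z and C flags end up set, while N and V stay clear.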
With Barrel shifter
Use of barrel shifter with arithmetic and
logical instructions increases the set of
possible available operations
Example
o ADD r0, r1, r1, LSL #1
o Register r1 is shifted left by 1, then
added to r1, and the result (3 times r1) is
stored in r0.
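The same pattern gives multiplication by any constant of the form 2^n + 1 in a single instruction. A small Python sketch of the idea, where add_shifted is a hypothetical helper modelling ADD Rd, Rn, Rm, LSL #shift:

```python
MASK = 0xFFFFFFFF


def add_shifted(rn, rm, shift):
    """Model of ADD Rd, Rn, Rm, LSL #shift on 32-bit registers."""
    shifted = (rm << shift) & MASK  # barrel shifter acts on the second operand
    return (rn + shifted) & MASK    # ALU then performs the addition


r1 = 7
r0 = add_shifted(r1, r1, 1)  # r1 + 2*r1 = 3*r1
r2 = add_shifted(r1, r1, 2)  # r1 + 4*r1 = 5*r1
```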
Multiply instructions
• Multiply the contents of a pair of registers
Long multiply generates a 64-bit result
Examples
MUL r0, r1, r2 (r0 = r1 * r2, low 32 bits)
UMULL r0, r1, r2, r3 (r1:r0 = r2 * r3, unsigned 64-bit result)
Number of cycles taken for execution of a
multiply instruction depends upon the
processor implementation
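The difference between the 32-bit and the long multiply can be sketched in Python. The function names follow the mnemonics but are otherwise hypothetical; umull returns the low and high words in the order the instruction names its destination registers (RdLo, RdHi).

```python
MASK = 0xFFFFFFFF


def mul(rm, rs):
    """Model of MUL Rd, Rm, Rs: only the low 32 bits of the product are kept."""
    return (rm * rs) & MASK


def umull(rm, rs):
    """Model of UMULL RdLo, RdHi, Rm, Rs: full unsigned 64-bit product."""
    product = (rm & MASK) * (rs & MASK)
    return product & MASK, product >> 32  # (RdLo, RdHi)


low_only = mul(0xFFFFFFFF, 2)
lo, hi = umull(0xFFFFFFFF, 2)
```

For 0xFFFFFFFF * 2, MUL keeps 0xFFFFFFFE and loses the carry into bit 32, while UMULL preserves it in the high word.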
Multiply and Accumulate
• Result of multiplication can be
accumulated with content of another
register
– MLA Rd, Rm, Rs, Rn
• Rd = (Rm * Rs) + Rn
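A Python sketch of the formula, assuming as for MUL that only the low 32 bits of the result are kept (the function name mla is hypothetical):

```python
MASK = 0xFFFFFFFF


def mla(rm, rs, rn):
    """Model of MLA Rd, Rm, Rs, Rn: Rd = (Rm * Rs) + Rn, truncated to 32 bits."""
    return (rm * rs + rn) & MASK


# Accumulating a dot product, the typical use of multiply-accumulate.
acc = 0
for a, b in [(1, 2), (3, 4), (5, 6)]:
    acc = mla(a, b, acc)
```

The loop accumulates 1*2 + 3*4 + 5*6 = 44 with one modelled instruction per term.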