MCES_Unit_1_2_ARM-Instruction-set_2023
Introduction
Architecture
Programmers Model
Instruction Set
History of ARM
• ARM (Acorn RISC Machine) started as a new, powerful, CPU design for the
replacement of the 8-bit 6502 in Acorn Computers (Cambridge, UK, 1985)
• First models had only a 26-bit program counter, limiting the memory space
to 64 MB (not much by today's standards, but a lot at that time).
• 1990 spin-off: ARM renamed Advanced RISC Machines
• ARM now focuses on Embedded CPU cores
• IP licensing: Almost every silicon manufacturer sells some microcontroller
with an ARM core. Some even compete with their own designs.
• Processing power with low current consumption
• Good MIPS/Watt figure
• Ideal for portable devices
• Compact memories: 16-bit opcodes (Thumb)
• New cores with added features
• Harvard architecture (ARM9, ARM11, Cortex)
• Floating point arithmetic
• Vector computing (VFP, NEON)
• Java language (Jazelle)
Facts
• 32-bit CPU
• 3-operand instructions (typical): ADD Rd,Rn,Operand2
• RISC design…
• Few, simple, instructions
• Load/store architecture (instructions operate on registers, not memory)
• Large register set
• Pipelined execution
• … Although with some CISC touches…
• Multiplication and Load/Store Multiple are complex instructions (many cycles
longer than regular, RISC, instructions)
• … And some very specific details
• No hardware stack for subroutine calls: the return address is saved in the link register (R14) instead
• PC as a regular register
• Conditional execution of all instructions
• Flags altered or not by data processing instructions (selectable)
• Concurrent shifts/rotations (performed at the same time as other processing)
• …
ARM7TDMI Block Diagram
[Figure: ARM7TDMI block diagram. Blocks: address register and address incrementer; register bank (including the PC); instruction decoder with control lines; multiplier; shifter; ALU; instruction register; Thumb to ARM translator; write data register; read data register. Buses: A[31:0], D[31:0], PC bus, ALU bus, A bus, B bus.]
ARM Pipelining examples
ARM7TDMI Pipelining (I)
ARM7TDMI Pipelining (II)
• More complex instructions:
Data Sizes and Instruction Sets
Processor Modes
The Registers
The ARM Register Set
[Figure: the ARM register set, including the cpsr and the banked spsr registers]
Special Registers
Special function registers:
• PC (R15): Program Counter. Any instruction with the PC as its destination register is a program branch.
• SP (R13): Stack Pointer. The ARM architecture has no dedicated hardware stack. Even so, R13 is usually reserved as a pointer for the program-managed stack.
• CPSR: Current Program Status Register. Holds the visible status register.
• SPSR: Saved Program Status Register. Holds a copy of the previous status register while executing exception or interrupt routines.
  - It is copied back to the CPSR on return from the exception or interrupt.
  - No SPSR is available in User or System modes.
Register Organization Summary
Register   User/SYS   FIQ      IRQ      SVC      Undef    Abort
r0-r7      r0-r7      (User)   (User)   (User)   (User)   (User)
r8-r12     r8-r12     banked   (User)   (User)   (User)   (User)
r13 (sp)   r13        banked   banked   banked   banked   banked
r14 (lr)   r14        banked   banked   banked   banked   banked
r15 (pc)   r15        (User)   (User)   (User)   (User)   (User)
cpsr       cpsr       (User)   (User)   (User)   (User)   (User)
spsr       none       spsr     spsr     spsr     spsr     spsr
(User) = the mode uses the User mode register; banked = the mode has its own copy.
Program Status Registers
Bits    31 30 29 28   27 ... 8    7  6  5   4 ... 0
Field   N  Z  C  V    undefined   I  F  T   mode
Byte fields: f = bits 31-24, s = bits 23-16, x = bits 15-8, c = bits 7-0
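The bit layout above can be sketched as a small decoder. This is only an illustrative model: the function name `decode_psr` and the `MODES` table are hypothetical helpers, and the mode encodings are taken from the ARM architecture reference rather than from the surviving slide text.

```python
# Mode field encodings M[4:0] (assumed from the ARM architecture reference).
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_psr(value):
    """Split a 32-bit PSR value into the flag, control and mode fields."""
    return {
        "N": (value >> 31) & 1,  # negative
        "Z": (value >> 30) & 1,  # zero
        "C": (value >> 29) & 1,  # carry
        "V": (value >> 28) & 1,  # overflow
        "I": (value >> 7) & 1,   # IRQ disable
        "F": (value >> 6) & 1,   # FIQ disable
        "T": (value >> 5) & 1,   # Thumb state
        "mode": MODES.get(value & 0x1F, "reserved"),
    }


if __name__ == "__main__":
    # Z and C set, IRQ and FIQ disabled, ARM state, Supervisor mode
    print(decode_psr(0x600000D3))
```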
Program Counter (R15)
Exception Handling
Agenda
Introduction
Architecture
Programmers Model
Instruction Set (for ARM state)
Conditional Execution and Flags
Condition Codes
Advantages of conditional execution
Examples of conditional execution
Use a sequence of several conditional instructions:
if (a==0) func(1);
    CMP   r0,#0
    MOVEQ r0,#1
    BLEQ  func
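To make the role of the flags concrete, the sequence above can be modelled in a few lines. This is a rough sketch, not a processor model: `condition_passed` and `call_if_zero` are hypothetical names, and the condition table follows the standard ARM condition code definitions, which are not part of the surviving slide text.

```python
def condition_passed(cond, n, z, c, v):
    """Evaluate an ARM condition mnemonic against the N, Z, C, V flags."""
    table = {
        "EQ": z == 1, "NE": z == 0,
        "CS": c == 1, "CC": c == 0,
        "MI": n == 1, "PL": n == 0,
        "VS": v == 1, "VC": v == 0,
        "HI": c == 1 and z == 0, "LS": c == 0 or z == 1,
        "GE": n == v, "LT": n != v,
        "GT": z == 0 and n == v, "LE": z == 1 or n != v,
        "AL": True,
    }
    return table[cond]


def call_if_zero(a):
    """Model of CMP r0,#0 / MOVEQ r0,#1 / BLEQ func.

    Returns the argument func would receive, or None if func is not called.
    """
    r0 = a & 0xFFFFFFFF
    # CMP r0,#0 computes r0 - 0 and only updates the flags.
    n = r0 >> 31
    z = int(r0 == 0)
    c = 1  # subtracting zero never borrows, so C is set
    v = 0
    called_with = None
    if condition_passed("EQ", n, z, c, v):  # MOVEQ r0,#1
        r0 = 1
    if condition_passed("EQ", n, z, c, v):  # BLEQ func
        called_with = r0
    return called_with


if __name__ == "__main__":
    print(call_if_zero(0))  # func is called with 1
    print(call_if_zero(5))  # func is not called
```

Neither MOVEQ nor BLEQ changes the flags, so both test the result of the single CMP, which is why one comparison can guard the whole sequence.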
Data processing Instructions
Consist of:
Arithmetic: ADD ADC SUB SBC RSB RSC
Logical: AND ORR EOR BIC
Comparisons: CMP CMN TST TEQ
Data movement: MOV MVN
Immediate value:
• 8-bit number, with a range of 0-255
• Rotated right through an even number of positions
• Allows an increased range of 32-bit constants to be loaded directly into registers
[Figure: data path through the ALU to the Result]
The Barrel Shifter
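LSL: Logical Shift Left. Zeros enter at the least significant end; the last bit shifted out goes to the carry flag (CF ← Destination ← 0).
ASR: Arithmetic Shift Right. The sign bit is preserved; the last bit shifted out goes to the carry flag (Destination → CF).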
Immediate constants (1)
Bits    11 ... 8   7 ... 0
Field   rot        immed_8
The shifter applies ROR to immed_8 by (rot x 2) bit positions.
• The 4-bit rotate value (0-15) is multiplied by two to give a range of 0-30 in steps of 2.
• The rule to remember is “8 bits rotated right by an even number of bit positions”.
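The rot/immed_8 rule can be checked with a short sketch that searches for a valid encoding of a 32-bit constant. The helper names (`ror32`, `encode_immediate`, `decode_immediate`) are hypothetical, and the search simply returns the smallest rot that works, which is an assumption about tie-breaking rather than a statement about any particular assembler.

```python
MASK32 = 0xFFFFFFFF


def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def decode_immediate(rot, immed_8):
    """Value produced by the shifter: immed_8 rotated right by 2 * rot."""
    return ror32(immed_8, 2 * rot)


def encode_immediate(const):
    """Return (rot, immed_8) for const, or None if it cannot be encoded."""
    const &= MASK32
    for rot in range(16):
        # Undo the rotate right by rotating left by the same amount.
        immed_8 = ror32(const, (32 - 2 * rot) % 32)
        if immed_8 <= 0xFF:
            return rot, immed_8
    return None


if __name__ == "__main__":
    for value in (0xFF, 0xFF000000, 0x104, 0x101, 0x55555555):
        print(hex(value), encode_immediate(value))
```

Values such as 0x101 or 0x55555555 come back as None because their set bits do not fit in any 8-bit window at an even rotation.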
Loading 32 bit constants
The assembler pseudo-instruction LDR Rd,=const will either:
• Produce a MOV or MVN instruction to generate the value (if possible)
or
• Generate an LDR instruction with a PC-relative address to read the constant
from a literal pool (constant data area embedded in the code).
For example:
    LDR r0,=0xFF        =>  MOV r0,#0xFF
    LDR r0,=0x55555555  =>  LDR r0,[PC,#Imm12]
                            …
                            …
                            DCD 0x55555555
This is the recommended way of loading constants into a register.
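The choice the assembler makes for LDR Rd,=const can be sketched as follows. This is an illustration of the decision only, under the assumption that MVN is tried when the inverted constant fits an immediate; `fits_rotated_imm8` and `expand_ldr_pseudo` are hypothetical names, and the `#offset` placeholder stands for whatever PC-relative offset a real assembler would compute.

```python
MASK32 = 0xFFFFFFFF


def fits_rotated_imm8(value):
    """True if value is an 8-bit number rotated right by an even amount."""
    value &= MASK32
    for shift in range(0, 32, 2):
        rotated_left = ((value << shift) | (value >> (32 - shift))) & MASK32
        if rotated_left <= 0xFF:
            return True
    return False


def expand_ldr_pseudo(reg, const):
    """Return the instruction text a LDR reg,=const could expand to."""
    const &= MASK32
    inverted = ~const & MASK32
    if fits_rotated_imm8(const):
        return f"MOV {reg},#0x{const:X}"
    if fits_rotated_imm8(inverted):
        return f"MVN {reg},#0x{inverted:X}"
    # Fall back to a PC-relative load from a literal pool entry.
    return f"LDR {reg},[PC,#offset]  ; literal pool: DCD 0x{const:08X}"


if __name__ == "__main__":
    print(expand_ldr_pseudo("r0", 0xFF))
    print(expand_ldr_pseudo("r0", 0xFFFFFFFF))
    print(expand_ldr_pseudo("r0", 0x55555555))
```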
Data processing instr. FLAGS
Operations are:
ADD operand1 + operand2
ADC operand1 + operand2 + carry
SUB operand1 - operand2
SBC operand1 - operand2 + carry - 1
RSB operand2 - operand1
RSC operand2 - operand1 + carry - 1
Syntax:
<Operation>{<cond>}{S} Rd, Rn, Operand2
Examples
ADD r0, r1, r2
SUBGT r3, r3, #1
RSBLES r4, r5, #5
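The six arithmetic operations and the flags they would set with the S suffix can be modelled on one adder. This is a minimal sketch, assuming the usual two's complement identity that subtracting x equals adding NOT x plus 1; `add_with_carry` and `execute` are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF


def add_with_carry(x, y, carry_in):
    """Return (result, N, Z, C, V) for the 32-bit sum x + y + carry_in."""
    x &= MASK32
    y &= MASK32
    total = x + y + carry_in
    result = total & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(total > MASK32)  # carry out of bit 31
    # Signed overflow: inputs share a sign that the result does not have.
    v = (~(x ^ y) & (x ^ result)) >> 31 & 1
    return result, n, z, c, v


def execute(op, operand1, operand2, carry=0):
    """Model one arithmetic data processing operation."""
    not1 = ~operand1 & MASK32
    not2 = ~operand2 & MASK32
    if op == "ADD":
        return add_with_carry(operand1, operand2, 0)
    if op == "ADC":
        return add_with_carry(operand1, operand2, carry)
    if op == "SUB":  # operand1 - operand2
        return add_with_carry(operand1, not2, 1)
    if op == "SBC":  # operand1 - operand2 + carry - 1
        return add_with_carry(operand1, not2, carry)
    if op == "RSB":  # operand2 - operand1
        return add_with_carry(operand2, not1, 1)
    if op == "RSC":  # operand2 - operand1 + carry - 1
        return add_with_carry(operand2, not1, carry)
    raise ValueError(f"unknown operation {op}")


if __name__ == "__main__":
    print(execute("SUB", 5, 5))           # result 0 with Z and C set
    print(execute("ADD", 0x7FFFFFFF, 1))  # signed overflow sets V
```

Note that after a subtraction C set means "no borrow", which is why SBC and RSC add the carry and subtract one.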
Comparisons
Logical Operations
Operations are:
AND operand1 AND operand2
EOR operand1 EOR operand2
ORR operand1 OR operand2
BIC operand1 AND NOT operand2 [ie bit clear]
Syntax:
<Operation>{<cond>}{S} Rd, Rn, Operand2
Examples:
AND r0, r1, r2
BICEQ r2, r3, #7
EORS r1,r3,r0
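As a quick check of the four definitions, the operations can be written out on 32-bit values. This is a small sketch with a hypothetical helper name (`logical`); it ignores condition codes and flag updates.

```python
MASK32 = 0xFFFFFFFF


def logical(op, operand1, operand2):
    """Return the 32-bit result of one logical data processing operation."""
    operand1 &= MASK32
    operand2 &= MASK32
    if op == "AND":
        return operand1 & operand2
    if op == "EOR":
        return operand1 ^ operand2
    if op == "ORR":
        return operand1 | operand2
    if op == "BIC":  # bit clear: keep operand1 bits where operand2 is 0
        return operand1 & ~operand2 & MASK32
    raise ValueError(f"unknown operation {op}")


if __name__ == "__main__":
    # Same operation as BIC r2, r3, #7 with r3 = 0x1F: clears the low three bits.
    print(hex(logical("BIC", 0x1F, 7)))
```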
Data Movement
Operations are:
MOV operand2
MVN NOT operand2
Note that these make no use of operand1.
Syntax:
<Operation>{<cond>}{S} Rd, Operand2
Examples:
MOV r0, r1
MOVS r2, #10
MVNEQ r1,#0
Multiply
Syntax:
MUL{<cond>}{S} Rd, Rm, Rs                  Rd = Rm * Rs
MLA{<cond>}{S} Rd, Rm, Rs, Rn              Rd = (Rm * Rs) + Rn
[U|S]MULL{<cond>}{S} RdLo, RdHi, Rm, Rs    RdHi,RdLo := Rm * Rs
[U|S]MLAL{<cond>}{S} RdLo, RdHi, Rm, Rs    RdHi,RdLo := (Rm * Rs) + RdHi,RdLo
Cycle time
Basic MUL instruction
2-5 cycles on ARM7TDMI
1-3 cycles on StrongARM/XScale
2 cycles on ARM9E/ARM102xE
+1 cycle for ARM9TDMI (over ARM7TDMI)
+1 cycle for accumulate (not on 9E though result delay is one cycle longer)
+1 cycle for “long”
Above are “general rules” - refer to the TRM for the core you are using
for the exact details
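The results of the multiply forms listed under Syntax can be sketched on plain integers. This models only the register values, not the cycle counts; the function names mirror the mnemonics but are hypothetical helpers, and the long forms return the pair in (RdLo, RdHi) order.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's complement signed number."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def mul(rm, rs):
    """MUL: low 32 bits of Rm * Rs."""
    return (rm * rs) & MASK32


def mla(rm, rs, rn):
    """MLA: low 32 bits of (Rm * Rs) + Rn."""
    return (rm * rs + rn) & MASK32


def umull(rm, rs):
    """UMULL: unsigned 64-bit product as (RdLo, RdHi)."""
    product = (rm & MASK32) * (rs & MASK32)
    return product & MASK32, product >> 32


def smull(rm, rs):
    """SMULL: signed 64-bit product as (RdLo, RdHi)."""
    product = (to_signed32(rm) * to_signed32(rs)) & MASK64
    return product & MASK32, product >> 32


def umlal(rd_lo, rd_hi, rm, rs):
    """UMLAL: unsigned product added to the 64-bit value RdHi,RdLo."""
    accumulator = ((rd_hi & MASK32) << 32) | (rd_lo & MASK32)
    total = (accumulator + (rm & MASK32) * (rs & MASK32)) & MASK64
    return total & MASK32, total >> 32


if __name__ == "__main__":
    lo, hi = umull(0xFFFFFFFF, 0xFFFFFFFF)
    print(hex(hi), hex(lo))
```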
Branch instructions
Bits    31 ... 28   27 26 25   24   23 ... 0
Field   Cond        1  0  1    L    Offset
• The processor core shifts the offset field left by 2 positions, sign-extends
it and adds it to the PC
• ± 32 Mbyte range
How to perform longer branches or absolute address branches?
solution: LDR PC,…
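The offset arithmetic and the ± 32 Mbyte limit can be worked through with a short sketch. It relies on one assumption that is not in the surviving slide text: in ARM state the PC reads as the address of the branch instruction plus 8. The function names are hypothetical.

```python
def encode_branch_offset(instr_addr, target):
    """Return the 24-bit Offset field for a B/BL at instr_addr."""
    # Assumption: the PC value used by the branch is instr_addr + 8.
    delta = target - (instr_addr + 8)
    if delta % 4:
        raise ValueError("target must be word aligned")
    words = delta // 4
    if not -(1 << 23) <= words < (1 << 23):
        raise ValueError("target is outside the +/- 32 Mbyte range")
    return words & 0xFFFFFF


def branch_target(instr_addr, offset_field):
    """Recover the target address from a 24-bit Offset field."""
    if offset_field & 0x800000:  # sign-extend the 24-bit field
        offset_field -= 1 << 24
    return (instr_addr + 8 + (offset_field << 2)) & 0xFFFFFFFF


if __name__ == "__main__":
    # A branch to itself needs an offset of -2 words.
    print(hex(encode_branch_offset(0x8000, 0x8000)))
```

A 24-bit signed word offset covers 2^23 words in each direction, which is where the 32 Mbyte figure comes from.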
Single register data transfer
Syntax:
LDR{<cond>}{<size>} Rd, <address>
STR{<cond>}{<size>} Rd, <address>
e.g. LDREQB
Address accessed
Pre or Post Indexed Addressing?
Pre-indexed: STR r0,[r1,#12]
[Figure: r1 (base register) = 0x200 and is still 0x200 afterwards; offset = 12; r0 (source register for STR) = 0x5 is stored at address 0x20c]
Base-update possible: LDM r10!,{r0-r6}
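The difference between the addressing forms can be modelled with a dictionary for registers and one for memory. This is only a sketch: `store_word` and the mode names are hypothetical, and the post-indexed and write-back forms are included from the standard ARM syntax, since only the pre-indexed example survives in the text above.

```python
def store_word(regs, memory, rd, rn, offset, mode):
    """Model STR with an immediate offset; returns the address written."""
    base = regs[rn]
    if mode == "pre":        # STR Rd,[Rn,#offset]   base unchanged
        address = base + offset
    elif mode == "pre_wb":   # STR Rd,[Rn,#offset]!  base updated
        address = base + offset
        regs[rn] = address
    elif mode == "post":     # STR Rd,[Rn],#offset   base updated after use
        address = base
        regs[rn] = base + offset
    else:
        raise ValueError(f"unknown mode {mode}")
    memory[address] = regs[rd]
    return address


if __name__ == "__main__":
    regs = {"r0": 0x5, "r1": 0x200}
    memory = {}
    store_word(regs, memory, "r0", "r1", 12, "pre")
    print(hex(regs["r1"]), {hex(k): hex(v) for k, v in memory.items()})
```

With the values from the pre-indexed example (r0 = 0x5, r1 = 0x200, offset 12) the word lands at 0x20c and r1 keeps its value; the other two modes leave r1 at 0x20c.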
Atomic data swap