Module - 1 - 2 - ESD - 2023 PDF
Module I
ARM stands for Advanced RISC Machine; the name originally stood for Acorn RISC Machine. ARM was formed in 1990 as Advanced RISC Machines Ltd., a joint venture of Apple Computer, Acorn Computer Group, and VLSI Technology. In 1991, ARM introduced the ARM6 processor family, which belongs to the family of Reduced Instruction Set Computing (RISC) architectures. Architecture versions ARMv3 to ARMv7 support the 32-bit ARM instruction set. Later came processors with the Thumb-2 instruction set, which supports a mix of 16-bit and 32-bit instructions.
ARM is one of the most widely used processor cores. Some notable applications: the ARM7 was used in the iPod, the ARM9 in BenQ devices, and the ARM11 in Sony Ericsson and Nokia devices.
Earlier ARM processors can work in either ARM mode (32-bit instructions) or Thumb mode (16-bit instructions). While in one mode, the processor cannot execute instructions of the other mode. Because mode switching consumed time, ARM developed the ARM Cortex™-M3, which is based on the Thumb-2 architecture.
Embedded Systems (18EC62) Dr Archana R Kulkarni, Asso Prof, ECE Dept , RNSIT
The ARM Cortex™-M3 processor, the first of the Cortex generation of processors released by ARM in 2006, was primarily designed to target the 32-bit microcontroller market. The Cortex-M3 processor provides excellent performance at low gate count and comes with many new features previously available only in high-end processors.
Cortex-M3 processor-based microcontrollers can be easily programmed in the C language and are based on a well-established architecture, so application code can be ported and reused easily, reducing development time and testing costs.
The Cortex-M3 addresses the requirements of the 32-bit embedded processor market in the following ways:
• Excellent performance efficiency: With the Thumb-2 architecture, switching between ARM and Thumb modes is avoided, so processing speed increases without raising the clock frequency or power requirements.
• Low power consumption: The Cortex-M3 follows a RISC architecture and hence consumes less power, enabling longer battery life, which is especially critical in hand-held and portable products, including wireless networking applications.
• Enhanced determinism: The Cortex-M3 has an efficient interrupt architecture, guaranteeing that critical tasks are serviced as quickly as possible and in a known number of cycles.
• Improved code density: A program can mix 16-bit and 32-bit instructions, so code fits in even the smallest memory footprints.
• Ease of use: Code can easily be migrated from 8-bit/16-bit devices to 32-bit devices.
• Lower cost solutions: Owing to increased market demand and the large number of applications involving the Cortex-M3, the cost of a 32-bit system has come down to that of 8-bit and 16-bit devices, at around US $1.
• Wide choice of development tools: A rich set of software emulators and simulators (mostly open source) makes it easier to program and debug code. Full-featured development suites are also available from many development tool vendors.
• Wide choice of vendors: Philips, Texas Instruments, Atmel.
• The R profile is designed for high-end embedded systems for real-time application
• The M profile is designed for deeply embedded microcontroller-type systems.
1.2.4 Cortex M3 based microcontroller unit: Vendors like Philips license the Cortex-M3 CPU design from ARM and integrate it with memory, peripherals such as ADC and DAC, communication protocol units and the debug architecture on a single chip to produce a Cortex-M3 based microcontroller ready for any application.
[Figure: Cortex-M3 based microcontroller chip. The Cortex-M3 core is developed by ARM; the internal bus, peripherals and memory are developed by the chip manufacturer.]
The Cortex-M3 is a 32-bit processor with a 32-bit address bus and a 32-bit data bus. All registers are 32 bits wide. The memory follows the Harvard architecture with a total address space of 4GB, and separate buses connect the code and data memory regions. Because instructions and data travel over separate buses, an instruction fetch and a data access can take place at the same time, improving the performance of the system. The optional Memory Protection Unit (MPU) defines the access rules and attributes (such as cacheability) of memory regions, which matters when an external cache is attached to the processor. The Cortex-M3 supports both little-endian and big-endian memory systems.
The Cortex-M3 has a three-stage pipeline. The three stages are the instruction fetch unit, which reads the instruction from code memory; the decode unit, which decodes the instruction; and the execution unit, which is made up of a 32-bit ALU and the general purpose registers. The 32-bit ALU can perform 8/16/32-bit arithmetic and logical operations. The ALU is supported by general purpose and special purpose registers.
31 0
R0 General Purpose
R1 General Purpose
R2 General Purpose
R3 General Purpose
R4 General Purpose
R5 General Purpose
R6 General Purpose
R7 General Purpose
R8 General Purpose
R9 General Purpose
R10 General Purpose
R11 General Purpose
R12 General Purpose
R13 Stack Pointer
R14 Link Register
R15 Program Counter
i. PRIMASK: It is a 1-bit register. When this bit is set, all interrupts except NMI and Hard Fault are masked. Its default value is ‘0’, meaning no interrupt is masked.
ii. FAULTMASK: It is a 1-bit register. When this bit is set, all interrupts except NMI are masked, including fault exceptions. Its default value is ‘0’, meaning no interrupt is masked. It is used by an operating system to handle hung or crashed tasks. When a task crashes it may trigger further fault exceptions; the OS sets the FAULTMASK bit to disable all faults, which gives it time to clean up and recover from the crashed task.
iii. BASEPRI: It is a register of up to 8 bits. When a priority level is programmed into this register, all interrupts with the same or lower priority than this level are masked, and interrupts with a higher priority level can still execute. Writing 0 (the default) disables the masking.
Note: On the Cortex-M3, an interrupt with a smaller priority number has higher priority and an interrupt with a larger priority number has lower priority.
These special purpose registers can be accessed with special instructions MSR and MRS.
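The masking rules of the three registers can be summarised in a small C model. This is only a sketch under stated assumptions: the type `mask_regs_t`, the function `is_masked` and the constants `PRIO_NMI` and `PRIO_HARDFAULT` are hypothetical names invented for illustration (they are not CMSIS definitions), and the model ignores the priority of the code that is currently running.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the three mask registers (not a CMSIS type). */
typedef struct {
    bool    primask;    /* 1 = mask everything except NMI and Hard Fault */
    bool    faultmask;  /* 1 = mask everything except NMI */
    uint8_t basepri;    /* 0 = no masking, else mask priority values >= basepri */
} mask_regs_t;

/* Fixed priorities of NMI and Hard Fault on the Cortex-M3. */
enum { PRIO_NMI = -2, PRIO_HARDFAULT = -1 };

/* Returns true if an exception with the given priority value is blocked.
   A smaller priority value means a higher priority. */
bool is_masked(const mask_regs_t *m, int priority)
{
    if (priority == PRIO_NMI)       return false;  /* NMI is never masked */
    if (m->faultmask)               return true;
    if (priority == PRIO_HARDFAULT) return false;  /* only FAULTMASK blocks it */
    if (m->primask)                 return true;
    return m->basepri != 0 && priority >= m->basepri;
}
```

With `basepri` set to 0x40, for instance, a priority value of 0x80 or 0x40 is blocked while 0x20 still gets through.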
2. The control register: It is used to define the privilege level and to select the stack pointer. It has 2 bits: CONTROL[0] selects the privilege level in thread mode and CONTROL[1] selects the stack pointer.
When the processor is working in thread mode, the alternate stack pointer (PSP) can be used. In handler mode only the MSP is used, so CONTROL[1] must be zero.
Operating modes of ARM processor: The Cortex-M3 supports two operating modes, thread mode (user application code such as main) and handler mode (ISRs), and two privilege levels, user and privileged. In thread mode the processor can be at either the user or the privileged level, whereas in handler mode it is always privileged. By default, on reset the processor is in thread mode with privileged access.
                                     Privileged        User
When running an exception handler    Handler mode      -
                                     CONTROL[1]=0
When executing a user program        Thread mode       Thread mode
                                     CONTROL[0]=0      CONTROL[0]=1
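[Figure: Mode and privilege transitions. Start (reset) enters privileged thread mode; an exception enters privileged handler mode and exception exit returns to thread mode; CONTROL[0] = 0 selects privileged thread and CONTROL[0] = 1 selects user thread.]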
Exceptions are generally internally generated interrupts (such as faults, WDT and access violations). Interrupts are generated by external sources. The number of interrupts depends on the device; it is typically 16 to 32. All interrupts are vectored.
The Cortex-M3 processor includes an interrupt controller called the Nested Vectored Interrupt Controller (NVIC). It is closely coupled to the processor core and provides a number of features as follows:
• Nested interrupt support
• Vectored interrupt support
• Reduced interrupt latency
• Interrupt masking
The NVIC provides nested interrupt support. All the external interrupts and most of the system
exceptions can be programmed to different priority levels. When an interrupt occurs, the NVIC
compares the priority of this interrupt to the current running priority level. If the priority of the
new interrupt is higher than the current level, the interrupt handler of the new interrupt will
preempt the currently running task.
The Cortex-M3 processor has vectored interrupt support. When an interrupt is accepted, the
starting address of the interrupt service routine (ISR) is located from a vector table in memory.
There is no need to use software to determine and branch to the starting address of the ISR. Thus,
it takes less time to process the interrupt request.
The Cortex-M3 processor also includes a number of advanced features to lower the interrupt
latency. These include automatic saving and restoring some register contents, reducing delay in
switching from one ISR to another, and handling of late arrival interrupts.
Interrupt Masking
Interrupts and system exceptions can be masked based on their priority level or masked
completely using the interrupt masking registers BASEPRI, PRIMASK, and FAULTMASK.
They can be used to ensure that time-critical tasks finish on time without being
interrupted.
1.8 Cortex M3 memory: The Cortex M3 has 4GB memory with Harvard architecture. The
memory is divided into code region, SRAM, Peripherals, External RAM (optional), External
devices (optional) and System level memory.
Address range               Region
0xE0000000 - 0xFFFFFFFF     System Level
0xA0000000 - 0xDFFFFFFF     External devices
0x60000000 - 0x9FFFFFFF     External RAM
0x40000000 - 0x5FFFFFFF     Peripherals
0x20000000 - 0x3FFFFFFF     SRAM
0x00000000 - 0x1FFFFFFF     CODE
The ARM processor is connected to memory through buses that follow the AMBA specification. The bus arrangement is Harvard style (multiple buses running at different speeds) so that memory can be accessed in parallel. All buses are 32 bits wide.
Code memory (0.5GB): Used to store the program and data. It is connected by the code bus, which is made up of two buses: the I-Code bus to fetch instructions and the D-Code bus to read data. The Interrupt Vector Table is stored in code memory.
SRAM (0.5GB): Used for data storage. SRAM region is connected by system bus to read and
write from this memory.
Peripherals (0.5GB): All on chip peripherals registers are mapped here. Peripheral memory is
connected by system bus to read and write from this memory.
External RAM (1GB): An external memory of 4MB can be connected to Cortex M3. The address
of this external RAM lies in the range 0x60000000- 0x9FFFFFFF. External RAM memory is
connected by system bus to read and write from this memory.
External Peripheral (1GB): The registers of external peripherals are mapped in the region
0xA0000000 – 0xDFFFFFFF. External Peripheral memory is connected by system bus to read
and write from this memory.
System Level (0.5GB): On chip peripherals like Memory Protection Unit (MPU) and Debug Unit
registers are mapped. Nested Vector Interrupt Controller (NVIC) registers are mapped here.
System level memory is connected by system bus to read and write from this memory. It is also
connected by low speed peripheral bus to interact with debug components.
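The region boundaries listed above can be checked with a short C helper. This is a minimal sketch: the enum `cm3_region_t` and the function `cm3_region` are hypothetical names used only for illustration, and the boundaries are taken directly from the memory map in this section.

```c
#include <stdint.h>

/* Hypothetical names for the six regions of the Cortex-M3 memory map. */
typedef enum {
    REGION_CODE,
    REGION_SRAM,
    REGION_PERIPHERALS,
    REGION_EXTERNAL_RAM,
    REGION_EXTERNAL_DEVICES,
    REGION_SYSTEM_LEVEL
} cm3_region_t;

/* Classify a 32-bit address by comparing it with the top of each region. */
cm3_region_t cm3_region(uint32_t addr)
{
    if (addr <= 0x1FFFFFFFu) return REGION_CODE;
    if (addr <= 0x3FFFFFFFu) return REGION_SRAM;
    if (addr <= 0x5FFFFFFFu) return REGION_PERIPHERALS;
    if (addr <= 0x9FFFFFFFu) return REGION_EXTERNAL_RAM;
    if (addr <= 0xDFFFFFFFu) return REGION_EXTERNAL_DEVICES;
    return REGION_SYSTEM_LEVEL;
}
```

For example, `cm3_region(0x20000000u)` yields `REGION_SRAM` and `cm3_region(0xE000E100u)` yields `REGION_SYSTEM_LEVEL`.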
A stack is a data structure that follows last in, first out (LIFO) order. It is used to store the return address during a subroutine call and also to pass parameters. The Cortex-M3 has one stack pointer register, R13, but it is banked as two stack pointers (MSP, the Main Stack Pointer, and PSP, the Process Stack Pointer), so physically there can be two stacks. The MSP, also called SP_main, is used in handler mode and optionally in thread mode; the PSP, also called SP_process, can be used only in thread mode. Only one stack is in use at any time.
PUSH and POP are the two instructions that work on the stack. PUSH places 32-bit data on top of the stack and POP retrieves the topmost element from the stack.
PUSH {R6} ; subtracts 4 from SP and stores R6 at the new top of stack
POP {R4} ; pops the topmost element of the stack into R4 and adds 4 to SP
POP {R0, R2-R4} ; pops the topmost 4 elements of the stack into R0, R2, R3 and R4
The choice between the two stacks is made by the CONTROL[1] bit. If this bit is 0, the MSP is used in both thread and handler mode. If CONTROL[1] = 1, the MSP is used in handler mode and the PSP in thread mode.
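The way PUSH and POP move the stack pointer can be modelled in C. The sketch below assumes a full descending stack (SP is decremented by 4 before a word is stored and incremented by 4 after a word is read); `cm3_stack_t`, `stack_init`, `push` and `pop` are hypothetical names, and the model performs no overflow or underflow checks.

```c
#include <stdint.h>

/* Hypothetical model: 64 bytes of stack memory and a byte-offset SP. */
typedef struct {
    uint32_t mem[16];
    uint32_t sp;        /* byte offset into mem; 64 means the stack is empty */
} cm3_stack_t;

void stack_init(cm3_stack_t *s)
{
    s->sp = (uint32_t)sizeof s->mem;   /* SP starts just above the stack area */
}

/* PUSH: SP = SP - 4, then store the word at the new top of stack. */
void push(cm3_stack_t *s, uint32_t value)
{
    s->sp -= 4u;
    s->mem[s->sp / 4u] = value;
}

/* POP: read the word at the top of stack, then SP = SP + 4. */
uint32_t pop(cm3_stack_t *s)
{
    uint32_t value = s->mem[s->sp / 4u];
    s->sp += 4u;
    return value;
}
```

Pushing two words lowers `sp` from 64 to 56, and the values come back in reverse order when popped.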
When the Cortex-M3 is reset, it reads two words from the start of code memory. The word at address 0x00000000 holds the initial value of the main stack pointer (R13), so the user should place the starting address of the main stack there. The next word, at address 0x00000004, is the reset vector: it holds the starting address of the reset handler, which the processor loads into the PC to begin executing the actual program. Unlike traditional ARM processors, the vector table holds addresses, not branch instructions.
1.11 Memory Protection Unit (MPU): The Cortex-M3 has an optional MPU to protect OS-related routines from untrusted user programs. Regions of memory can be marked so that they are accessible only at the privileged level; an access from user code then triggers a fault exception.
1.12 Instruction set of ARM processor: The Cortex-M3 uses the Thumb-2 instruction set, which is a combination of 16-bit and 32-bit instructions. There are specialized instructions for multiply, WFE, WFI, etc.
Module II
This module covers the addressing modes and instruction set of the ARM processor, some useful instructions and the detailed memory structure, followed by assembly and C coding with an introduction to CMSIS.
The label is optional. The opcode specifies the mnemonic, such as ADD or BX. Each instruction can have one, two or three operands. The first operand is the destination and the remaining operands are sources. Comments are added to make the code more readable; they start with ‘;’.
Most ARM instructions can be executed conditionally. For example, ADDGT means add only if the previous comparison met the greater-than condition. The condition is optional.
Appending S to an instruction causes the status register flags to be updated after the instruction executes.
E.g. ADDS R0, R1, R2 ; Add the contents of R2 to R1, store the result in R0 and update the flags
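Immediate data is prefixed with the # symbol and can be written in decimal (default) or hexadecimal (prefixed with 0x or &). Data can also be given as an ASCII character. In memory all data is stored in binary. A 32-bit constant that cannot be encoded in a MOV instruction can be loaded with the LDR Rd, =value pseudo-instruction.
LDR R0, =0x1234567 ; 32-bit constant loaded with the LDR pseudo-instruction
MOV R1, #'D' ; ASCII character 'D'
To initialize a pointer, name the address with the EQU assembler directive, or give it directly, and load it with the LDR pseudo-instruction and the ‘=’ symbol:
Addr1 EQU 0x20000000
LDR R0, =Addr1
LDR R1, =0x20000000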
1. Define Constant Byte (DCB) : Used to define byte data like single or array of characters
2. Define Constant Data (DCD) : Used to define word data of 32 bit.
…..
3. Define Constant Instruction (DCI): Used to encode the instruction when assembler fails to
convert the instruction to its opcode.
There are different ways to specify the address of the operands for any given operations such as
load, add or branch. The different ways of determining the address of the operands are called
addressing modes. The different addressing modes of ARM processor are explained below
1. Register Addressing
Operand is given in the CPU register of ARM ( R0-R15). It can be only source, only destination
or both.
Examples Meaning
------------------------------------------------------------------------
ADD R1, R2, R3 ; Add the contents of R2 and R3 and store the result in R1
------------------------------------------------------------------------
2. Immediate Addressing
The operand is given directly in the instruction, prefixed with the # symbol. It can be written in
decimal or hexadecimal. Internally the immediate is always stored in binary.
Examples Meaning
------------------------------------------------------------------------
CMP R0, #22 ; Compare R0 content with immediate value 22
------------------------------------------------------------------------
ADD R1, R2, #18 ; Add immediate value 18 to R2 and store result in R1
-----------------------------------------------------------------------
MOV R1, #0xFF ; Copy immediate FFh to R1
------------------------------------------------------------------------
AND R0, R1, #0xFF000000 ; logically AND immediate FF000000h with R1 and store
; result in R0
------------------------------------------------------------------------
CMN R0, #6400 ; Compare negative: add immediate 6400 to R0 and update
; the N, Z, C and V flags (the result is discarded)
------------------------------------------------------------------------
CMPGT SP, R7, LSL #2 ; Compare SP with R7 shifted left by 2 bits (R7 x 4) if
; GT condition is met and update the N, Z, C and V flags
------------------------------------------------------------------------
Register indirect addressing means that the location of an operand is held in a register. It is also called indexed addressing or base addressing.
Register indirect addressing requires three read operations to access an operand: the instruction is read, the pointer register is read, and finally the memory location is read. It is very important because the pointer held in the register can be modified at run time. The address is therefore a variable, which allows access to data structures such as arrays.
For example, an instruction such as LDR R2, [R0] works as follows.
[If R0 is pointing to location 0x10000000 then the content (32-bit) of location 0x10000000 is copied to R2.]
Register indirect addressing with offset is used to read sequential data from structures such as arrays, tables and vectors. A pointer register holds the base address and an offset is added to obtain the effective address. For example, an instruction such as LDR R2, [R0, #4] loads R2 from the address R0 + 4 and leaves R0 unchanged.
Post-indexed addressing is similar to the above, but it first accesses the operand at the location pointed to by the base register and then increments the base register. For example, an instruction such as LDR R2, [R0], #4 loads R2 from the address in R0 and then adds 4 to R0.
Register R15 is the program counter. If R15 is used as the pointer register to access an operand, the resulting addressing mode is called PC-relative addressing. The operand is specified with respect to the current code location. For example, an instruction such as LDR R0, [PC, #8] loads R0 from an address 8 bytes beyond the current PC value.
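The difference between the register-based forms lies in when the base register is updated. The C sketch below models this on a small word-addressed memory. All names (`cm3_mem`, `store_word`, `load_word`, `ldr_offset`, `ldr_pre`, `ldr_post`) are hypothetical, addresses are assumed to be word aligned and below 256, and the pre-indexed form `[Rn, #off]!` is included for comparison with the offset and post-indexed forms.

```c
#include <stdint.h>

/* Hypothetical memory: 64 words at byte addresses 0..255. */
uint32_t cm3_mem[64];

void     store_word(uint32_t addr, uint32_t v) { cm3_mem[addr / 4u] = v; }
uint32_t load_word(uint32_t addr)              { return cm3_mem[addr / 4u]; }

/* LDR Rt, [Rn, #off] : offset addressing, Rn is left unchanged. */
uint32_t ldr_offset(const uint32_t *rn, int32_t off)
{
    return load_word(*rn + (uint32_t)off);
}

/* LDR Rt, [Rn, #off]! : pre-indexed, Rn is updated first, then used. */
uint32_t ldr_pre(uint32_t *rn, int32_t off)
{
    *rn += (uint32_t)off;
    return load_word(*rn);
}

/* LDR Rt, [Rn], #off : post-indexed, Rn is used first, then updated. */
uint32_t ldr_post(uint32_t *rn, int32_t off)
{
    uint32_t value = load_word(*rn);
    *rn += (uint32_t)off;
    return value;
}
```

Calling `ldr_post` repeatedly with an offset of 4 walks through an array one word at a time, which is the usual reason for choosing the post-indexed form.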
Note:
Most ARM instructions are conditional: {cond} can be EQ, NE, GT, etc.
In the syntax of ARM instructions, {} means optional, so the {cond} condition is optional.
{type} can be B (byte), SB (signed byte), H (halfword), SH (signed halfword) or D (doubleword). It is optional; the default is word.
The S suffix means update flags.
1. ADR
Generate PC-relative address.
Syntax
ADR{cond} Rd, label
where:
cond Is an optional condition code.
Rd Specifies the destination register.
label Is a PC-relative expression.
Operation
ADR generates an address by adding an immediate value to the PC, and writes the result to the destination register. Like a relative branch instruction, ADR makes it possible to generate position-independent code, because the address is PC-relative.
If ADR is used to generate a target address for a BX or BLX instruction, then bit[0] of the address generated must be set to 1 for correct execution. The label must be within the range of −4095 to +4095 from the address in the PC.
Restrictions
Rd must not be SP or PC.
Condition flags
This instruction does not change the flags.
Examples
ADR R1, SUM ; Write the address of the location labelled SUM to R1
Load and Store with immediate offset, pre-indexed immediate offset, or post-indexed
immediate offset.
Syntax
where:
op Is one of:
LDR Load Register.
STR Store Register.
Operation
LDR instructions load one or two registers with a value from memory.
STR instructions store one or two register values to memory.
The offset ranges for the immediate, pre-indexed and post-indexed forms are shown below.
Offset ranges
Instruction type                  Immediate offset        Pre-indexed             Post-indexed
Word, halfword, signed halfword,  −255 to 4095            −255 to 255             −255 to 255
byte, or signed byte
Two words                         multiple of 4 in the    multiple of 4 in the    multiple of 4 in the
                                  range −1020 to 1020     range −1020 to 1020     range −1020 to 1020
Condition flags
These instructions do not change the flags.
Examples
LDR R8, [R10] ; Load R8 from the address in R10
LDRNE R2, [R5, #960]! ; Conditionally (on a previous NE condition) load R2 from a word
; 960 bytes above the address in R5, and increment R5 by 960
STR R2, [R9, #10] ; Store the content of R2 at the effective address R9 + 10
STRH R3, [R4], #4 ; Store the lower 16 bits of R3 as a halfword at the address in R4,
; then increment R4 by 4
LDRD R8, R9, [R3, #0x20] ; Load R8 from a word 8 words above the address in R3, and load
; R9 from a word 9 words above the address in R3
STRD R0, R1, [R8], #-16 ; Store R0 at the address in R8, store R1 at a word 4 bytes above
; the address in R8, and then decrement R8 by 16
op Is one of:
LDR Load Register.
STR Store Register.
type cond same as previous instruction
Rt Specifies the register to load or store.
Rn Specifies the register on which the memory address is based.
Rm Specifies the register containing a value to be used as the offset.
LSL #n Is an optional shift, with n in the range 0 to 3.
Operation
The memory address to load from or store to is at an offset from the register Rn. The offset is
specified by the register Rm and can be shifted left by up to 3 bits using LSL.
The value to load or store can be a byte, halfword, or word. For load instructions, bytes and
halfwords can either be signed or unsigned.
Examples
STR R0, [R5, R1] ; Store the value of R0 at an address equal to the sum of R5 and R1
LDRSB R0, [R5, R1, LSL #1] ; Read a byte from an address equal to the sum of R5 and
; two times R1, sign extend it to a word and put it in R0
STR R0, [R1, R2, LSL #2] ; Store the value of R0 at an address equal to the sum of R1 and
; four times the value in R2
Operation
LDR loads a register with a value from a PC-relative memory address. The memory address is
specified by a label or by an offset from the PC.
The value to load or store can be a byte, halfword or word. For load instructions, bytes and
halfwords can either be signed or unsigned.
label must be within a limited range of the current instruction. Shown below are the possible
offsets between label and the PC.
Examples
LDR R0, LookUpTable ; Load R0 with a word of data from an address labelled as
;LookUpTable
LDRSB R7, localdata ; Load a byte value from an address labelled
; as localdata, sign extend it to a wordvalue, and put it in R7.
Incorrect examples
address and the highest numbered register using the highest memory address. Here SP is
incremented by 4 after the data is removed from the stack.
On completion, PUSH and POP update the SP register to point to the new top of stack.
Examples
PUSH {R0,R4-R7} ; Push R0,R4,R5,R6,R7 onto the stack
PUSH {R2,LR} ; Push R2 and the link-register onto the stack
POP {R0,R6,PC} ; Pop R0,R6 and PC from the stack, then branch to the new PC.
Operation
The ADD instruction adds the value of Operand2 or imm12 to the value in Rn.
The ADC instruction adds the values in Rn and Operand2, together with the carry flag.
The SUB instruction subtracts the value of Operand2 or imm12 from the value in Rn.
The SBC instruction subtracts the value of Operand2 from the value in Rn. If the carry flag is
clear, the result is reduced by one.
The RSB instruction subtracts the value in Rn from the value of Operand2. This is useful
because of the wide range of options for Operand2.
Use ADC and SBC to synthesize multiword arithmetic.
In all cases the result is stored in Rd.
Condition flags
If S is specified, these instructions update the N, Z, C and V flags according to the result.
Examples
ADD R2, R1, R3 ; Add the contents of R3 to R1 and store the result in R2
SUBS R8, R6, #240 ; Subtract 240 from the contents of R6, store the result in R8 and
; update the flags based on the result
RSB R4, R4, #1280 ; Subtract the contents of R4 from 1280 and store the result in R4
ADCHI R11, R0, R3 ; Add R0 and R3 with carry and store the result in R11. Executed only
; if the C flag is set and the Z flag is clear (HI)
Example below subtracts a 96-bit integer contained in R9, R1, and R11(MSB) from another
contained in R6, R2, and R8 (MSB). The example stores the result in R6, R9, and R2(MSB).
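Multiword subtraction of this kind chains the borrow from one word to the next through the carry flag. The C sketch below models that chain for a 96-bit value. It is an illustration under stated assumptions, not the original listing: `sbc32` and `sub96` are hypothetical names, each 96-bit value is stored as three words with index 0 least significant, and the carry flag follows the ARM convention that carry = 1 means no borrow occurred.

```c
#include <stdint.h>

/* Subtract with carry: computes a - b - (1 - carry_in) as a + ~b + carry_in.
   *carry_out is 1 when no borrow occurred, matching SUBS/SBCS on ARM. */
uint32_t sbc32(uint32_t a, uint32_t b, unsigned carry_in, unsigned *carry_out)
{
    uint64_t r = (uint64_t)a + (uint32_t)~b + (carry_in & 1u);
    *carry_out = (unsigned)(r >> 32);
    return (uint32_t)r;
}

/* out = x - y for 96-bit values held as three 32-bit words (index 0 = LSW). */
void sub96(const uint32_t x[3], const uint32_t y[3], uint32_t out[3])
{
    unsigned carry = 1u;   /* the first step behaves like SUBS: no incoming borrow */
    for (int i = 0; i < 3; i++) {
        out[i] = sbc32(x[i], y[i], carry, &carry);
    }
}
```

The first loop iteration corresponds to a SUBS on the least significant words, and the following iterations to SBCS and SBC on the higher words.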
Operation
The AND, EOR, and ORR instructions perform bitwise AND, Exclusive OR, and OR operations
on the values in Rn and Operand2.
The BIC instruction performs an AND operation on the bits in Rn with the complements of the
corresponding bits in the value of Operand2.
The ORN instruction performs an OR operation on the bits in Rn with the complements of the
corresponding bits in the value of Operand2.
Restrictions
Examples
Arithmetic Shift Right, Logical Shift Left, Logical Shift Right, Rotate Right and Rotate Right
with Extend.
Syntax
op{S}{cond} Rd, Rm, Rs
op{S}{cond} Rd, Rm, #n
RRX{S}{cond} Rd, Rm
where:
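op Is one of: ASR (Arithmetic Shift Right), LSL (Logical Shift Left), LSR (Logical Shift Right) or ROR (Rotate Right).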
S Is an optional suffix. If S is specified, the condition code flags are updated on the result of the
operation.
Rd Specifies the destination register.
Rm Specifies the register holding the value to be shifted.
Rs Specifies the register holding the shift length to apply to the value in Rm. Only the least
significant byte is used and can be in the range 0 to 255.
n Specifies the shift length. The range of shift length depends on the instruction:
Operation
ASR, LSL, LSR, and ROR move the bits in the register Rm to the left or right by the number of
places specified by constant n or register Rs.
RRX moves the bits in register Rm to the right by 1 and copies the carry flag into bit[31] of the result.
In all these instructions, the result is written to Rd, but the value in register Rm remains
unchanged.
Restrictions
Do not use SP and do not use PC.
Condition flags
If S is specified:
• these instructions update the N and Z flags according to the result
• the C flag is updated to the last bit shifted out, except when the shift length is 0.
Examples
ASR R7, R8, #9 ; Arithmetic shift right by 9 bits
LSLS R1, R2, #3 ; Logical shift left by 3 bits with flag update
LSR R4, R5, #6 ; Logical shift right by 6 bits
ROR R4, R5, R6 ; Rotate the value in R5 right by the number of bits given in
; the least significant byte of R6
RRX R4, R5 ; Rotate right with extend
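The effect of each shift and rotate operation on a 32-bit value can be written out in C. This is a sketch for illustration only: the function names are hypothetical, flag updates are omitted except for the carry that RRX consumes and produces, and shift lengths of 32 or more are handled explicitly because shifting a 32-bit value by 32 or more bits is undefined in C.

```c
#include <stdint.h>

/* Logical shift left: zeros enter from the right. */
uint32_t lsl(uint32_t v, unsigned n) { return n >= 32u ? 0u : v << n; }

/* Logical shift right: zeros enter from the left. */
uint32_t lsr(uint32_t v, unsigned n) { return n >= 32u ? 0u : v >> n; }

/* Arithmetic shift right: copies of the sign bit enter from the left. */
uint32_t asr(uint32_t v, unsigned n)
{
    uint32_t sign = (v & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    if (n == 0u)  return v;
    if (n >= 32u) return sign;
    return (v >> n) | (sign << (32u - n));
}

/* Rotate right: bits leaving on the right re-enter on the left. */
uint32_t ror(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v >> n) | (v << (32u - n)) : v;
}

/* Rotate right with extend: shift right by 1, old carry enters bit 31,
   and bit 0 of the input becomes the new carry. */
uint32_t rrx(uint32_t v, unsigned carry_in, unsigned *carry_out)
{
    *carry_out = (unsigned)(v & 1u);
    return (v >> 1) | ((uint32_t)(carry_in & 1u) << 31);
}
```

Note how `asr` keeps the sign of a negative value while `lsr` does not, which is the practical difference between the two right shifts.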
4. CLZ
Count Leading Zeros.
Syntax
CLZ{cond} Rd, Rm
Operation
The CLZ instruction counts the number of leading zeros in the value in Rm and returns the result
in Rd. The result value is 32 if no bits are set and zero if bit[31] is set.
Restrictions
Do not use SP and do not use PC.
Condition flags
This instruction does not change the flags.
Examples
CLZ R4, R9 ; Count the leading zeros in R9 and store the count in R4. If R9 = 0x456789, after execution R4 = 9
CLZNE R2,R3
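The result of CLZ can be reproduced with a simple bit scan in C. This is a minimal sketch; `clz32` is a hypothetical helper name, and the loop scans from bit 31 downwards until it meets the first set bit.

```c
#include <stdint.h>

/* Count leading zeros of a 32-bit value: 32 if no bit is set, 0 if bit 31 is set. */
uint32_t clz32(uint32_t v)
{
    uint32_t n = 0u;
    while (n < 32u && (v & (0x80000000u >> n)) == 0u) {
        n++;
    }
    return n;
}
```

For the value 0x456789 used in the example above, `clz32` returns 9.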
Operation
These instructions compare the value in a register with Operand2. They update the condition
flags on the result, but do not write the result to a register.
The CMP instruction subtracts the value of Operand2 from the value in Rn. This is the same as a
SUBS instruction, except that the result is discarded.
The CMN instruction adds the value of Operand2 to the value in Rn. This is the same as an
ADDS instruction, except that the result is discarded.
Restrictions
In these instructions:
• do not use PC
• Operand2 must not be SP.
Condition flags
These instructions update the N, Z, C and V flags according to the result.
Examples
CMP R2, R9 ; R2-R9 is performed and flags affected. R2, R9 retain old value.
CMN R0, #6400 ; R0+6400 is performed and flags affected. R0 retains old value.
CMPGT SP, R7, LSL #2
Examples
MOV R3, #0x4523 ; Write 0x4523 to R3 (the upper halfword becomes 0); APSR is unchanged.
MOV R7,R8 ;copies 32 bit number from R8 to R7
Operation
MOVT writes a 16-bit immediate value, imm16, to the top halfword, Rd[31:16], of its
destination register. The write does not affect Rd[15:0].
With MOV, MOVT instruction pair any 32-bit constant can be loaded to a register.
Restrictions
Rd must not be SP and must not be PC.
Condition flags
This instruction does not change the flags.
Examples
MOVT R3, #0xF123 ; Write 0xF123 to upper halfword of R3, lower halfword
; and APSR are unchanged.
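How a MOV and MOVT pair builds a full 32-bit constant can be shown in C. This is a sketch with hypothetical helper names: `movw` models writing a 16-bit immediate to a register (clearing the upper halfword) and `movt` models writing the top halfword while keeping the lower halfword.

```c
#include <stdint.h>

/* MOV Rd, #imm16: the register holds the 16-bit immediate, upper halfword is 0. */
uint32_t movw(uint16_t imm16)
{
    return imm16;
}

/* MOVT Rd, #imm16: replace Rd[31:16] with imm16, leave Rd[15:0] unchanged. */
uint32_t movt(uint32_t rd, uint16_t imm16)
{
    return (rd & 0x0000FFFFu) | ((uint32_t)imm16 << 16);
}
```

Combining the two examples from this section, `movt(movw(0x4523), 0xF123)` gives 0xF1234523.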
8. REV, REV16, REVSH, and RBIT Reverse bytes and Reverse bits.
where:
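op Is any of: REV (reverse byte order in a word), REV16 (reverse byte order in each halfword independently), REVSH (reverse byte order in the bottom halfword and sign extend to 32 bits), RBIT (reverse the bit order in a 32-bit word).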
Operation
Use these instructions to change endianness of data:
REV Converts 32-bit big-endian data into little-endian data or 32-bit little-endian data into big-
endian data.
REV16 Converts 16-bit big-endian data into little-endian data or 16-bit little-endian data into
big-endian data.
REVSH Converts either:
• 16-bit signed big-endian data into 32-bit signed little-endian data
• 16-bit signed little-endian data into 32-bit signed big-endian data.
Restrictions
Do not use SP and do not use PC.
Condition flags
These instructions do not change the flags.
Examples
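The four reversal operations can be expressed in C to make the byte and bit movements explicit. This is a sketch for illustration; the function names simply mirror the mnemonics and are not library calls.

```c
#include <stdint.h>

/* REV: reverse the order of the four bytes in a word. */
uint32_t rev(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* REV16: reverse the two bytes inside each halfword independently. */
uint32_t rev16(uint32_t v)
{
    return ((v >> 8) & 0x00FF00FFu) | ((v << 8) & 0xFF00FF00u);
}

/* REVSH: reverse the bytes of the bottom halfword, then sign extend to 32 bits. */
uint32_t revsh(uint32_t v)
{
    uint32_t h = ((v >> 8) & 0xFFu) | ((v & 0xFFu) << 8);
    return (h & 0x8000u) ? (h | 0xFFFF0000u) : h;
}

/* RBIT: reverse the order of all 32 bits. */
uint32_t rbit(uint32_t v)
{
    uint32_t r = 0u;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (v & 1u);
        v >>= 1;
    }
    return r;
}
```

For instance, `rev(0x12345678)` gives 0x78563412 while `rev16(0x12345678)` gives 0x34127856.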
Operation
The TST instruction performs a bitwise AND operation on the value in Rn and the value of
Operand2. This is the same as the ANDS instruction, except that it discards the result.
To test whether a bit of Rn is 0 or 1, use the TST instruction with an Operand2 constant that has
that bit set to 1 and all other bits cleared to 0.
The TEQ instruction performs a bitwise Exclusive OR operation on the value in Rn and the
value of Operand2. This is the same as the EORS instruction, except that it discards the result.
Use the TEQ instruction to test if two values are equal without affecting the V or C flags.
TEQ is also useful for testing the sign of a value. After the comparison, the N flag is the logical
Exclusive OR of the sign bits of the two operands.
Restrictions
Do not use SP and do not use PC.
Condition flags
These instructions:
• update the N and Z flags according to the result
• can update the C flag during the calculation of Operand2
• do not affect the V flag.
Examples
TST R0, #0x3F8 ; Perform bitwise AND of R0 value to 0x3F8
; APSR is updated but result is discarded
TEQEQ R10, R9 ; Conditionally test if value in R10 is equal to
; value in R9, APSR is updated but result is discarded.
Restrictions
In these instructions, do not use SP and do not use PC.
Condition flags
If S is specified, the MUL instruction:
• updates the N and Z flags according to the result
• does not affect the C and V flags.
Operation
The UMULL instruction interprets the values from Rn and Rm as unsigned integers. It multiplies
these integers and places the least significant 32 bits of the result in RdLo, and the most
significant 32 bits of the result in RdHi.
The UMLAL instruction interprets the values from Rn and Rm as unsigned integers. It multiplies
these integers, adds the 64-bit result to the 64-bit unsigned integer contained in RdHi and RdLo,
and writes the result back to RdHi and RdLo.
The SMULL instruction interprets the values from Rn and Rm as two’s complement signed
integers.
It multiplies these integers and places the least significant 32 bits of the result in RdLo and the
most significant 32 bits of the result in RdHi.
The SMLAL instruction interprets the values from Rn and Rm as two’s complement signed
integers. It multiplies these integers, adds the 64-bit result to the 64-bit signed integer contained
in RdHi and RdLo, and writes the result back to RdHi and RdLo.
Restrictions
In these instructions:
• do not use SP and do not use PC
• RdHi and RdLo must be different registers.
Condition flags
These instructions do not affect the condition code flags.
Examples
UMULL R0, R4, R5, R6 ; Unsigned (R4,R0) = R5 x R6
SMLAL R4, R5, R3, R8 ; Signed (R5,R4) = (R5,R4) + R3 x R8
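The 64-bit results of the long multiply instructions can be modelled in C with 64-bit integers. This is a sketch under stated assumptions: `umull` and `smlal` are hypothetical function names that mirror the mnemonics, and the register pair RdHi:RdLo is represented by two pointers.

```c
#include <stdint.h>

/* UMULL: (RdHi, RdLo) = Rn x Rm, all values unsigned. */
void umull(uint32_t rn, uint32_t rm, uint32_t *rdlo, uint32_t *rdhi)
{
    uint64_t product = (uint64_t)rn * rm;
    *rdlo = (uint32_t)product;
    *rdhi = (uint32_t)(product >> 32);
}

/* SMLAL: (RdHi, RdLo) = (RdHi, RdLo) + Rn x Rm, all values signed.
   The addition is done on unsigned 64-bit values so that it wraps like hardware. */
void smlal(int32_t rn, int32_t rm, uint32_t *rdlo, uint32_t *rdhi)
{
    uint64_t acc = ((uint64_t)*rdhi << 32) | *rdlo;
    acc += (uint64_t)((int64_t)rn * (int64_t)rm);
    *rdlo = (uint32_t)acc;
    *rdhi = (uint32_t)(acc >> 32);
}
```

The accumulate form is what makes SMLAL useful in multiply-accumulate loops such as dot products, where the running 64-bit sum stays in one register pair.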
Restrictions
Do not use SP and do not use PC
Condition flags
These instructions do not change the flags.
Examples
SDIV R0, R2, R4 ; Signed divide, R0 = R2/R4
UDIV R8, R8, R1 ; Unsigned divide, R8 = R8/R1.
Operation
BFC clears a bit field in a register. It clears width bits in Rd, starting at the low bit position lsb.
Other bits in Rd are unchanged.
BFI copies a bit field into one register from another register. It replaces width bits in Rd starting
at the low bit position lsb, with width bits from Rn starting at bit[0]. Other bits in Rd are
unchanged.
Restrictions
Do not use SP and do not use PC.
Condition flags
These instructions do not affect the flags.
Examples
BFC R4, #8, #12 ; Clear bit 8 to bit 19 (12 bits) of R4 to 0
; If R4 = 0x3456789F, then after execution R4 = 0x3450009F
BFI R9, R2, #8, #12 ; Replace bit 8 to bit 19 (12 bits) of R9 with bit 0 to bit 11 from R2.
; Before execution: R2 = 0xDCBC4533, R9 = 0x34221DBD
; After execution: R2 is unchanged, R9 = 0x342533BD
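The two bit field operations can be checked against the worked values above with a small C model. This is a sketch only; field_mask, bfc and bfi are hypothetical function names, and the model assumes 1 <= width and lsb + width <= 32.

```c
#include <stdint.h>

/* Mask with `width` one-bits starting at bit position `lsb`. */
static uint32_t field_mask(unsigned lsb, unsigned width) {
    uint32_t ones = (width >= 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return ones << lsb;
}

/* Models BFC Rd, #lsb, #width: clear the field, keep all other bits. */
static uint32_t bfc(uint32_t rd, unsigned lsb, unsigned width) {
    return rd & ~field_mask(lsb, width);
}

/* Models BFI Rd, Rn, #lsb, #width: copy the low `width` bits of Rn
 * into Rd starting at bit `lsb`, keep all other bits of Rd. */
static uint32_t bfi(uint32_t rd, uint32_t rn, unsigned lsb, unsigned width) {
    uint32_t mask = field_mask(lsb, width);
    return (rd & ~mask) | ((rn << lsb) & mask);
}
```

With the values from the examples, bfc(0x3456789F, 8, 12) gives 0x3450009F and bfi(0x34221DBD, 0xDCBC4533, 8, 12) gives 0x342533BD.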
Operation
SBFX extracts a bitfield from one register, sign extends it to 32 bits and writes the result to
destination register.
UBFX extracts a bitfield from one register, zero extends it to 32 bits and writes the result to the
destination register.
Restrictions
Do not use SP and do not use PC.
Condition flags
These instructions do not affect the flags.
Examples
SBFX R0, R1, #20, #4 ; Extract bit 20 to bit 23 (4 bits) from R1 and sign
; extend to 32 bits and then write the result to R0.
UBFX R8, R11, #8, #10 ; Extract bit 8 to bit 17 (10 bits) from R11 and zero
; extend to 32 bits and then write the result to R8.
; If R11 = 0x3496FBFF, then after execution R8 = 0x000002FB
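The difference between zero extension and sign extension of an extracted field can be sketched in C as follows. The function names ubfx and sbfx are hypothetical, the result is returned as a 32-bit pattern, and the model assumes 1 <= width and lsb + width <= 32. The source gives no sample value for the SBFX example, so the check below reuses the R11 value purely as an assumed input.

```c
#include <stdint.h>

/* Models UBFX Rd, Rn, #lsb, #width: extract the field and zero extend it. */
static uint32_t ubfx(uint32_t rn, unsigned lsb, unsigned width) {
    uint32_t ones = (width >= 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (rn >> lsb) & ones;
}

/* Models SBFX Rd, Rn, #lsb, #width: extract the field and sign extend it.
 * The top bit of the field is copied into all higher bits of the result. */
static uint32_t sbfx(uint32_t rn, unsigned lsb, unsigned width) {
    uint32_t field = ubfx(rn, lsb, width);
    uint32_t sign = 1u << (width - 1u);
    /* XOR then subtract propagates the field's sign bit upwards. */
    return (field ^ sign) - sign;
}
```

Here ubfx(0x3496FBFF, 8, 10) returns 0x000002FB, as in the UBFX example. Taking the same value as an assumed input, sbfx(0x3496FBFF, 20, 4) extracts the nibble 0x9, whose top bit is set, and returns 0xFFFFFFF9.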
Operation
These instructions do the following:
1. Rotate the value from Rm right by 0, 8, 16 or 24 bits.
2. Extract bits from the resulting value:
•SXTB extracts bits[7:0] and sign extends to 32 bits.
•UXTB extracts bits[7:0] and zero extends to 32 bits.
•SXTH extracts bits[15:0] and sign extends to 32 bits.
•UXTH extracts bits[15:0] and zero extends to 32 bits.
Restrictions
Do not use SP and do not use PC.
Condition flags
These instructions do not affect the flags.
Examples
SXTH R4, R6, ROR #16 ; Rotate R6 right by 16 bits, then obtain the lower half word of the
; result, sign extend it to 32 bits and write the result to R4.
; If R6 = 0x3496FBFF, then after execution R4 = 0x00003496
UXTB R3, R10 ; Extract lowest byte of the value in R10 and zero
; extend it to 32 bits, then write the result to R3.
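The rotate-then-extend behaviour of these instructions can be modelled in C as a rough sketch. The function names ror32, sxtb, uxtb, sxth and uxth are hypothetical; the rotation argument is expected to be 0, 8, 16 or 24, and results are returned as 32-bit patterns.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (n is reduced modulo 32). */
static uint32_t ror32(uint32_t x, unsigned n) {
    n &= 31u;
    return n ? ((x >> n) | (x << (32u - n))) : x;
}

/* Zero extending forms: keep only the low byte or half word. */
static uint32_t uxtb(uint32_t rm, unsigned rot) { return ror32(rm, rot) & 0xFFu; }
static uint32_t uxth(uint32_t rm, unsigned rot) { return ror32(rm, rot) & 0xFFFFu; }

/* Sign extending forms: copy bit 7 or bit 15 into the upper bits. */
static uint32_t sxtb(uint32_t rm, unsigned rot) {
    return (uxtb(rm, rot) ^ 0x80u) - 0x80u;
}
static uint32_t sxth(uint32_t rm, unsigned rot) {
    return (uxth(rm, rot) ^ 0x8000u) - 0x8000u;
}
```

For R6 = 0x3496FBFF, sxth(0x3496FBFF, 16) returns 0x00003496 as in the SXTH example, while sxth(0x3496FBFF, 0) returns 0xFFFFFBFF because bit 15 of the unrotated value is set.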
Condition flags
These instructions do not change the flags.
VS   Overflow                       V set
VC   No overflow                    V clear
HI   Unsigned higher                C set and Z clear
LS   Unsigned lower or same         C clear or Z set
GE   Signed greater than or equal   N set and V set, or N clear and V clear (N == V)
LT   Signed less than               N set and V clear, or N clear and V set (N != V)
GT   Signed greater than            Z clear, and either N set and V set, or N clear and V clear (Z == 0, N == V)
LE   Signed less than or equal      Z set, or N set and V clear, or N clear and V set (Z == 1 or N != V)
AL   Always (unconditional)         —
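The flag tests listed in these rows can be written out as a small C predicate. This is only an illustrative sketch: the type names flags and cond_code and the function cond_passed are hypothetical, and only the condition suffixes shown in the rows above are covered.

```c
#include <stdbool.h>

/* Snapshot of the APSR condition flags. */
typedef struct { bool n, z, c, v; } flags;

/* Only the suffixes listed in the rows above. */
typedef enum {
    COND_VS, COND_VC, COND_HI, COND_LS,
    COND_GE, COND_LT, COND_GT, COND_LE, COND_AL
} cond_code;

/* Returns true when a conditional instruction with suffix cc would execute. */
static bool cond_passed(cond_code cc, flags f) {
    switch (cc) {
    case COND_VS: return f.v;
    case COND_VC: return !f.v;
    case COND_HI: return f.c && !f.z;
    case COND_LS: return !f.c || f.z;
    case COND_GE: return f.n == f.v;
    case COND_LT: return f.n != f.v;
    case COND_GT: return !f.z && (f.n == f.v);
    case COND_LE: return f.z || (f.n != f.v);
    case COND_AL: return true;
    }
    return false;
}
```

As a usage example, CMP with operands 1 and 2 leaves N set and Z, C, V clear, so LT, LE and LS pass while GE, GT and HI fail.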
Examples
B loopA ; Branch to loopA
BLE ng ; Conditionally branch to label ng
B.W target ; Branch to target within 16MB range
BEQ target ; Conditionally branch to target
BL funC ; Branch with link (Call) to function funC, return address stored in LR
BX LR ; Return from function call
BXNE R0 ; Conditionally branch to address stored in R0
BLX R0 ; Branch with link and exchange (Call) to the address stored in R0.
Operation
Use the CBZ or CBNZ instructions to avoid changing the condition code flags and to reduce the
number of instructions.
CBZ Rn, label ---does not change condition flags but is otherwise equivalent to:
CMP Rn, #0
BEQ label
CBNZ Rn, label --- does not change condition flags but is otherwise equivalent to:
CMP Rn, #0
BNE label
Condition flags
These instructions do not change the flags.
Examples
CBZ R5, target ; Forward branch if R5 is zero
CBNZ R0,target ; Forward branch if R0 is not zero.
Syntax
IT{x{y{z}}} cond
where:
x specifies the condition switch for the second instruction in the IT block.
y specifies the condition switch for the third instruction in the IT block.
z specifies the condition switch for the fourth instruction in the IT block.
cond specifies the condition for the first instruction in the IT block.
IT<x><y><z> <cond> ; IT instruction (<x>, <y>, <z> can be T or E)
instr1<cond> <operands> ; 1st instruction (<cond> must be same as IT)
instr2<cond or not cond> <operands> ; 2nd instruction (can be <cond> or <!cond>)
instr3<cond or not cond> <operands> ; 3rd instruction (can be <cond> or <!cond>)
instr4<cond or not cond> <operands> ; 4th instruction (can be <cond> or <!cond>)
The structure of an IT block is “If-Then-(Else)”. The letters T and E that follow IT select the form, for example:
IT refers to If-Then (the next instruction is conditional)
ITT refers to If-Then-Then (the next two instructions are conditional)
ITE refers to If-Then-Else (the next two instructions are conditional)
ITTE refers to If-Then-Then-Else (the next three instructions are conditional)
ITTEE refers to If-Then-Then-Else-Else (the next four instructions are conditional)
The condition switch for the second, third and fourth instruction in the IT block can be either
T Then. Applies the condition cond to the instruction.
E Else. Applies the inverse condition of cond to the instruction.
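How the letters of the IT mnemonic map onto the instructions of the block can be sketched in C. The function it_slot_uses_cond is a hypothetical helper written for illustration; it only decodes the mnemonic text and does not model the processor.

```c
#include <stddef.h>
#include <string.h>

/* Decodes an IT mnemonic such as "ITTEE".
 * slot is the 0-based position of an instruction inside the IT block.
 * Returns  1 if that instruction uses cond (T),
 *          0 if it uses the inverse of cond (E),
 *         -1 if the mnemonic is malformed or slot is outside the block. */
static int it_slot_uses_cond(const char *mnemonic, int slot) {
    size_t len = strlen(mnemonic);
    size_t i;

    /* "IT" plus at most three switches x, y, z. */
    if (len < 2 || len > 5 || mnemonic[0] != 'I' || mnemonic[1] != 'T')
        return -1;
    for (i = 2; i < len; i++)
        if (mnemonic[i] != 'T' && mnemonic[i] != 'E')
            return -1;

    /* The block holds len - 1 instructions; the first always uses cond. */
    if (slot < 0 || (size_t)slot >= len - 1)
        return -1;
    return mnemonic[slot + 1] == 'T' ? 1 : 0;
}
```

For "ITTEE" the four slots decode to 1, 1, 0, 0: the first two instructions take cond and the last two take its inverse.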
Note
It is possible to use AL (the always condition) for cond in an IT instruction. If this is done, all
the instructions in the IT block must be unconditional, and each of x, y and z must be T or
omitted but not E.
Operation
The IT instruction makes up to four following instructions conditional. The conditions can be
all the same, or some of them can be the logical inverse of the others. The conditional
instructions following the IT instruction form the IT block.
The instructions in the IT block, including any branches, must specify the condition in the
{cond} part of their syntax.
A BKPT instruction in an IT block is always executed, even if its condition fails.
Restrictions
The following instructions are not permitted in an IT block:
•IT (nested IT)
• CBZ and CBNZ
• CPSID and CPSIE
• MOVS.N Rd, Rm.
A branch or any instruction that modifies the PC must either be outside an IT block or must
be the last instruction inside the IT block. These are:
if (R1<R2) then
R2=R2-R1
R2=R2/2
else
R1=R1-R2
R1=R1/2
In assembly,
CMP R1, R2 ; If R1 < R2 (less than)
ITTEE LT ; then execute instructions 1 and 2 (indicated by T)
; else execute instructions 3 and 4 (indicated by E)
SUBLT.W R2,R1 ; 1st instruction
LSRLT.W R2,#1 ; 2nd instruction
SUBGE.W R1,R2 ; 3rd instruction (notice the GE is opposite of LT)
LSRGE.W R1,#1 ; 4th instruction
If an exception occurs during the IT instruction block, the execution status of the block will be
stored in the stacked PSR (in the IT/Interrupt-Continuable Instruction [ICI] bit field). So, when
the exception handler completes and the IT block resumes, the rest of the instructions in the
block can continue the execution correctly. In the case of using multicycle instructions (for
example, multiple load and store) inside an IT block, if an exception takes place during the
execution, the whole instruction is abandoned and restarted after the interrupt process is
completed.
; Convert R0 hex value (0 to 15) into ASCII ('0'-'9', 'A'-'F')
CMP R0, #9 ; Compare the value with 9
ITE GT ; Next 2 instructions are conditional
ADDGT R1, R0, #55 ; Convert 0xA -> 'A'
ADDLE R1, R0, #48 ; Convert 0x0 -> '0'
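The same conversion can be expressed in C to make the two constants easier to follow. This is a sketch with a hypothetical function name, hex_digit_to_ascii, and it assumes the input is in the range 0 to 15 and that the character set is ASCII.

```c
/* Mirrors the ITE GT block: values above 9 take the "then" path (+55),
 * all other values take the "else" path (+48). */
static char hex_digit_to_ascii(unsigned r0) {
    if (r0 > 9u)
        return (char)(r0 + 55u); /* 10 + 55 = 65, the ASCII code of 'A' */
    return (char)(r0 + 48u);     /* 0 + 48 = 48, the ASCII code of '0' */
}
```

The offset 55 is simply 'A' minus 10, and 48 is the code of '0', so 0xA maps to 'A' and 0x0 maps to '0'.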
Syntax
TBB [Rn, Rm]
TBH [Rn, Rm, LSL #1]
where:
Rn Specifies the register containing the address of the table of branch lengths.
If Rn is PC, then the address of the table is the address of the byte immediately following the
TBB or TBH instruction.
Rm Specifies the index register. This contains an index into the table. For half word tables, LSL
#1 doubles the value in Rm to form the right offset into the table.
Operation
These instructions cause a PC-relative forward branch using a table of single byte offsets for
TBB, or half word offsets for TBH. Rn provides a pointer to the table, and Rm supplies an index
into the table. For TBB the branch offset is twice the unsigned value of the byte returned from
the table and for TBH the branch offset is twice the unsigned value of the half word returned
from the table. The branch occurs to the address at that offset from the address of the byte
immediately after the TBB or TBH instruction.
Restrictions
The restrictions are:
•Rn must not be SP
•Rm must not be SP and must not be PC
• when any of these instructions is used inside an IT block, it must be the last instruction of
the IT block.
Condition flags
These instructions do not change the flags.
Examples
1. ADR.W R0, BranchTable_Byte
TBB [R0, R1] ; R1 is the index, R0 is the base address of the
; branch table
Case1
; an instruction sequence follows
Case2
; an instruction sequence follows
Case3
; an instruction sequence follows
BranchTable_Byte
DCB 0 ; Case1 offset calculation
DCB ((Case2-Case1)/2) ; Case2 offset calculation
DCB ((Case3-Case1)/2) ; Case3 offset calculation
2. TBH [PC, R1, LSL #1] ; R1 is the index, PC is used as base of the
; branch table
BranchTable_H
DCI ((CaseA - BranchTable_H)/2) ; CaseA offset calculation
DCI ((CaseB - BranchTable_H)/2) ; CaseB offset calculation
DCI ((CaseC - BranchTable_H)/2) ; CaseC offset calculation
CaseA
; an instruction sequence follows
CaseB
; an instruction sequence follows
CaseC
; an instruction sequence follows
3. The table branch byte instruction loads a byte from (Rn + Rm) and adds twice its value to
the program counter.
TBB [PC,R0]
table dcb (case0 - table) >> 1 ; Divide by 2 here because the instruction multiplies by 2
dcb (case1 - table) >> 1
dcb (case2 - table) >> 1
align ; Align here because instructions must start at an even address
case0 nop ; If R0 = 0 we arrive here
case1 nop ; If R0 = 1 we arrive here
case2 nop ; If R0 = 2 we arrive here
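The branch target arithmetic described in the Operation text can be sketched in C. The function names tbb_target and tbh_target and the parameter pc_after (the address of the byte immediately after the TBB or TBH instruction) are hypothetical names chosen for this illustration; the addresses in the usage note are made up.

```c
#include <stdint.h>

/* Models TBB: the table holds single byte offsets.
 * Target = address after the instruction + 2 x table[index]. */
static uint32_t tbb_target(uint32_t pc_after, const uint8_t *table, uint32_t index) {
    return pc_after + 2u * (uint32_t)table[index];
}

/* Models TBH: the table holds half word offsets. Indexing a uint16_t
 * array corresponds to the LSL #1 applied to Rm in the instruction. */
static uint32_t tbh_target(uint32_t pc_after, const uint16_t *table, uint32_t index) {
    return pc_after + 2u * (uint32_t)table[index];
}
```

For example, with an assumed pc_after of 0x1004 and a byte table holding 0, 2 and 5, index 1 gives a target of 0x1008 and index 2 gives 0x100E. Because each stored offset is doubled, the table entries are written as the byte distance divided by 2, which is why the listings above shift right by 1 or divide by 2.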
1. BKPT Breakpoint.
Syntax
BKPT #imm
where:
imm is an expression evaluating to an integer in the range 0-255 (8-bit value).
Operation
The BKPT instruction causes the processor to enter Debug state. Debug tools can use this to
investigate system state when the instruction at a particular address is reached.
imm is ignored by the processor. If required, a debugger can use it to store additional information
about the breakpoint. The BKPT instruction can be placed inside an IT block, but it executes
unconditionally, unaffected by the condition specified by the IT instruction.
Condition flags
This instruction does not change the flags.
Examples
BKPT #0x3 ; Breakpoint with immediate value set to 0x3 (debugger can
; extract the immediate value by locating it using the PC)
2. CPS Change Processor State. CPS changes the PRIMASK and FAULTMASK special
register values.
Syntax
CPSeffect iflags
where:
effect Is one of:
IE Clears the special purpose register.
ID Sets the special purpose register.
iflags Selects the special purpose register to change: i for PRIMASK, f for FAULTMASK.
Restrictions
The restrictions are:
•Use CPS only from privileged software; it has no effect if used in unprivileged software.
•CPS cannot be conditional and so must not be used inside an IT block.
Condition flags
This instruction does not change the condition flags.
Examples
CPSID i ; Disable interrupts and configurable fault handlers (set PRIMASK)
CPSID f ; Disable interrupts and all fault handlers (set FAULTMASK)
CPSIE i ; Enable interrupts and configurable fault handlers (clear PRIMASK)
CPSIE f ; Enable interrupts and fault handlers (clear FAULTMASK).
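The effect of the four CPS forms on the two mask registers can be summarised with a small C model. This is a sketch under stated assumptions: the struct mask_regs and the function cps are hypothetical, each register is reduced to a single bit value, and the privileged parameter models the restriction that CPS has no effect in unprivileged software.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified view of the two special purpose mask registers. */
typedef struct {
    uint32_t primask;
    uint32_t faultmask;
} mask_regs;

/* disable = true models CPSID (set), disable = false models CPSIE (clear).
 * i_flag and f_flag select PRIMASK and FAULTMASK respectively. */
static mask_regs cps(mask_regs m, bool disable, bool i_flag, bool f_flag, bool privileged) {
    uint32_t value = disable ? 1u : 0u;
    if (!privileged)
        return m; /* no effect in unprivileged software */
    if (i_flag)
        m.primask = value;
    if (f_flag)
        m.faultmask = value;
    return m;
}
```

In this model CPSID i corresponds to cps(m, true, true, false, true), which sets PRIMASK to 1 and leaves FAULTMASK unchanged.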
DSB is useful when the memory map is switched by writing to a hardware register. After the
register is written, DSB should be used; otherwise, if the next instruction is a memory
read and the switch is still incomplete, the read will access the old memory map. DSB makes the
following instructions wait until the previous instruction (the switch) is complete.
Condition flags
This instruction does not change the flags.
Examples
DSB ; Data Synchronisation Barrier
6. MRS
Move the contents of a special register to a general-purpose register.
Syntax
MRS{cond} Rd, spec_reg
where:
cond Is an optional condition code.
Rd Specifies the destination register.
spec_reg can be any of: APSR, IPSR, EPSR, MSP, PSP, PRIMASK, BASEPRI, FAULTMASK,
or CONTROL.
Note
All the EPSR and IPSR fields are zero when read by the MRS instruction.
Operation
Use MRS in combination with MSR as part of a read-modify-write sequence for updating a PSR,
for example to clear the Q flag.
Restrictions
Rd must not be SP and must not be PC.
Condition flags
This instruction does not change the flags.
Examples
MRS R0, PRIMASK ; Read PRIMASK value and write it to R0.
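The read-modify-write sequence mentioned in the Operation text, used to clear the Q flag, can be sketched in C. The sketch models only the "modify" step between the MRS read and the MSR write; the names APSR_Q_BIT, apsr_q_is_set and apsr_clear_q are hypothetical, and the Q flag is taken to be bit 27 of the APSR.

```c
#include <stdint.h>

/* The Q (saturation) flag sits at bit 27 of the APSR. */
#define APSR_Q_BIT (1u << 27)

/* Inspect the value obtained from MRS Rd, APSR. */
static int apsr_q_is_set(uint32_t apsr) {
    return (apsr & APSR_Q_BIT) != 0u;
}

/* Value to write back with MSR APSR, Rn: Q cleared, other flags kept. */
static uint32_t apsr_clear_q(uint32_t apsr) {
    return apsr & ~APSR_Q_BIT;
}
```

For an APSR value of 0xF8000000 (N, Z, C, V and Q all set), apsr_clear_q returns 0xF0000000, so only the Q flag changes.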
7. MSR
Move the contents of a general-purpose register into the specified special register.
Syntax
MSR{cond} spec_reg, Rn
where:
cond Is an optional condition code.
Rn Specifies the source register.
spec_reg can be any of: APSR, IPSR, EPSR, MSP, PSP, PRIMASK, BASEPRI, FAULTMASK,
or CONTROL.
Operation
The register access operation in MSR depends on the privilege level. Unprivileged software can
only access the APSR. Privileged software can access all special registers.
In unprivileged software writes to unallocated or execution state bits in the PSR are ignored.
Restrictions
Rn must not be SP and must not be PC.
Condition flags
This instruction updates the flags explicitly based on the value in Rn.
Examples
MSR CONTROL, R1 ; Read R1 value and write it to the CONTROL register.
8. NOP
No Operation.
Syntax
NOP{cond}
where:
cond Is an optional condition code.
Operation
NOP does nothing. NOP is not necessarily a time-consuming NOP. The processor might remove
it from the pipeline before it reaches the execution stage.
Use NOP for padding, for example to adjust the alignment of a following instruction.
Condition flags
This instruction does not change the flags.
Examples
NOP ; No operation
Condition flags
This instruction does not change the flags.
Examples
SEV ; Send Event
Condition flags
This instruction does not change the flags.
Examples
WFE ; Wait For Event
Saturating instructions
Saturation is said to occur when the returned result is different from the value to be saturated. If
saturation occurs, the instruction sets the Q flag to 1 in the APSR. Otherwise, it leaves the Q flag
unchanged. To clear the Q flag to 0, you must use the MSR instruction.
To read the state of the Q flag, use the MRS instruction.
Restrictions
Do not use SP and do not use PC.
Condition flags
These instructions do not affect the condition code flags.
If saturation occurs, these instructions set the Q flag to 1.
Examples
SSAT R7, #16, R7, LSL #4 ; Logical shift left value in R7 by 4, then
; saturate it as a signed 16-bit value and
; write it back to R7
USATNE R0, #7, R5 ; Conditionally saturate value in R5 as an
; unsigned 7 bit value and write it to R0.
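The saturation rule and the condition under which the Q flag would be set can be modelled in C. This is a minimal sketch: sat_result, ssat and usat are hypothetical names, the optional shift is assumed to have been applied to the input already, and the bit position n is assumed to be 1 to 32 for ssat and 0 to 31 for usat.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result value plus a marker that is true when saturation occurred,
 * which is the case in which the instruction sets the Q flag. */
typedef struct {
    int32_t value;
    bool saturated;
} sat_result;

/* Shared clamp: limit x to the range [min, max]. */
static sat_result clamp_range(int32_t x, int64_t min, int64_t max) {
    sat_result r = { x, false };
    if (x > max) {
        r.value = (int32_t)max;
        r.saturated = true;
    } else if (x < min) {
        r.value = (int32_t)min;
        r.saturated = true;
    }
    return r;
}

/* Models SSAT Rd, #n, Rm: signed range -2^(n-1) to 2^(n-1) - 1. */
static sat_result ssat(int32_t x, unsigned n) {
    int64_t limit = (int64_t)1 << (n - 1u);
    return clamp_range(x, -limit, limit - 1);
}

/* Models USAT Rd, #n, Rm: unsigned range 0 to 2^n - 1. */
static sat_result usat(int32_t x, unsigned n) {
    return clamp_range(x, 0, ((int64_t)1 << n) - 1);
}
```

For example, ssat(65536, 16) returns 32767 with saturated set, while usat(100, 7) returns 100 unchanged. Since the Q flag is left unchanged when no saturation occurs, a caller tracking it would OR the saturated marker into its copy of Q rather than overwrite it.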