Chapter 4
Arithmetic & Logic
Embedded Systems with ARM Cortex-M
Updated: Wednesday, February 7, 2018
Overview: Arithmetic and Logic Instructions
• Shift
• LSL (logical shift left), LSR (logical shift right), ASR (arithmetic shift right), ROR (rotate right), RRX (rotate right with extend)
• Logic
• AND (bitwise and), ORR (bitwise or), EOR (bitwise exclusive or), ORN (bitwise or not), MVN (move not)
• Bit set/clear
• BFC (bit field clear), BFI (bit field insert), BIC (bit clear), CLZ (count leading zeroes)
• Bit/byte reordering
• RBIT (reverse bit order in a word), REV (reverse byte order in a word), REV16 (reverse byte order in each half-word independently), REVSH (reverse byte order in bottom half-word and sign extend)
• Addition
• ADD, ADC (add with carry)
• Subtraction
• SUB, RSB (reverse subtract), SBC (subtract with carry)
• Multiplication
• MUL (multiply), MLA (multiply-accumulate), MLS (multiply-subtract), SMULL (signed long multiply), SMLAL (signed long multiply-accumulate), UMULL (unsigned long multiply), UMLAL (unsigned long multiply-accumulate)
• Division
• SDIV (signed), UDIV (unsigned)
• Saturation
• SSAT (signed), USAT (unsigned)
• Sign and zero extension
• SXTB (sign-extend byte), SXTH (sign-extend half-word), UXTB (zero-extend byte), UXTH (zero-extend half-word)
• Bit field extract
• SBFX (signed), UBFX (unsigned)
• Syntax
• <Operation>{<cond>}{S} Rd, Rn, Operand2
ADD {Rd,} Rn, Op2             Add. Rd ← Rn + Op2
ADC {Rd,} Rn, Op2             Add with carry. Rd ← Rn + Op2 + Carry
SUB {Rd,} Rn, Op2             Subtract. Rd ← Rn - Op2
SBC {Rd,} Rn, Op2             Subtract with carry. Rd ← Rn - Op2 + Carry - 1
RSB {Rd,} Rn, Op2             Reverse subtract. Rd ← Op2 - Rn
MUL {Rd,} Rn, Rm              Multiply. Rd ← (Rn × Rm)[31:0]
MLA Rd, Rn, Rm, Ra            Multiply with accumulate. Rd ← (Ra + (Rn × Rm))[31:0]
MLS Rd, Rn, Rm, Ra            Multiply and subtract. Rd ← (Ra - (Rn × Rm))[31:0]
SDIV {Rd,} Rn, Rm             Signed divide. Rd ← Rn / Rm
UDIV {Rd,} Rn, Rm             Unsigned divide. Rd ← Rn / Rm
SSAT Rd, #n, Rm {,shift #s}   Signed saturate
USAT Rd, #n, Rm {,shift #s}   Unsigned saturate
UMULL RdLo, RdHi, Rn, Rm      Unsigned long multiply. RdHi,RdLo ← unsigned(Rn × Rm)
SMULL RdLo, RdHi, Rn, Rm      Signed long multiply. RdHi,RdLo ← signed(Rn × Rm)
UMLAL RdLo, RdHi, Rn, Rm      Unsigned multiply with accumulate. RdHi,RdLo ← unsigned(RdHi,RdLo + Rn × Rm)
SMLAL RdLo, RdHi, Rn, Rm      Signed multiply with accumulate. RdHi,RdLo ← signed(RdHi,RdLo + Rn × Rm)

AND {Rd,} Rn, Op2             Bitwise logic AND. Rd ← Rn & operand2
ORR {Rd,} Rn, Op2             Bitwise logic OR. Rd ← Rn | operand2
EOR {Rd,} Rn, Op2             Bitwise logic exclusive OR. Rd ← Rn ^ operand2
ORN {Rd,} Rn, Op2             Bitwise logic NOT OR. Rd ← Rn | (NOT operand2)
BIC {Rd,} Rn, Op2             Bit clear. Rd ← Rn & NOT operand2
BFC Rd, #lsb, #width          Bit field clear. Rd[(width+lsb-1):lsb] ← 0
BFI Rd, Rn, #lsb, #width      Bit field insert. Rd[(width+lsb-1):lsb] ← Rn[(width-1):0]
MVN Rd, Op2                   Move NOT, logically negate all bits. Rd ← 0xFFFFFFFF EOR Op2
RBIT Rd, Rn                   Reverse bit order in a word.
                              for (i = 0; i < 32; i++) Rd[i] ← Rn[31 - i]
REV Rd, Rn                    Reverse byte order in a word.
                              Rd[31:24] ← Rn[7:0], Rd[23:16] ← Rn[15:8],
                              Rd[15:8] ← Rn[23:16], Rd[7:0] ← Rn[31:24]
REV16 Rd, Rn                  Reverse byte order in each half-word.
                              Rd[15:8] ← Rn[7:0], Rd[7:0] ← Rn[15:8],
                              Rd[31:24] ← Rn[23:16], Rd[23:16] ← Rn[31:24]
REVSH Rd, Rn                  Reverse byte order in bottom half-word and sign extend.
                              Rd[15:8] ← Rn[7:0], Rd[7:0] ← Rn[15:8],
                              Rd[31:16] ← Rn[7] replicated 16 times
Data Movement
MOV Rd, Op2         Rd ← operand2
MVN Rd, Op2         Rd ← NOT operand2
MRS Rd, spec_reg Move from special register to general register
MSR spec_reg, Rm Move from general register to special register
MOV r4, r5 ; Copy r5 to r4
MVN r4, r5 ; r4 = bitwise logical NOT of r5
MOV r1, r2, LSL #3 ; r1 = r2 << 3
MOV r0, PC ; Copy PC (r15) to r0
MOV r1, SP ; Copy SP (r13) to r1
Commonly Used Arithmetic Operations
ADD {Rd,} Rn, Op2             Add. Rd ← Rn + Op2
ADC {Rd,} Rn, Op2             Add with carry. Rd ← Rn + Op2 + Carry
SUB {Rd,} Rn, Op2             Subtract. Rd ← Rn - Op2
SBC {Rd,} Rn, Op2             Subtract with carry. Rd ← Rn - Op2 + Carry - 1
RSB {Rd,} Rn, Op2             Reverse subtract. Rd ← Op2 - Rn
MUL {Rd,} Rn, Rm              Multiply. Rd ← (Rn × Rm)[31:0]
MLA Rd, Rn, Rm, Ra            Multiply with accumulate. Rd ← (Ra + (Rn × Rm))[31:0]
MLS Rd, Rn, Rm, Ra            Multiply and subtract. Rd ← (Ra - (Rn × Rm))[31:0]
SDIV {Rd,} Rn, Rm             Signed divide. Rd ← Rn / Rm
UDIV {Rd,} Rn, Rm             Unsigned divide. Rd ← Rn / Rm
SSAT Rd, #n, Rm {,shift #s}   Signed saturate
USAT Rd, #n, Rm {,shift #s}   Unsigned saturate
Example: Add
• Unified Assembler Language (UAL) Syntax
ADD r1, r2, r3 ; r1 = r2 + r3
ADD r1, r2, #4 ; r1 = r2 + 4
• Traditional Thumb Syntax
ADD r1, r3 ; r1 = r1 + r3
ADD r1, #15 ; r1 = r1 + 15
Example: S Suffix (Set Condition Flags)
start
LDR r0, =0xFFFFFFFF
LDR r1, =0x00000001
ADDS r0, r0, r1
stop B stop
• For most instructions, we can add a suffix S to update the NZCV bits of the APSR register.
• In this example, the Z and C bits are set: 0xFFFFFFFF + 1 wraps around to 0 (Z = 1) and produces a carry out of bit 31 (C = 1).
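The flag update performed by ADDS can be modelled in C. This is a minimal sketch, not a description of the hardware: the `Flags` type and the `adds` function are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper type: the four APSR condition flags. */
typedef struct {
    unsigned n, z, c, v;
} Flags;

/* Model of ADDS: returns (a + b) mod 2^32 and fills in N, Z, C, V. */
uint32_t adds(uint32_t a, uint32_t b, Flags *f)
{
    uint32_t r = a + b;                          /* wraps modulo 2^32 */
    f->n = r >> 31;                              /* N: bit 31 of the result */
    f->z = (r == 0);                             /* Z: result is zero */
    f->c = (r < a);                              /* C: carry out of bit 31 */
    f->v = ((~(a ^ b) & (a ^ r)) >> 31) & 1u;    /* V: signed overflow */
    return r;
}
```

Calling `adds(0xFFFFFFFF, 0x00000001, &f)` reproduces the slide: the result is 0, with Z and C set and N and V clear.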
Processor Registers
• 16 processor registers: 13 for general purpose, 3 for specific purposes
• Fastest way to read and write
• Registers are within the processor chip
• A register stores a 32-bit value
• Cortex-M (STM32L) has
• R0-R12: 13 general-purpose registers (R0-R7 are the low registers, R8-R12 the high registers)
• R13: Stack pointer (SP), shadow of MSP or PSP
• R14: Link register (LR)
• R15: Program counter (PC)
• Special registers (xPSR, BASEPRI, PRIMASK, etc.), more later
[Figure: ARM registers and ALU. 32-bit general-purpose registers R0-R12, R13 (SP, banked as MSP and PSP), R14 (LR), R15 (PC), and the 32-bit special-purpose registers xPSR, BASEPRI, PRIMASK, FAULTMASK, CONTROL]
Program Status Register
• Application PSR (APSR), Interrupt PSR (IPSR), Execution PSR (EPSR)
APSR   N [31]  Z [30]  C [29]  V [28]  Q [27]  Reserved [26:20]  GE [19:16]  Reserved [15:0]
IPSR   Reserved [31:9]  ISR number [8:0]
EPSR   Reserved [31:27]  ICI/IT [26:25]  T [24]  Reserved [23:16]  ICI/IT [15:10]  Reserved [9:0]
Note:
• GE flags are only available on Cortex-M4 and M7
Program Status Register
• APSR, IPSR, and EPSR can be combined and accessed as one register (PSR)
PSR   N | Z | C | V | Q | ICI/IT | T | Reserved | GE | Reserved | ICI/IT | ISR number
Note:
• GE flags are only available on Cortex-M4 and M7
• Use PSR in code
Example: 64-bit Addition
              Most-significant (upper) 32 bits    Least-significant (lower) 32 bits
A             0x00000002                          0xFFFFFFFF
B           + 0x00000004                          0x00000001
C = A + B     0x00000007                          0x00000000
• The carry out of the lower 32-bit addition is added into the upper 32 bits
• A register can only store 32 bits
• A 64-bit integer needs two registers
• Split 64-bit addition into two 32-bit additions
Example: 64-bit Addition
start
; C = A + B
; Two 64-bit integers A (r1,r0) and B (r3, r2).
; Result C (r5, r4)
; A = 00000002FFFFFFFF
; B = 0000000400000001
LDR r0, =0xFFFFFFFF ; A’s lower 32 bits
LDR r1, =0x00000002 ; A’s upper 32 bits
LDR r2, =0x00000001 ; B’s lower 32 bits
LDR r3, =0x00000004 ; B’s upper 32 bits
; Add A and B
ADDS r4, r2, r0 ; C[31..0] = A[31..0] + B[31..0], update Carry
ADC r5, r3, r1 ; C[63..32] = A[63..32] + B[63..32] + Carry
stop B stop
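The ADDS/ADC pair above can be mirrored in C by splitting each 64-bit operand into two 32-bit words. This is only a sketch under that assumption; `add64` and its parameter names are hypothetical.

```c
#include <stdint.h>

/* Adds two 64-bit values held as (hi, lo) pairs of 32-bit words,
   mirroring ADDS on the lower words followed by ADC on the upper words. */
void add64(uint32_t a_hi, uint32_t a_lo,
           uint32_t b_hi, uint32_t b_lo,
           uint32_t *c_hi, uint32_t *c_lo)
{
    uint32_t lo = a_lo + b_lo;        /* ADDS: lower 32 bits, wraps mod 2^32 */
    uint32_t carry = (lo < a_lo);     /* carry out of the lower addition */
    *c_lo = lo;
    *c_hi = a_hi + b_hi + carry;      /* ADC: upper 32 bits plus carry */
}
```

With A = 0x00000002FFFFFFFF and B = 0x0000000400000001, the lower addition wraps to 0 and sets the carry, so the upper word becomes 2 + 4 + 1 = 7.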
Example: 64-bit Subtraction
start
; C = A - B
; Two 64-bit integers A (r1,r0) and B (r3, r2).
; Result C (r5, r4)
; A = 00000002FFFFFFFF
; B = 0000000400000001
LDR r0, =0xFFFFFFFF ; A’s lower 32 bits
LDR r1, =0x00000002 ; A’s upper 32 bits
LDR r2, =0x00000001 ; B’s lower 32 bits
LDR r3, =0x00000004 ; B’s upper 32 bits
; Subtract B from A
SUBS r4, r0, r2 ; C[31..0] = A[31..0] - B[31..0], update Carry (Carry = 0 if a borrow occurred)
SBC r5, r1, r3 ; C[63..32] = A[63..32] - B[63..32] + Carry - 1
stop B stop
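On ARM, a subtraction sets Carry to 1 when no borrow occurs, and SBC computes Rn - Op2 + Carry - 1. The sketch below models that convention in C; `sub64` is a hypothetical name used only for illustration.

```c
#include <stdint.h>

/* Subtracts B from A, both held as (hi, lo) pairs of 32-bit words,
   mirroring SUBS on the lower words followed by SBC on the upper words. */
void sub64(uint32_t a_hi, uint32_t a_lo,
           uint32_t b_hi, uint32_t b_lo,
           uint32_t *c_hi, uint32_t *c_lo)
{
    uint32_t carry = (a_lo >= b_lo);      /* SUBS: C = 1 means no borrow */
    *c_lo = a_lo - b_lo;                  /* SUBS: lower 32 bits */
    *c_hi = a_hi - b_hi + carry - 1u;     /* SBC: Rn - Op2 + Carry - 1 */
}
```

For the values on the slide, the lower subtraction does not borrow, so the result is 0xFFFFFFFE in both words, which is the two's-complement form of the negative difference.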
Example: Long Multiplication
UMULL RdLo, RdHi, Rn, Rm      Unsigned long multiply. RdHi,RdLo ← unsigned(Rn × Rm)
SMULL RdLo, RdHi, Rn, Rm      Signed long multiply. RdHi,RdLo ← signed(Rn × Rm)
UMLAL RdLo, RdHi, Rn, Rm      Unsigned multiply with accumulate. RdHi,RdLo ← unsigned(RdHi,RdLo + Rn × Rm)
SMLAL RdLo, RdHi, Rn, Rm      Signed multiply with accumulate. RdHi,RdLo ← signed(RdHi,RdLo + Rn × Rm)
UMULL r3, r4, r0, r1 ; r4:r3 = r0 × r1, r4 = upper 32 bits, r3 = lower 32 bits
SMULL r3, r4, r0, r1 ; r4:r3 = r0 × r1
UMLAL r3, r4, r0, r1 ; r4:r3 = r4:r3 + r0 × r1
SMLAL r3, r4, r0, r1 ; r4:r3 = r4:r3 + r0 × r1
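The difference between the unsigned and signed long multiplies shows up only in the upper word. The following C sketch models UMULL and SMULL with a 64-bit intermediate product; the function names are hypothetical and chosen to match the mnemonics.

```c
#include <stdint.h>

/* Model of UMULL: 32 x 32 -> 64-bit unsigned product split into two words. */
void umull(uint32_t rn, uint32_t rm, uint32_t *rd_lo, uint32_t *rd_hi)
{
    uint64_t p = (uint64_t)rn * rm;
    *rd_lo = (uint32_t)p;             /* lower 32 bits */
    *rd_hi = (uint32_t)(p >> 32);     /* upper 32 bits */
}

/* Model of SMULL: the operands are treated as signed 32-bit values. */
void smull(int32_t rn, int32_t rm, uint32_t *rd_lo, uint32_t *rd_hi)
{
    uint64_t p = (uint64_t)((int64_t)rn * rm);
    *rd_lo = (uint32_t)p;             /* lower 32 bits */
    *rd_hi = (uint32_t)(p >> 32);     /* upper 32 bits, sign included */
}
```

For example, the bit pattern 0xFFFFFFFF times 2 gives an upper word of 1 when treated as unsigned, but 0xFFFFFFFF when treated as the signed value -1; the lower word is 0xFFFFFFFE in both cases.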
Example: Short Multiplication and Division
; MUL: Multiply (the lower 32 bits are the same for signed and unsigned operands)
MUL r6, r4, r2 ; r6 = LSB32( r4 × r2 )
; MLA: Multiply with accumulation
MLA r6, r4, r1, r0 ; r6 = LSB32( r0 + r4 × r1 )
; MLS: Multiply with subtract
MLS r6, r4, r1, r0 ; r6 = LSB32( r0 - r4 × r1 )
; SDIV, UDIV: Signed and unsigned divide
SDIV r6, r4, r2 ; r6 = r4 / r2 (signed)
UDIV r6, r4, r2 ; r6 = r4 / r2 (unsigned)
Bitwise Logic
AND {Rd,} Rn, Op2             Bitwise logic AND. Rd ← Rn & operand2
ORR {Rd,} Rn, Op2             Bitwise logic OR. Rd ← Rn | operand2
EOR {Rd,} Rn, Op2             Bitwise logic exclusive OR. Rd ← Rn ^ operand2
ORN {Rd,} Rn, Op2             Bitwise logic NOT OR. Rd ← Rn | (NOT operand2)
BIC {Rd,} Rn, Op2             Bit clear. Rd ← Rn & NOT operand2
BFC Rd, #lsb, #width          Bit field clear. Rd[(width+lsb-1):lsb] ← 0
BFI Rd, Rn, #lsb, #width      Bit field insert. Rd[(width+lsb-1):lsb] ← Rn[(width-1):0]
MVN Rd, Op2                   Move NOT, logically negate all bits. Rd ← 0xFFFFFFFF EOR Op2
Example: AND r2, r0, r1
32 bits
r0 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
r1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 1
r2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Bit-wise Logic AND
Example: ORR r2, r0, r1
32 bits
r0 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
r1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 1
r2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Bit-wise Logic OR
Example: BIC r2, r0, r1
Bit Clear
r2 = r0 & NOT r1
Step 1:
r1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
NOT r1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
Step 2:
r0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
NOT r1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
r2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
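The AND, ORR, and BIC examples can be checked with the equivalent C operators. In this sketch the bit patterns from the slides are written in hexadecimal (r0 = 0xD5555555 and r1 = 0xAAABAAAB for AND and ORR); the function names are hypothetical wrappers around the C operators.

```c
#include <stdint.h>

uint32_t and_op(uint32_t rn, uint32_t op2) { return rn & op2; }    /* AND */
uint32_t orr_op(uint32_t rn, uint32_t op2) { return rn | op2; }    /* ORR */
uint32_t eor_op(uint32_t rn, uint32_t op2) { return rn ^ op2; }    /* EOR */
uint32_t orn_op(uint32_t rn, uint32_t op2) { return rn | ~op2; }   /* ORN */
uint32_t bic_op(uint32_t rn, uint32_t op2) { return rn & ~op2; }   /* BIC */
```

Here `and_op(0xD5555555, 0xAAABAAAB)` yields 0x80010001, `orr_op` on the same inputs yields 0xFFFFFFFF, and `bic_op(0xFFFFFFFF, 0x0000000F)` yields 0xFFFFFFF0, matching the three bit diagrams.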
Example: BFC and BFI
• Bit Field Clear (BFC) and Bit Field Insert (BFI).
• Syntax
• BFC Rd, #lsb, #width
• BFI Rd, Rn, #lsb, #width
• Examples:
BFC R4, #8, #12
; Clear bit 8 to bit 19 (12 bits) of R4 to 0
BFI R9, R2, #8, #12
; Replace bit 8 to bit 19 (12 bits) of R9 with bit 0 to bit 11 from R2.
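BFC and BFI can be expressed in C with a mask built from lsb and width. The following is a minimal sketch of that idea, assuming 1 <= width and lsb + width <= 32; `field_mask`, `bfc`, and `bfi` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with `width` ones starting at bit `lsb`. */
uint32_t field_mask(unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 32);
    uint32_t ones = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return ones << lsb;
}

/* Model of BFC Rd, #lsb, #width: clear the selected field of rd. */
uint32_t bfc(uint32_t rd, unsigned lsb, unsigned width)
{
    return rd & ~field_mask(lsb, width);
}

/* Model of BFI Rd, Rn, #lsb, #width: copy rn[width-1:0] into the field. */
uint32_t bfi(uint32_t rd, uint32_t rn, unsigned lsb, unsigned width)
{
    uint32_t mask = field_mask(lsb, width);
    return (rd & ~mask) | ((rn << lsb) & mask);
}
```

With lsb = 8 and width = 12 as in the examples above, `bfc(0xFFFFFFFF, 8, 12)` gives 0xFFF000FF, and `bfi(0xFFFFFFFF, 0x00000ABC, 8, 12)` gives 0xFFFABCFF.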
Bit Operators (&, |, ~) vs Boolean Operators (&&, ||, !)
Boolean operators            Bitwise operators
A && B   Boolean and         A & B   Bitwise and
A || B   Boolean or          A | B   Bitwise or
!B       Boolean not         ~B      Bitwise not
• The Boolean operators treat each operand as a single true/false value (zero is false, any nonzero value is true) and yield 0 or 1; they do not operate bit by bit.
• For example,
• “0x10 & 0x01” = 0x00, but “0x10 && 0x01” = 0x01.
• “~0x01” = 0xFFFFFFFE, but “!0x01” = 0x00.
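The two examples above can be reproduced directly in C. This sketch assumes 32-bit unsigned operands so that the bitwise NOT has a fixed width; the wrapper function names are hypothetical.

```c
#include <stdint.h>

uint32_t bit_and(uint32_t a, uint32_t b)  { return a & b; }   /* bit by bit */
int      bool_and(uint32_t a, uint32_t b) { return a && b; }  /* 0 or 1 */
uint32_t bit_not(uint32_t b)              { return ~b; }      /* flips all 32 bits */
int      bool_not(uint32_t b)             { return !b; }      /* 0 or 1 */
```

`bit_and(0x10, 0x01)` is 0 because no bit position is set in both operands, while `bool_and(0x10, 0x01)` is 1 because both operands are nonzero.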
Reverse Order
RBIT Rd, Rn     Reverse bit order in a word.
                for (i = 0; i < 32; i++) Rd[i] ← Rn[31 - i]
REV Rd, Rn      Reverse byte order in a word.
                Rd[31:24] ← Rn[7:0], Rd[23:16] ← Rn[15:8],
                Rd[15:8] ← Rn[23:16], Rd[7:0] ← Rn[31:24]
REV16 Rd, Rn    Reverse byte order in each half-word.
                Rd[15:8] ← Rn[7:0], Rd[7:0] ← Rn[15:8],
                Rd[31:24] ← Rn[23:16], Rd[23:16] ← Rn[31:24]
REVSH Rd, Rn    Reverse byte order in bottom half-word and sign extend.
                Rd[15:8] ← Rn[7:0], Rd[7:0] ← Rn[15:8],
                Rd[31:16] ← Rn[7] replicated 16 times
[Figure: RBIT Rd, Rn. Each bit i of Rn is copied to bit 31 - i of Rd]
Example:
LDR r0, =0x12345678 ; r0 = 0x12345678
RBIT r1, r0 ; Reverse bits, r1 = 0x1E6A2C48
Reverse Order
[Figure: REV Rd, Rn. The four bytes of Rn are written to Rd in reverse order]
Example:
LDR R0, =0x12345678
REV R1, R0 ; R1 = 0x78563412
Reverse Order
[Figure: REV16 Rd, Rn. The two bytes within each half-word of Rn are swapped]
Example:
LDR R0, =0x12345678
REV16 R2, R0 ; R2 = 0x34127856
Reverse Order
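[Figure: REVSH Rd, Rn. The two bytes of the bottom half-word of Rn are swapped and the result is sign-extended to 32 bits]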
Example:
LDR R0, =0x33448899
REVSH R1, R0 ; R1 = 0xFFFF9988
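The four reverse instructions can be modelled in C with shifts and masks. This is an illustrative sketch of the definitions in the table, not how the hardware implements them; the function names are hypothetical and follow the mnemonics.

```c
#include <stdint.h>

/* RBIT: Rd[i] = Rn[31 - i] for every bit position i. */
uint32_t rbit(uint32_t rn)
{
    uint32_t rd = 0;
    for (int i = 0; i < 32; i++)
        rd |= ((rn >> (31 - i)) & 1u) << i;
    return rd;
}

/* REV: reverse the order of the four bytes. */
uint32_t rev(uint32_t rn)
{
    return (rn << 24) | ((rn & 0x0000FF00u) << 8) |
           ((rn >> 8) & 0x0000FF00u) | (rn >> 24);
}

/* REV16: swap the two bytes inside each half-word. */
uint32_t rev16(uint32_t rn)
{
    return ((rn & 0x00FF00FFu) << 8) | ((rn >> 8) & 0x00FF00FFu);
}

/* REVSH: swap the bytes of the bottom half-word, then sign-extend bit 15. */
uint32_t revsh(uint32_t rn)
{
    uint32_t swapped = ((rn & 0xFFu) << 8) | ((rn >> 8) & 0xFFu);
    return (swapped & 0x8000u) ? (swapped | 0xFFFF0000u) : swapped;
}
```

Feeding in the values from the four examples (0x12345678 for RBIT, REV, and REV16; 0x33448899 for REVSH) gives 0x1E6A2C48, 0x78563412, 0x34127856, and 0xFFFF9988 respectively.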
Sign and Zero Extension
SXTB {Rd,} Rm {,ROR #n}     Sign extend a byte. Rd[31:0] ← Sign Extend((Rm ROR n)[7:0])
SXTH {Rd,} Rm {,ROR #n}     Sign extend a half-word. Rd[31:0] ← Sign Extend((Rm ROR n)[15:0])
UXTB {Rd,} Rm {,ROR #n}     Zero extend a byte. Rd[31:0] ← Zero Extend((Rm ROR n)[7:0])
UXTH {Rd,} Rm {,ROR #n}     Zero extend a half-word. Rd[31:0] ← Zero Extend((Rm ROR n)[15:0])
The optional rotation n is 8, 16, or 24; if it is omitted, Rm is not rotated.
LDR R0, =0x55AA8765
SXTB R1, R0 ; R1 = 0x00000065
SXTH R1, R0 ; R1 = 0xFFFF8765
UXTB R1, R0 ; R1 = 0x00000065
UXTH R1, R0 ; R1 = 0x00008765
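The four extension instructions (without the optional rotation) can be sketched in C as a mask followed by an optional fill of the upper bits. The function names are hypothetical and follow the mnemonics.

```c
#include <stdint.h>

/* UXTB / UXTH: keep the low byte or half-word, clear the upper bits. */
uint32_t uxtb(uint32_t rm) { return rm & 0xFFu; }
uint32_t uxth(uint32_t rm) { return rm & 0xFFFFu; }

/* SXTB: copy bit 7 of the low byte into bits 31..8. */
uint32_t sxtb(uint32_t rm)
{
    uint32_t b = rm & 0xFFu;
    return (b & 0x80u) ? (b | 0xFFFFFF00u) : b;
}

/* SXTH: copy bit 15 of the low half-word into bits 31..16. */
uint32_t sxth(uint32_t rm)
{
    uint32_t h = rm & 0xFFFFu;
    return (h & 0x8000u) ? (h | 0xFFFF0000u) : h;
}
```

For R0 = 0x55AA8765, the low byte 0x65 has bit 7 clear, so SXTB and UXTB agree (0x00000065), while the low half-word 0x8765 has bit 15 set, so SXTH gives 0xFFFF8765 and UXTH gives 0x00008765.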
Sign and Zero Extension
#include <stdint.h>
int8_t a = -1; // a signed 8-bit integer, a = 0xFF
int16_t b = -2; // a signed 16-bit integer, b = 0xFFFE
int32_t c; // a signed 32-bit integer
c = a; // sign extension required, c = 0xFFFFFFFF
c = b; // sign extension required, c = 0xFFFFFFFE
Barrel Shifter
• The second ALU operand passes through special hardware called the barrel shifter
• Example:
ADD r1, r0, r0, LSL #3 ; r1 = r0 + (r0 << 3) = 9 × r0
The Barrel Shifter
• Logical Shift Left (LSL)
• Logical Shift Right (LSR)
• Arithmetic Shift Right (ASR)
• Rotate Right (ROR)
• Rotate Right Extended (RRX)
Why is there rotate right but no rotate left?
Rotate left can be replaced by a rotate right with a different rotate offset: rotating left by n bits is the same as rotating right by (32 - n) bits.
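The rotate-left equivalence can be checked with a small C sketch. `ror32` and `rol32` are hypothetical helper names; the rotate amount is reduced modulo 32 so that the C shifts stay well defined.

```c
#include <stdint.h>

/* Rotate right by n bits (n is taken modulo 32). */
uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? ((x >> n) | (x << (32u - n))) : x;
}

/* Rotate left by n bits, expressed as a rotate right by (32 - n). */
uint32_t rol32(uint32_t x, unsigned n)
{
    return ror32(x, (32u - (n & 31u)) & 31u);
}
```

For instance, rotating 0x12345678 left by 8 and rotating it right by 24 both give 0x34567812.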
Barrel Shifter
• Examples:
• ADD r1, r0, r0, LSL #3
; r1 = r0 + (r0 << 3) = r0 + 8 × r0
• ADD r1, r0, r0, LSR #3
; r1 = r0 + (r0 >> 3) = r0 + r0/8 (unsigned)
• ADD r1, r0, r0, ASR #3
; r1 = r0 + (r0 >> 3) = r0 + r0/8 (signed, rounded toward negative infinity)
• Use the barrel shifter to speed up the application
ADD r1, r0, r0, LSL #3
is equivalent to
MOV r2, #9 ; r2 = 9
MUL r1, r0, r2 ; r1 = r0 * 9
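The shift-and-add form and the multiply form can be compared in C, along with the difference between LSR and ASR on a negative bit pattern. This is a sketch with hypothetical function names; shift amounts are assumed to be in the range 0 to 31.

```c
#include <stdint.h>

/* ADD r1, r0, r0, LSL #3: r0 + (r0 << 3), computed modulo 2^32. */
uint32_t times9_shift_add(uint32_t r0) { return r0 + (r0 << 3); }

/* MOV r2, #9 followed by MUL r1, r0, r2. */
uint32_t times9_mul(uint32_t r0) { return r0 * 9u; }

/* LSR: shift right, filling the vacated upper bits with zeros. */
uint32_t lsr(uint32_t x, unsigned n) { return x >> n; }

/* ASR: shift right, filling the vacated upper bits with the sign bit. */
uint32_t asr(uint32_t x, unsigned n)
{
    uint32_t shifted = x >> n;
    if ((x & 0x80000000u) && n > 0)
        shifted |= ~(0xFFFFFFFFu >> n);   /* replicate the sign bit */
    return shifted;
}
```

Both multiply forms agree for every 32-bit input because each wraps modulo 2^32. For the pattern 0xFFFFFFF0 (which is -16 as a signed value), `lsr(x, 3)` gives 0x1FFFFFFE while `asr(x, 3)` gives 0xFFFFFFFE, that is, -2.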