Programable PPT


• Programmer’s Model

Registers
• Cortex-M3 and Cortex-M4 processors have a
number of registers inside the processor core
to perform data transfer, data processing and
control operations.

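• The processors use a load-store architecture: if data in
memory is to be processed, it must first be loaded into
registers, processed there, and then written back to memory
if needed.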
Contd.
• R0 - R12 (32-bit wide registers)
• Registers R0 to R12 are general purpose
registers. The first eight (R0 - R7) are also called
low registers. Due to the limited available space
in the instruction set, many 16-bit instructions
can only access the low registers.
• The high registers (R8 to R12) can be used with
32-bit instructions, and a few with 16-bit
instructions, like MOV.
Link Register (LR)
• R14 Link Register (LR). Used for holding the return address
when calling a subroutine.
• When a function or subroutine call is made, the value of LR is
updated automatically (by hardware).
• If a subroutine needs to call another subroutine, it needs to
save the value of LR in the stack first. Otherwise, the
current value in LR will be lost when the subroutine call is
made.
• At the end of the subroutine, program control returns to
the calling program by loading the value of LR into the PC (PC ← LR).
Program Counter (PC)
• R15 is the Program Counter (PC). It is readable and
writeable: a read returns the current instruction
address plus 4 (this is due to the pipelined design and the
compatibility requirement with the ARM7TDMI
processor).
• Writing to PC (e.g., using data transfer/processing
instructions) causes a branch operation.
• Since the instructions must be aligned to half-word or
word addresses, the Least Significant Bit (LSB) of the
PC is always zero.
Contd.
• Word data occupies a group of four byte locations, starting at
a byte address that is a multiple of four.
• Half-word data occupies two byte locations, starting at an even
byte address.

Stack Pointer (SP)
• Used for accessing the stack memory via PUSH and POP
operations.

• Main Stack Pointer (MSP, or SP_main): the default Stack Pointer.
It is used
(i) after reset, and
(ii) whenever the processor is in Handler mode.
• Process Stack Pointer (PSP, or SP_process):
the PSP can only be used in Thread mode.

• Handler mode: When executing an exception handler such as
an Interrupt Service Routine (ISR).
• Thread mode: When executing normal application code.
Contd.
• In simple applications without an OS, both Thread mode
and Handler mode can use MSP only.
Contd.
• When embedded systems use an embedded OS, they often
use separate memory areas for the application stack and the
kernel stack. This is why two Stack Pointers are needed.
• PUSH and POP operations are always 32-bit, and the
addresses of the transfers in stack operations must be
aligned to 32-bit word (4 bytes) boundaries.
• Both MSP and PSP are 32-bit, but the lowest two bits of the
Stack Pointers are always zero, and writes to these two bits
are ignored.
Contd.
• The Cortex-M processors use a stack memory model
called Full-Descending Stack.
• The stack grows down through decreasing memory
addresses and the SP points to the lowest address
containing a valid item.
• For each PUSH operation, the processor first decrements
the SP, then stores the value in the memory location
pointed by SP.
• SP points to the memory location where the last data was
pushed to the stack.
Example

In a POP operation, the value of the memory location pointed to by SP is
read, and then the SP is incremented automatically.
PUSH and POP instruction with
Multiple data transfer
Keil Example 1
• Describe the changes in the register(s)
and/or memory location(s) after executing
each instruction of the given program. Take
the initial values as: R1 = 0x40000000,
R2 = 0x20, R3 = 0x124, initial SP =
0x20000400.

PUSH {R1-R3}
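The full-descending behaviour described earlier can be traced for this exercise with a small host-side model. This is only a sketch, not processor code: the names `stack_model`, `push_word`, `pop_word`, `push_multiple`, and `read_word`, and the size of the simulated stack area, are hypothetical choices made for illustration. It assumes the usual ARM rule that a multi-register PUSH places the lowest-numbered register at the lowest address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_TOP   0x20000400u  /* initial SP taken from the exercise */
#define STACK_WORDS 64u          /* hypothetical size of the simulated stack area */
#define STACK_BASE  (STACK_TOP - 4u * STACK_WORDS)

typedef struct {
    uint32_t sp;                 /* simulated Stack Pointer */
    uint32_t mem[STACK_WORDS];   /* simulated words just below STACK_TOP */
} stack_model;

/* Map a simulated word address to its slot in mem[]. */
static uint32_t *slot(stack_model *s, uint32_t addr)
{
    assert((addr & 3u) == 0u);                     /* stack accesses are word aligned */
    assert(addr >= STACK_BASE && addr < STACK_TOP);
    return &s->mem[(addr - STACK_BASE) / 4u];
}

void stack_init(stack_model *s)
{
    memset(s->mem, 0, sizeof s->mem);
    s->sp = STACK_TOP;
}

/* PUSH of one word: decrement SP first, then store (full-descending). */
void push_word(stack_model *s, uint32_t value)
{
    s->sp -= 4u;
    *slot(s, s->sp) = value;
}

/* POP of one word: read the word at SP, then increment SP. */
uint32_t pop_word(stack_model *s)
{
    uint32_t value = *slot(s, s->sp);
    s->sp += 4u;
    return value;
}

/* PUSH {list}: regs[] is given in ascending register order, so the
 * highest-numbered register is stored first and the lowest-numbered
 * register ends up at the lowest address. */
void push_multiple(stack_model *s, const uint32_t *regs, unsigned count)
{
    for (unsigned i = count; i > 0u; --i) {
        push_word(s, regs[i - 1u]);
    }
}

uint32_t read_word(stack_model *s, uint32_t addr)
{
    return *slot(s, addr);
}
```

Under these assumptions, pushing R1 to R3 with the initial values above leaves SP at 0x200003F4, with R1 stored at 0x200003F4, R2 at 0x200003F8, and R3 at 0x200003FC.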
4.2.3 Special registers
• Besides the registers in the register bank, there are a
number of special registers.

• These registers contain the processor status and
define the operation states and interrupt/exception
masking.
• Special registers are not memory mapped, and can be
accessed using special register access instructions
such as MSR and MRS.
Contd.
Program status registers

The Program Status Register is composed of
three status registers:
• Application PSR (APSR)
• Execution PSR (EPSR)
• Interrupt PSR (IPSR)
xPSR
Q -Saturation Flag
• The Q flag is used to indicate an occurrence of
saturation during saturation arithmetic operations or
saturation adjustment operations.
• In ordinary arithmetic, 0xFFFFFFFE + 0x00000002 = 0x100000000, which
contains 33 binary bits. If the same arithmetic is done
in a 32-bit processor, the result wraps to 0x00000000 and the
carry flag is set.
Contd.
• If a normal variable reaches its maximum value and is
incremented further, it rolls over to zero. Similarly, if
a variable reaches its minimum value and is then
decremented, it rolls over to the maximum value.
• This wrap-around is a serious problem for applications such as
motor control and safety-critical applications.
Contd.
• Saturation is commonly used in signal processing. For
example, after certain operations such as amplification,
the amplitude of a signal can exceed the maximum
allowed output range. If the value is adjusted by simply
cutting off the most significant bits, the resulting signal
waveform could be completely distorted.
Contd.
• Instead of just cutting off the MSB, saturation
arithmetic forces the result to the maximum value (in
case of overflow) or minimum value (in case of
underflow) to reduce the impact of signal distortion.

• Saturate instructions are very useful in implementing
certain DSP algorithms like audio processing.
• The actual maximum and minimum values that trigger
the saturation depend on the instructions being used. If
saturation occurred, the Q bit is set; otherwise, the value
of the Q bit is unchanged.
Contd.
• The Cortex-M3 processor supports two
instructions that provide saturation
adjustment of signed and unsigned data.

• They are SSAT (for signed data) and USAT
(for unsigned data).
SSAT & USAT
• SSAT <Rd>, #<immed>, <Rn>{, <shift>} ; Saturation for a
signed value
• USAT <Rd>, #<immed>, <Rn>{, <shift>} ; Saturation
of a signed value into an unsigned value
• <Rn> : input value
• <shift> : optional shift operation applied to the input value before
saturation. It can be LSL #N or ASR #N
• <immed> : bit position where the saturation is carried out
• <Rd> : destination register
Contd.

• The SSAT instruction applies the specified shift, then
saturates to the signed
range −2^(n−1) ≤ x ≤ 2^(n−1) − 1, where n is the <immed> value.
• The USAT instruction applies the specified shift, then
saturates to the unsigned
range 0 ≤ x ≤ 2^n − 1.
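The two ranges can be checked on a host machine with a small C model. This is a sketch of the saturation step only (the optional shift is left out), and the function names `ssat_model` and `usat_model` are hypothetical. The `q` argument stands in for the Q flag: it is set when saturation occurs and is left unchanged otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of SSAT: saturate x to the signed n-bit range
 * -2^(n-1) <= x <= 2^(n-1) - 1, for 1 <= n <= 32.
 * *q is set on saturation and is never cleared here. */
int32_t ssat_model(int32_t x, unsigned n, bool *q)
{
    assert(n >= 1u && n <= 32u);
    int64_t max = ((int64_t)1 << (n - 1u)) - 1;
    int64_t min = -((int64_t)1 << (n - 1u));
    if (x > max) { *q = true; return (int32_t)max; }
    if (x < min) { *q = true; return (int32_t)min; }
    return x;
}

/* Model of USAT: saturate a signed x to the unsigned n-bit range
 * 0 <= x <= 2^n - 1, for 0 <= n <= 31.
 * *q is set on saturation and is never cleared here. */
uint32_t usat_model(int32_t x, unsigned n, bool *q)
{
    assert(n <= 31u);
    int64_t max = ((int64_t)1 << n) - 1;
    if (x > max) { *q = true; return (uint32_t)max; }
    if (x < 0)   { *q = true; return 0u; }
    return (uint32_t)x;
}
```

With n = 16, an input of 0x00020000 saturates to 32767 in the signed model and to 65535 in the unsigned model, while a negative input saturates to 0 in the unsigned model. The 64-bit intermediates are used only so that the bounds for n = 32 can be formed without overflow in C.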
Example-SSAT
• If a 32-bit signed value is to be saturated into a 16-bit signed
value, the following instruction can be used:
• SSAT R1, #16, R0
• It saturates the result to the signed range
−2^15 ≤ R1 ≤ 2^15 − 1 (−32768 to 32767,
i.e. 0xFFFF8000 to 0x00007FFF).
• The Q flag is set if saturation takes place in the operation, and
it can be cleared by writing to the APSR.
Examples

The Q bit is a “sticky” bit and saturation arithmetic/adjustment
operations do not clear this bit. It can be cleared by writing to
APSR.
Example- USAT

• USAT is slightly different in that the result is an
unsigned value.
• Convert a 32-bit signed value to a 16-bit unsigned
value using:
• USAT R1, #16, R0
• It saturates the result to the unsigned range
0 to 65535 (0 to 0x0000FFFF).
Contd.
Integer Status Flag
Examples
Contd.
• MRS r0, APSR ; Read flag state into R0
• MSR APSR, r0 ; Write flag state
• MRS r0, PSR ; Read the combined program status word
• MSR PSR, r0 ; Write the combined program status word
xPSR
Conditional Execution (If-Then
instruction)
• The Cortex-M processors support conditional execution of
instructions.
• With the exception of simple conditional branches, Thumb-2
instructions do not have the 4-bit condition code field that most
ARM instructions have.
• Instead, Thumb-2 has the IT (IF-THEN) instruction, which conditionally
executes up to four subsequent instructions. The instructions
affected by an IT instruction are said to be in an IT block.
• The conditionally executed instructions can be data processing
instructions or memory access instructions.
Contd.
• The IT instruction syntax consists of the IT instruction
opcode with up to three additional optional suffixes of
“T” (then) and “E” (else), followed by the condition to
check against,
e.g. ITE EQ
• IT represents an if-then construct: if the condition code
evaluates to true, then the next instruction is executed.
• ITE is referred to as if-then-else and ITTEE as
if-then-then-else-else.
Example 1
CMP R0, #1 ; IF
ITE EQ ; IT block
MOVEQ R3, #2 ; THEN
MOVNE R3, #1 ; ELSE

The then condition must match the condition
code, and any else condition must be the opposite
condition.
Contd.
• Note that when the “E” suffix is used, the execution condition
for the corresponding instruction in the IT block must be the
inverse of the condition specified by the IT instruction.
• Assemblers check that the condition given to IT is consistent with
those on the individual instructions. The then conditions must
match the condition code, and any else conditions must be the
opposite condition.
• The “T”/“E” suffixes indicate how many subsequent instructions
are inside the IT instruction block, and whether each should
or should not be executed if the condition is met.
Possible Combinations

• Different combinations of “T” and “E”
sequences are possible:
• Just one conditional execution instruction: IT
• Two conditional execution instructions: ITT, ITE
• Three conditional execution instructions:
ITTT, ITTE, ITET, ITEE
• Four conditional execution instructions:
ITTTT, ITTTE, ITTET, ITTEE, ITETT, ITETE, ITEET, ITEEE
Example- IT Block
Pseudocode:
if (R1 < R2) then
    R2 = R2 - R1
    R2 = R2 / 2
else
    R1 = R1 - R2
    R1 = R1 / 2

Assembly:
CMP   R1, R2    ; IF
ITTEE LT
SUBLT R2, R1    ; THEN
LSRLT R2, #1
SUBGE R1, R2    ; ELSE
LSRGE R1, #1
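For comparison, the same IT block can be written as an ordinary C function and checked on a host machine. This is only a sketch: the type `regs_t` and the functions `it_block_model` and `it_block_demo` are hypothetical names, the LT condition is modelled as a signed comparison, and LSR is modelled as a logical shift on the unsigned bit pattern.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int32_t r1;
    int32_t r2;
} regs_t;

/* C model of: CMP R1, R2 / ITTEE LT / SUBLT / LSRLT / SUBGE / LSRGE. */
regs_t it_block_model(regs_t r)
{
    if (r.r1 < r.r2) {                                  /* LT: signed less than */
        uint32_t d = (uint32_t)r.r2 - (uint32_t)r.r1;   /* SUBLT R2, R1 */
        r.r2 = (int32_t)(d >> 1);                       /* LSRLT R2, #1 */
    } else {                                            /* GE: the inverse condition */
        uint32_t d = (uint32_t)r.r1 - (uint32_t)r.r2;   /* SUBGE R1, R2 */
        r.r1 = (int32_t)(d >> 1);                       /* LSRGE R1, #1 */
    }
    return r;
}

/* Small usage check with arbitrary sample values. */
void it_block_demo(void)
{
    regs_t a = it_block_model((regs_t){ .r1 = 3, .r2 = 10 });
    assert(a.r1 == 3 && a.r2 == 3);     /* (10 - 3) >> 1 == 3 */

    regs_t b = it_block_model((regs_t){ .r1 = 10, .r2 = 4 });
    assert(b.r1 == 3 && b.r2 == 4);     /* (10 - 4) >> 1 == 3 */
}
```

In the assembly version both branches are encoded inline and no branch instruction is needed, whereas a compiler is free to translate the C `if`/`else` either way.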
Conditions
Contd.
• In many cases, the IT instruction can help improve the
performance of program code significantly because it
avoids some of the branch penalty, as well as reducing
the number of branch instructions.
xPSR
Interrupt-Continuable Instruction (ICI) bits

• Cortex-M3 and Cortex-M4 processors allow exceptions
to be taken in the middle of Multiple Load and Store
instructions (LDM/STM) and stack PUSH/POP
instructions.
• If one of these LDM/STM/PUSH/POP instructions is
executing when the interrupt request arrives, the
current memory accesses will be completed, and the next
register number will be saved in the xPSR (Interrupt-
Continuable Instruction [ICI] bits).
• e.g. STMIA R8!, {R0-R6}
Contd.
• After the exception handler completes, the multiple
load/store/push/pop will resume from the point at which
the transfer stopped.
• If the multiple load/store/push/pop instruction being
interrupted is part of an IF-THEN (IT) instruction block,
the instruction will be canceled and restarted when the
interrupt is completed.
• This is because the ICI bits and IT execution status bits
share the same space in the Execution Program Status
Register (EPSR).
Programmer’s Model (Contd.)
• The Cortex-M3 and Cortex-M4 processors have two
operation states, two modes and two access levels.

• States: Thumb state & Debug state

• Modes: Thread Mode & Handler Mode

• Access levels : Privileged and Unprivileged (User)


Access Levels
• The privileged access level allows access to all resources in
the processor.
• Unprivileged access level means some memory regions are
not accessible, and a few operations cannot be performed.
• The separation of privileged and unprivileged access
levels allows system designers to develop robust
embedded systems.
• For example, a system can contain an
- Embedded OS kernel that executes in privileged access
level,
- Application tasks which execute in unprivileged
access level.
States
• Thumb state: If the processor is executing program code
(Thumb instructions), it is in the Thumb state. Unlike classic
ARM processors like ARM7TDMI, there is no ARM state
because the Cortex-M processors do not support the ARM
instruction set.

• Debug state: The Cortex-M processors implement debug technology
in hardware, with several integrated components that
facilitate quicker debugging with breakpoints, watchpoints, etc. This
reduces time to market.
• When the processor is halted (e.g., by the debugger, or after
hitting a breakpoint), it enters debug state and stops executing
instructions.
Modes
• Thread mode: When executing normal
application code.

• Handler mode: When executing an exception
handler such as an Interrupt Service Routine
(ISR).
• By default, the Cortex-M processors start in
privileged Thread mode and in Thumb state.
Contd.

Privileged level
Contd.
CONTROL register

• The CONTROL register defines:
• Access level in Thread mode
(Privileged/Unprivileged)
• The selection of stack pointer (Main Stack
Pointer/Process Stack Pointer)
Contd.
• The CONTROL register can only be modified in the
privileged access level, but it can be read in both the
privileged and unprivileged access levels.
Contd.
• After reset, the CONTROL register is 0. This
means the Thread mode uses the Main Stack
Pointer (MSP) and has privileged accesses.

• Programs in privileged Thread mode can
switch the Stack Pointer selection or switch
to unprivileged access level by writing to the
CONTROL register.
Contd.
• However, once the nPRIV bit is set (unprivileged
access level), the program running in Thread
mode can no longer modify the CONTROL
register.
• A program in unprivileged access level
cannot switch itself back to privileged access
level. This is essential in order to provide a
basic security usage model.
Contd.
• For example, an embedded system might contain
untrusted applications running in unprivileged
access level and the access permission of these
applications must be restricted to prevent an
unreliable application from crashing the whole
system.

• If it is necessary to switch the processor back to
privileged access level in Thread mode, then the
exception mechanism is needed. The exception
handler can clear the nPRIV bit.
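The CONTROL register rules above can be summarised in a short host-side C model. This is a sketch under stated assumptions: it uses the architectural bit positions nPRIV = bit 0 and SPSEL = bit 1, it ignores any other CONTROL bits (such as the floating-point context bit on Cortex-M4 devices with an FPU), and the function names are hypothetical rather than part of any vendor library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CONTROL_NPRIV (1u << 0)  /* 0 = privileged Thread mode, 1 = unprivileged */
#define CONTROL_SPSEL (1u << 1)  /* 0 = Thread mode uses MSP, 1 = Thread mode uses PSP */

bool thread_is_privileged(uint32_t control)
{
    return (control & CONTROL_NPRIV) == 0u;
}

bool thread_uses_psp(uint32_t control)
{
    return (control & CONTROL_SPSEL) != 0u;
}

/* Model of a write to CONTROL from Thread mode: the write only takes
 * effect when the current access level is privileged. */
uint32_t control_write_from_thread(uint32_t current, uint32_t requested)
{
    if (!thread_is_privileged(current)) {
        return current;                       /* unprivileged write has no effect */
    }
    return requested & (CONTROL_NPRIV | CONTROL_SPSEL);
}

/* Model of a write from Handler mode, which is always privileged.
 * This is how an exception handler can clear nPRIV again. */
uint32_t control_write_from_handler(uint32_t requested)
{
    return requested & (CONTROL_NPRIV | CONTROL_SPSEL);
}

/* The reset value of 0 means privileged Thread mode using MSP. */
void control_reset_check(void)
{
    assert(thread_is_privileged(0u));
    assert(!thread_uses_psp(0u));
}
```

In this model, once `control_write_from_thread` has set nPRIV, further calls to it leave the value unchanged, and only `control_write_from_handler` can return Thread mode to the privileged level.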
Contd.
Chapter 6. Overview of Memory System

• The Cortex-M processors have 32-bit memory addressing
and therefore have a 4GB memory space. The memory
space is unified, which means instructions and data share
the same address space.
• Multiple bus interfaces allow concurrent instruction
and data accesses (Harvard bus architecture).
Contd.
• Bus interface designs based on AMBA (Advanced
Microcontroller Bus Architecture), a de facto on-chip bus
standard: AHB (AMBA High-performance Bus) Lite
protocol for pipelined operations in the memory system
• Support for both little endian and big endian memory systems
• Support for unaligned data transfers
• Bit-addressable memory spaces (bit-band)
• Memory attributes and access permissions for
different memory regions
6.5 Memory Endianness
• The Cortex-M3 and Cortex-M4 processors support both
little endian and big endian memory systems.

• Endianness of the memory system is determined at a
system reset. Once it is set, the endianness of the
memory system cannot be changed until the next system
reset.
Contd.
• Most of the existing Cortex-M microcontrollers are little
endian. They have a little endian memory system and
peripherals.
• With a little endian memory system, the first byte of
word-size data (the byte at the lowest address) holds the
least significant byte of the 32-bit value.
Contd.
• It is possible to design a Cortex-M3 or Cortex-M4
microcontroller with a big endian memory system. In
such a case, the first byte of word-size data holds the
most significant byte of the 32-bit value.
BE-8
• In the Cortex-M3 and Cortex-M4 processors, the big
endian scheme is called byte-invariant big endian, also
referred to as BE-8.
• Byte load from address X in little-endian mode accesses
the same data as a byte load from address X in
big-endian mode.
Contd.
• However, a word access in big-endian mode will return a
word whose bytes are in the opposite order to the same
word access in little-endian mode.
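The byte-invariant property can be illustrated with a small host-side C model of a 4-byte memory location. This is a sketch, not device code: the helper names `store_word_le`, `load_word_le`, `load_word_be8`, and `endianness_demo` are hypothetical, and the value 0x12345678 is an arbitrary sample.

```c
#include <assert.h>
#include <stdint.h>

/* Store a word so that the lowest address holds the least significant byte. */
void store_word_le(uint8_t mem[4], uint32_t w)
{
    mem[0] = (uint8_t)(w & 0xFFu);
    mem[1] = (uint8_t)((w >> 8) & 0xFFu);
    mem[2] = (uint8_t)((w >> 16) & 0xFFu);
    mem[3] = (uint8_t)((w >> 24) & 0xFFu);
}

/* Little endian word load: the byte at the lowest address is bits [7:0]. */
uint32_t load_word_le(const uint8_t mem[4])
{
    return (uint32_t)mem[0] | ((uint32_t)mem[1] << 8) |
           ((uint32_t)mem[2] << 16) | ((uint32_t)mem[3] << 24);
}

/* BE-8 word load: the byte at the lowest address is bits [31:24]. */
uint32_t load_word_be8(const uint8_t mem[4])
{
    return ((uint32_t)mem[0] << 24) | ((uint32_t)mem[1] << 16) |
           ((uint32_t)mem[2] << 8) | (uint32_t)mem[3];
}

/* A byte load from a given address is the same in both modes (mem[i]),
 * but a word load sees the bytes in the opposite order. */
void endianness_demo(void)
{
    uint8_t mem[4];
    store_word_le(mem, 0x12345678u);
    assert(mem[0] == 0x78u);                        /* same byte in either mode */
    assert(load_word_le(mem) == 0x12345678u);
    assert(load_word_be8(mem) == 0x78563412u);      /* bytes reversed */
}
```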
6.7 Bit-Band Operation
• Bit-band operation support allows a single load/store
operation to access (read/ write) to a single data bit.

• In Cortex-M3 and Cortex-M4 processors, this is
supported by two pre-defined memory regions called
bit-band regions.
• These two memory regions can be accessed via a
separate memory region called the bit-band alias.
Contd.
• Two memory regions for bit-band operations:
0x20000000 to 0x200FFFFF (SRAM, 1MB)
0x40000000 to 0x400FFFFF (Peripherals, 1MB)
• Every memory location in these regions is
bit-addressable.
• Cortex-M processors use the following terms.
• Bit-band region: This is a memory address region that
supports bit-band operation.
• Bit-band alias: Access to the bit-band alias (word
address) will cause an access to the bit-band region.
Bit accesses to bit-band region via
the bit-band alias (SRAM region)

0x2200007C
Contd.
• When the bit-band alias address is accessed, the address
is remapped into a bit-band address.
• This means that each bit of real RAM is mapped to a word
address in the bit-band alias region.
Contd.
• Bit banding works by creating an alias word address
(×4) for each bit of real memory or peripheral register
bit. So, the 1 MB of real SRAM is aliased to 32 MB of
virtual word addresses.
Address Calculation
• The calculation for the word address in the bit-band
alias region is as follows:
• Bit-band word address = bit-band base + (byte offset
× 32) + (bit number × 4)
• Bit-band base is the starting address of the alias
region (0x22000000 for SRAM, 0x42000000 for peripherals).
• Byte offset is the byte number in the bit-band region
that contains the targeted bit.
• Bit number is the bit position of the targeted bit.
Example-1
(i) Calculate the bit-band word (alias) address of
bit [0] of the bit-band byte at address
0x20000000:
0x22000000 = 0x22000000 + (0*32) + 0*4.
(ii) Calculate the bit-band word (alias) address of
bit [7] of the bit-band byte at 0x20000000:
0x2200001C = 0x22000000 + (0*32) + 7*4.
Example-2

(iii) Bit [0] of the bit-band byte at 0x200FFFFF:
0x23FFFFE0 = 0x22000000 + (0xFFFFF*32) + 0*4.
(iv) Bit [7] of the bit-band byte at 0x200FFFFF:
0x23FFFFFC = 0x22000000 + (0xFFFFF*32) + 7*4.
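The alias address formula can be checked against the four worked examples with a few lines of host-side C. This is a sketch: the function name `bit_band_alias` is hypothetical, and the assertions inside it simply reject addresses outside the two 1MB bit-band regions listed earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Alias address = alias base + (byte offset * 32) + (bit number * 4).
 * addr is an address inside a bit-band region; bit is 0 to 7 when addr is
 * a byte address, or up to 31 when addr is a word-aligned address. */
uint32_t bit_band_alias(uint32_t addr, unsigned bit)
{
    uint32_t region_base = addr & 0xF0000000u;   /* 0x20000000 or 0x40000000 */
    uint32_t byte_offset = addr & 0x000FFFFFu;   /* offset within the 1MB region */

    assert(region_base == 0x20000000u || region_base == 0x40000000u);
    assert((addr & 0x0FF00000u) == 0u);          /* inside the first 1MB */
    assert(bit <= 31u);

    /* The alias base is the region base plus 0x02000000. */
    return region_base + 0x02000000u + byte_offset * 32u + bit * 4u;
}
```

For example, `bit_band_alias(0x200FFFFF, 7)` reproduces case (iv), 0x23FFFFFC, and the same function maps the peripheral region onto the alias base 0x42000000.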
Keil Example
• To set bit 2 in word data in address 0x20000000, instead
of using three instructions to read the data, set the bit, and
then write back the result, this task can be carried out by a
single instruction
Read Operation
• Similarly, bit-band support can simplify
application code if we need to read a bit in a
memory location. For example, if we need to
determine bit 2 of address 0x20000000.
Contd.

• When you access bit-band alias addresses, only the LSB
(bit[0]) in the data is used.
Example
• Address 0x20000000 = 0x3355AACC.
• (i) Read address 0x22000008. This read access is
remapped into read access to 0x20000000. The return
value is 1 (bit[2] of 0x3355AACC).
• (ii) Write 0x0 to 0x22000008. This write access is
remapped into a READ-MODIFY-WRITE to
0x20000000. The value 0x3355AACC is read from
memory, bit [2] is cleared, and a result of 0x3355AAC8
is written back to address 0x20000000.
• Now read 0x20000000. That gives you a return value of
0x3355AAC8 (bit[2] cleared).
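The read and write steps of this example can be replayed with a host-side C model of the remapping. This is a sketch that simulates a few words of SRAM in an ordinary array; the names `sram_write`, `sram_read`, `alias_read`, and `alias_write`, and the size `SIM_WORDS`, are hypothetical. On real hardware the remapping is done by the bus system, not by software.

```c
#include <assert.h>
#include <stdint.h>

#define SRAM_BASE  0x20000000u   /* start of the SRAM bit-band region */
#define ALIAS_BASE 0x22000000u   /* start of the SRAM bit-band alias */
#define SIM_WORDS  4u            /* hypothetical size of the simulated SRAM */

static uint32_t sram[SIM_WORDS];

/* Translate an alias word address into a word index and a bit position. */
static void decode_alias(uint32_t alias, uint32_t *word_index, unsigned *bit)
{
    assert((alias & 3u) == 0u && alias >= ALIAS_BASE);
    uint32_t bit_index = (alias - ALIAS_BASE) / 4u;   /* one alias word per bit */
    assert(bit_index < SIM_WORDS * 32u);
    *word_index = bit_index / 32u;
    *bit = (unsigned)(bit_index % 32u);
}

void sram_write(uint32_t addr, uint32_t value)
{
    assert((addr & 3u) == 0u && addr >= SRAM_BASE);
    assert((addr - SRAM_BASE) / 4u < SIM_WORDS);
    sram[(addr - SRAM_BASE) / 4u] = value;
}

uint32_t sram_read(uint32_t addr)
{
    assert((addr & 3u) == 0u && addr >= SRAM_BASE);
    assert((addr - SRAM_BASE) / 4u < SIM_WORDS);
    return sram[(addr - SRAM_BASE) / 4u];
}

/* Alias read: returns 0 or 1, the value of the targeted bit. */
uint32_t alias_read(uint32_t alias)
{
    uint32_t w;
    unsigned b;
    decode_alias(alias, &w, &b);
    return (sram[w] >> b) & 1u;
}

/* Alias write: a READ-MODIFY-WRITE of the targeted word in which
 * only bit[0] of the written data is used. */
void alias_write(uint32_t alias, uint32_t value)
{
    uint32_t w;
    unsigned b;
    decode_alias(alias, &w, &b);
    uint32_t word = sram[w];                          /* read */
    word = (word & ~(1u << b)) | ((value & 1u) << b); /* modify */
    sram[w] = word;                                   /* write */
}
```

Starting from 0x3355AACC at 0x20000000, `alias_read(0x22000008)` gives 1, and `alias_write(0x22000008, 0x0)` leaves 0x3355AAC8 in the simulated word, matching the steps above.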
6.7.2 Advantages
• Implementing serial data transfers through general-purpose
input/output (GPIO) ports to serial devices.
• Provides faster bit operations with fewer instructions.
• Bit-band operations can also be used to simplify branch
decisions, for example when a branch should be carried out
based on one single bit in a status register.
• Without bit-band operation, this requires:
- Reading the whole register
- Masking the unwanted bits
- Comparing and branching
Contd.
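• With bit-band operation, you can simplify the operations to:
- Reading the status bit via the bit-band alias (get 0 or 1)
- Comparing and branching
• An important advantage of a bit-band operation is that it is atomic.
Since the READ-MODIFY-WRITE is carried out at the hardware level,
an interrupt cannot take place in the middle of it.
• Without bit-band operation, for example, using a
software READ-MODIFY-WRITE sequence, the
following problem can occur:
• Consider a simple output port with bit 0 used by a main
program and bit 1 used by an interrupt handler. If the interrupt
occurs after the main program has read the port but before it
writes the result back, the change made by the interrupt handler
to bit 1 is overwritten and lost.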
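The lost-update problem can be reproduced deterministically on a host machine by writing out the interleaving by hand. This is only a sketch: there is no real interrupt here, the "interrupt handler" is modelled as a statement placed between the read and the write-back, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Software READ-MODIFY-WRITE of bit 0, with the interrupt handler
 * setting bit 1 between the read and the write-back. */
uint8_t software_rmw_interrupted(uint8_t initial)
{
    uint8_t port = initial;

    uint8_t tmp = port;          /* main program: read the port */
    port |= 0x02u;               /* interrupt handler runs here: set bit 1 */
    tmp |= 0x01u;                /* main program: modify its stale copy */
    port = tmp;                  /* main program: write back, bit 1 is lost */

    return port;
}

/* Same scenario with bit 0 set by a single atomic bit write, as a
 * bit-band alias write would do. The interrupt can only run before
 * or after it, never in the middle. */
uint8_t bit_band_interrupted(uint8_t initial)
{
    uint8_t port = initial;

    port |= 0x02u;               /* interrupt handler: set bit 1 */
    port |= 0x01u;               /* main program: atomic set of bit 0 */

    return port;
}

void lost_update_demo(void)
{
    assert(software_rmw_interrupted(0x00u) == 0x01u);  /* bit 1 was overwritten */
    assert(bit_band_interrupted(0x00u) == 0x03u);      /* both bits survive */
}
```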
6.7.4 Bit-band operations in C
programs
• There is no native support for bit-band operations in the
C/C++ languages.
• C compilers do not understand that the same memory
can be accessed using two different addresses, and they
do not know that accesses to the bit-band alias will only
access the LSB of the memory location.
• To use the bit-band feature in C, the simplest solution is
to separately declare the bit-band address and the
bit-band alias of a memory location.
Contd.

#define DEVICE_REG0      (*((volatile unsigned long *) (0x20000000)))
#define DEVICE_REG0_BIT0 (*((volatile unsigned long *) (0x22000000)))
#define DEVICE_REG0_BIT1 (*((volatile unsigned long *) (0x22000004)))

DEVICE_REG0 = 0xAB; // Set up data
DEVICE_REG0 = DEVICE_REG0 | 0x2; // Setting bit 1 without using the bit-band feature
DEVICE_REG0_BIT1 = 0x1; // Setting bit 1 using the bit-band feature via the bit-band alias address
Contd.
• Set up one macro to convert a bit-band address and bit
number into a bit-band alias address:
#define BIT_BAND(addr, bitnum) (((addr) & 0xF0000000) + 0x02000000 + (((addr) & 0xFFFFF) << 5) + ((bitnum) << 2))
• Set up another macro to access a memory location by taking
the address value as a pointer:
#define MEM_ADDR(addr) (*((volatile unsigned long *) (addr)))


Example
#define DEVICE_REG0 0x20000000
#define BIT_BAND(addr, bitnum) (((addr) & 0xF0000000) + 0x02000000 + (((addr) & 0xFFFFF) << 5) + ((bitnum) << 2))
#define MEM_ADDR(addr) (*((volatile unsigned long *) (addr)))

MEM_ADDR(DEVICE_REG0) = 0xAB; // Accessing the register by its normal address
MEM_ADDR(BIT_BAND(DEVICE_REG0, 1)) = 0x1; // Setting bit 1 using the bit-band feature
Contd.
• Variables being accessed might need to be declared as
volatile. The C compilers do not know that the same data
could be accessed in two different addresses, so the
volatile property is used to ensure that each time a
variable is accessed, the memory location is accessed
instead of a local copy of the data inside the processor.
6.9 Memory Access Attributes
• Bufferable: Write to memory can be carried out by a
write buffer while the processor continues on to next
instruction execution.

• Cacheable: Data obtained from a memory read can be
copied to a memory cache so that next time it is
accessed the value can be obtained from the cache to
speed up program execution.
Contd.
• Executable: The processor can fetch and execute
program code from this memory region.
• Shareable: Data in this memory region could be shared by
multiple bus masters. The memory system needs to
ensure coherency of data between different bus masters in
the shareable memory region.
Bufferable Attribute
• The Bufferable attribute is used inside the processor. In
order to provide better performance, the Cortex-M3 and
Cortex-M4 processors support a single-entry write buffer
on the bus interface.
• A data write to a bufferable memory region can be
carried out in a single clock cycle, and the processor continues
to the next instruction execution, even if the actual
transfer needs several clock cycles to be completed.
Contd.
• This allows the CPU to make an entry into the write
buffer and continue on to the next instruction while
the write buffer completes the write to the real
SRAM. The write buffer performs the external write
in parallel.
• If, however, the write buffer is full, then the processor is
stalled until sufficient space in the buffer is
available.
