QuadSPI STM
Application note
Quad-SPI interface on STM32 microcontrollers and
microprocessors
Introduction
To manage a wide range of multimedia, richer graphics and other data-intensive
content, embedded applications evolve to offer more sophisticated features. These
features place extra demands on the often limited microcontroller (MCU) and
microprocessor (MPU) on-chip memory.
The STM32 MCUs and MPUs are referred to as STM32 devices in this document. The
applicable devices are listed in Table 1: Applicable products.
External parallel memories are used to extend the STM32 devices on-chip memory and
solve the memory size limitation. This usually comes at the cost of a higher pin
count and a more complex design.
To meet these requirements, the STM32 devices embed an external memory interface
named Quad-SPI (see more details in Table 2 on page 9). This interface allows the
connection of external compact-footprint Quad-SPI high-speed memories. This Quad-SPI
interface is used for data storage such as images, icons, or for code execution.
This application note describes the Quad-SPI interface on the STM32 devices and explains
how to use the module to configure, program, and read external Quad-SPI memory. It
describes some typical use cases to use the Quad-SPI interface based on some software
examples from the STM32Cube firmware package and from the STM32F7 Series
application notes.
For more detailed information about the products listed in the table below, refer to
the corresponding datasheets and reference manuals available from the STMicroelectronics
web site www.st.com.
Contents
1 General information
2 Overview
2.1 QUADSPI availability and features across STM32 families
2.2 Quad-SPI benefits against classic SPI and parallel interfaces
2.2.1 Main benefits of STM32 embedded Quad-SPI interface
2.3 QUADSPI in a smart architecture
2.3.1 System architecture: STM32L4 Series
2.3.2 System architecture: STM32F4 Series
2.3.3 System architecture: STM32F7 Series
2.3.4 System architecture: STM32H7 Series
2.3.5 System architecture: STM32WB35xx and STM32WB55xx devices
4 QUADSPI configuration
4.1 GPIOs configuration
4.1.1 GPIOs configuration using STM32CubeMX tool
4.2 QUADSPI peripheral configuration and clock
4.2.1 QUADSPI peripheral configuration (QUADSPI_CR register)
4.2.2 Quad-SPI Flash memory parameters configuration (QUADSPI_DCR register)
4.2.3 QUADSPI and MPU configuration
4.2.4 Quad-SPI memory device configuration
4.2.5 Starting a communication (QUADSPI_CCR register)
4.3 Hardware considerations
4.3.1 Pull-up resistance
4.3.2 Good PCB design allows maximum QUADSPI speed
4.3.3 Chip-select high time (CSHT)
4.3.4 CKMODE
4.3.5 Some considerations when using QUADSPI in classical SPI mode
8 Supported devices
9 Conclusion
10 Revision history
List of tables
List of figures
Figure 47. Project configurations: executing code from Quad-SPI Flash memory
Figure 48. Changing QUADSPI configuration in the project settings
Figure 49. Quad-SPI Flash memory connection in STM32756-EVAL board
Figure 50. 6_1-Quad-SPI_rwRAM-DTCM project configuration: code and data in Quad-SPI memory
Figure 51. 6_2-Quad-SPI_rwRAM-DTCM project configuration: only code in Quad-SPI memory
Figure 52. Indirect write mode: programming Quad-SPI memory using DMA
Figure 53. Indirect write mode: programming Quad-SPI memory using interrupt
Figure 54. Quad-SPI memory connection on the STM32F746G-DISCO discovery board
Figure 55. Quad-SPI memory connection on the STM32L476G-EVAL board
Figure 56. Deep power-down (DPD) sequence (command B9)
Figure 57. Release from deep power-down (RDP) sequence (command AB)
1 General information
2 Overview
The Quad-SPI is a serial interface that allows communication on four data lines between
a host (STM32) and an external Quad-SPI memory. The QUADSPI supports the traditional
SPI (serial peripheral interface) as well as the dual-SPI mode, which allows communication
on two lines. QUADSPI uses up to six lines in quad mode: one line for chip select, one line
for clock and four lines for data in and data out.
This interface is integrated on the STM32 devices to fit memory-hungry applications, to
simplify PCB (printed circuit board) designs and to reduce costs.
a. Arm is a registered trademark of Arm Limited (or its subsidiaries) in the US and/or elsewhere.
STM32F412 line
100 80
STM32F413/423 line(3)
STM32F446 line(4) 60
90
STM32F469/479 line
STM32F730xx devices
STM32F7x2 line(4)
STM32F750xx Yes 32
STM32F7x3 80
STM32F7x5 108
STM32F7x6
STM32F7x7
STM32F7x8
STM32F7x9
STM32H743/753
133 100
STM32H750 Value line
STM32L471xx
STM32L412xx
STM32L422xx 256 Mbytes 4 Gbytes
STM32L432xx
No
STM32L442xx
STM32L475xx
STM32L476xx
STM32L486xx
48
STM32L431xx 60 16
STM32L451xx
STM32L452xx
STM32L462xx Yes
STM32L4x3(5)
STM32L496xx
STM32L4A6xx
STM32WB35xx
50 No
STM32WB55xx
STM32L4R5/S5
STM32L4R7/S7 86 60 Yes
STM32L4R9/S9(6) 32
6. This set of products contains two Octo-SPI interfaces, each one of them can connect one or two Quad-SPI memories with
Single-Flash or Dual-Flash modes.
[Figure: QUADSPI in the STM32L4 Series system architecture. Block diagram not reproduced; the notes below belong to it.]
1 For STM32L471xx, STM32L475xx, STM32L476xx and STM32L486xx devices, QUADSPI and FMC share the same AHB bus on the bus matrix
2 DMA2D is only available on STM32L496xx and STM32L4A6xx devices
3 FMC is available only on STM32L47xxx and STM32L4x6xx devices
4 When remapped
[Figure: QUADSPI in the STM32F4 Series system architecture. Block diagram not reproduced; the notes below belong to it.]
1 For STM32F412, STM32F413/423 and STM32F446 lines, QUADSPI and FMC share the same AHB bus on the bus matrix
2 Available only on STM32F469/479 line devices
3 USB OTG HS is available only in the STM32F446 and STM32F469/479 lines
[Figure: QUADSPI in the STM32F7 Series system architecture. Block diagram not reproduced; the QUADSPI registers are accessed through the 32-bit AHB bus and the memory-mapped region through the 64-bit AHB bus.]
1 Mac Ethernet , LCD-TFT and DMA2D are not available on STM32F72xxx and STM32F73xxx devices.
[Figure: QUADSPI in the STM32H7 Series system architecture (D1 domain). Block diagram not reproduced.]
Mac Ethernet , LCD-TFT and DMA2D are not available on STM32F72x and STM32F73x devices
[Figure: QUADSPI in the STM32WB35xx and STM32WB55xx system architecture. Block diagram not reproduced; the note below belongs to it.]
1 When remapped
[Figure: example of a read command on four lines: address bytes A23-16, A15-8 and A7-0, alternate byte M7-0, two dummy cycles during which the IOs switch from output to input, then Byte 1 and Byte 2. Timing diagram not reproduced.]
[Figure: instruction phase. Timing diagrams not reproduced.
Instruction on one line (single-SPI mode): IMODE[1:0] = 01, BK1_IO0 is an output, BK1_IO1 to BK1_IO3 are High-Z.
Instruction on four lines (Quad-SPI mode): IMODE[1:0] = 11, the data lines are outputs.]
Address is given
directly via the AHB
Address to be sent QUADSPI_AR[31:0] ADDRESS[31:0] from any master on
the bus matrix like
Cortex® or DMA
1-byte ADSIZE[1:0] =00
Note: In Dual-Flash memory mode when DFM = 1 the address to be sent to Flash1 is exactly the
same address to be sent to Flash2.
Depending on the software and the hardware configurations, the alternate byte can be sent
over one, two or four lines. If not needed, the alternate-byte phase can be skipped.
The table below summarizes different alternate-byte phase configurations.
Note: In Dual-Flash memory mode when DFM = 1, the alternate bytes sent to Flash1 are
exactly the same as the ones sent to Flash2.
[Figure: alternate-byte phase example (nibble 0010 to be sent); BK1_IO2 and BK1_IO3 carry the nWP and nHOLD functions. Timing diagram not reproduced.]
Figure 8. Dummy-cycle: IO2 maintained low and IO3 maintained high by hardware
[Timing diagram not reproduced: during the two dummy cycles, BK1_IO0 and BK1_IO1 are High-Z, BK1_IO2 (nWP) is held at 0 and BK1_IO3 (nHOLD) is held at 1.]
Depending on the software and the hardware configurations, the data transfer can be done
in one, two or four lines. In some use cases where data is not needed such as erasing
operation, the data phase can be skipped.
The following table summarizes the data phase configuration in different functional modes.
Single/Dual-SPI mode, used GPIOs:
• Single-Flash mode, Bank1 (4 GPIOs): CLK, BK1_IO0/SO, BK1_IO1/SI, BK1_nCS
• Single-Flash mode, Bank2 (4 GPIOs): CLK, BK2_IO0/SO, BK2_IO1/SI, BK2_nCS
• Dual-Flash mode (6 or 7 GPIOs): CLK, BK1_IO0/SO, BK1_IO1/SI, BK2_IO0/SO, BK2_IO1/SI, BK1_nCS(1), BK2_nCS(1)
Quad-SPI mode, used GPIOs:
• Single-Flash mode, Bank1 (6 GPIOs): CLK, BK1_IO0/SO, BK1_IO1/SI, BK1_IO2, BK1_IO3, BK1_nCS
• Single-Flash mode, Bank2 (6 GPIOs): CLK, BK2_IO0/SO, BK2_IO1/SI, BK2_IO2, BK2_IO3, BK2_nCS
• Dual-Flash mode (10 or 11 GPIOs): CLK, BK1_IO0/SO, BK1_IO1/SI, BK1_IO2, BK1_IO3, BK2_IO0/SO, BK2_IO1/SI, BK2_IO2, BK2_IO3, BK1_nCS(1), BK2_nCS(1)
1. In Dual-Flash mode it is possible to use one chip select, either BK1_nCS or BK2_nCS. For more details on
dual-flash mode, refer to Section 3.2.4: Dual-Flash memory mode on page 25.
Note: If none of the phases are configured to use Quad-SPI mode, then the GPIOs corresponding
to IO2 and IO3 can be used for other functions even while QUADSPI is active.
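The GPIO counts listed above follow from one shared clock line, the data lines of each memory and the chip selects in use. A minimal sketch of that arithmetic is given below; the function name and its parameters are hypothetical and are not part of any ST library.

```c
#include <assert.h>

/* Hypothetical helper: number of GPIOs needed by the Quad-SPI interface.
 *   data_lines_per_flash: 2 for Single/Dual-SPI wiring, 4 for Quad-SPI wiring
 *   flash_count:          1 for Single-Flash mode, 2 for Dual-Flash mode
 *   chip_selects:         1, or 2 when both BK1_nCS and BK2_nCS are wired
 */
static unsigned qspi_gpio_count(unsigned data_lines_per_flash,
                                unsigned flash_count,
                                unsigned chip_selects)
{
    assert(data_lines_per_flash == 2U || data_lines_per_flash == 4U);
    assert(flash_count == 1U || flash_count == 2U);
    assert(chip_selects >= 1U && chip_selects <= flash_count);

    /* One shared CLK line + data lines of each memory + chip selects. */
    return 1U + data_lines_per_flash * flash_count + chip_selects;
}
```

For example, Quad-SPI wiring in Dual-Flash mode with a single chip select gives 1 + 8 + 1 = 10 GPIOs, and 11 GPIOs when both chip selects are used.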
[Figure: connection between the QUADSPI (CLK, BK1_IO0 to BK1_IO3, BK1_nCS) and the Flash memory (CLK, Q0 to Q3, nCS). Diagram not reproduced.]
The following figure shows an example of a read sequence in Dual-Flash memory Quad I/O
SDR mode.
Figure 12. Read sequence in dual-Flash memory Quad I/O SDR mode
[Timing diagram not reproduced: after the instruction and the 24-bit address, the IOs switch from output to input; Flash 1 then returns the bytes at even addresses (Byte 1, Byte 3) on BK1_IO0 to BK1_IO3 while Flash 2 returns the bytes at odd addresses (Byte 2, Byte 4) on BK2_IO0 to BK2_IO3.]
Note that all bytes at even addresses are stored in Flash 1 while all bytes at odd addresses
are stored in Flash 2. As shown in Figure 12, in dual-Flash mode the same command,
address and alternate bytes are sent to both Flash 1 and Flash 2. For example, to read the
first four bytes in dual-Flash memory-mapped mode, from 0x9000 0000 to 0x9000 0003, the
QUADSPI peripheral performs the following sequence:
• The address 0x0000 0000 is sent to both Flash memories and Byte 1 (at even address
0x9000 0000) is read from Flash 1 while Byte 2 (at odd address 0x9000 0001) is read
from Flash 2.
• Then the address 0x0000 0001 is sent to both Flash memories and Byte 3 (at even address
0x9000 0002) is read from Flash 1 while Byte 4 (at odd address 0x9000 0003) is read
from Flash 2.
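The even/odd split described above can be expressed as a small address calculation. The sketch below is illustrative only: the type and function names are hypothetical, and it assumes the memory-mapped window from 0x9000 0000 to 0x9FFF FFFF stated in this document.

```c
#include <assert.h>
#include <stdint.h>

#define QSPI_MM_BASE 0x90000000UL  /* start of the memory-mapped region */
#define QSPI_MM_END  0x9FFFFFFFUL  /* end of the memory-mapped region   */

/* Hypothetical result type: which memory holds a byte, and at which address. */
typedef struct {
    unsigned flash;        /* 1 = Flash 1 (even bytes), 2 = Flash 2 (odd bytes) */
    uint32_t device_addr;  /* address sent to both memories                     */
} qspi_dual_loc_t;

/* Map a memory-mapped address to its location in Dual-Flash memory mode. */
static qspi_dual_loc_t qspi_dual_flash_locate(uint32_t mm_addr)
{
    assert(mm_addr >= QSPI_MM_BASE && mm_addr <= QSPI_MM_END);

    uint32_t offset = mm_addr - QSPI_MM_BASE;
    qspi_dual_loc_t loc;
    loc.flash = (offset & 1U) ? 2U : 1U;  /* odd offsets live in Flash 2     */
    loc.device_addr = offset >> 1;        /* each device holds every 2nd byte */
    return loc;
}
```

With this mapping, 0x9000 0002 resolves to Flash 1 at device address 0x0000 0001, which matches the second step of the sequence above.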
Cautions:
• In Dual-Flash memory mode both devices must be the same model, because in this
mode the same commands and addresses are issued in parallel to both Flash
memories; this doubles the available Quad-SPI external Flash size. If the two
Flash-memory devices are different, the Dual-Flash mode must be
disabled (DFM = 0) and each Flash memory can be used standalone, with
either Flash 1 or Flash 2 selected using the QUADSPI_CR[7] FSEL bit.
• For all hardware configurations listed in the table below, each memory device is
configured in Quad-SPI mode. It is also possible to connect each device in Single or
Dual-SPI mode. If DFM = 1, both devices must be configured in the same way. This
doubles the available external data size and throughput.
• The Flash memory size, as specified in FSIZE[4:0] (QUADSPI_DCR[20:16]), should
reflect the total Flash memory capacity, which is double the size of one individual
component.
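As an illustration of the FSIZE rule above, the sketch below computes the FSIZE[4:0] value from the capacity of one component. It assumes the relation given in the reference manuals (number of bytes in Flash memory = 2^(FSIZE+1)) and a power-of-two capacity; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: FSIZE[4:0] value for QUADSPI_DCR.
 * Assumes: total bytes = 2^(FSIZE + 1), capacity is a power of two.
 * In Dual-Flash memory mode the total capacity is twice one component. */
static uint32_t qspi_fsize_field(uint64_t bytes_per_device, int dual_flash)
{
    uint64_t total = dual_flash ? 2U * bytes_per_device : bytes_per_device;
    assert(total >= 2U && (total & (total - 1U)) == 0U);  /* power of two */

    uint32_t log2_total = 0U;
    while ((total >> log2_total) > 1U) {
        log2_total++;
    }
    assert(log2_total <= 32U);  /* FSIZE is a 5-bit field (4 Gbytes max) */
    return log2_total - 1U;
}
```

For example, one 16-Mbyte memory gives FSIZE = 23, while two 16-Mbyte memories in Dual-Flash memory mode give FSIZE = 24.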
Hardware configurations (connection diagrams not reproduced):
• Single-Flash memory mode, DFM = 0(1): FSEL = 0 enables Flash 1, FSEL = 1 enables Flash 2.
• Dual-Flash memory mode, DFM = 1: both devices are connected to the QUADSPI and enabled, using both chip selects, BK1_nCS only or BK2_nCS only.
1. When single-Flash memory mode is selected DFM = 0, the user can switch between Flash 1 or Flash 2 using FSEL bit.
Pink lines highlight the used chip select.
The Quad-SPI interface is able to manage up to 256 Mbytes memory starting from
0x9000 0000 to 0x9FFF FFFF in the Memory-mapped mode.
[Timing diagram not reproduced: each read operation consists of Command, Address and Dummy phases followed by Byte1 to ByteN.]
Figure 14. Executing non-sequential code from QUADSPI with SIOO enabled
[Timing diagram not reproduced: first read operation, jump, second read operation, jump, third read operation.]
The SIOO feature is supported by many Quad-SPI memory manufacturers such as Micron,
Spansion and Macronix. Nevertheless, before using it, the user has to check that the
feature is supported by the memory in use.
To enable the SIOO mode, the user should:
• Configure the memory to enter the SIOO mode. Refer to the relevant manufacturer’s
datasheet for more details on how to enter this mode (make sure that the read
command to be used supports this mode). Note that an alternate byte (mode bits)
needs to be sent in order to keep the device in this mode. Refer to the SIOO example in
Section 6.2 on page 67 for more details on enabling this feature.
• Configure the QUADSPI peripheral by setting the SIOO bit in the QUADSPI_CCR register.
For more details on QUADSPI timing characteristics, refer to the relevant product
datasheet.
The following table summarizes different cases when the BUSY bit is reset in different
QUADSPI operating modes:
Indirect mode
– The QUADSPI has completed the requested command sequence and the FIFO is empty
– Due to an abort
Automatic-polling mode
– After the last periodic access is complete, due to a match when APMS = 1
– Due to an abort
Memory-mapped mode
– On a timeout event
– Due to an abort
– QUADSPI peripheral is disabled
ABORT bit
When an application is running, any ongoing QUADSPI operation can be aborted by setting
the ABORT bit in the QUADSPI_CR register. Once the abort is completed, the BUSY bit and
the ABORT bit are automatically reset and the FIFO is flushed. If an abort occurs during an
ongoing AXI/AHB burst operation, the QUADSPI allows the ongoing burst to complete
properly before resetting the BUSY bit and the ABORT bit.
Note: Some Flash memories might misbehave if a write operation to a status register is aborted.
[Figure: QUADSPI and delay block (DLYB); the DLYB registers are accessed over AHB3. Diagram not reproduced.
dlyb_in_ck: delay block input clock
dlyb_out_ck: delay block output clock]
In indirect mode, when configuring the DMA for data transfer from/to the QUADSPI, the
QUADSPI should be considered as a peripheral:
• Memory-to-peripheral mode when writing data to the QUADSPI from the internal
memory
• Peripheral-to-memory mode when reading data from the QUADSPI to be
transferred into the internal memory.
The address of the QUADSPI should also be written into the peripheral address register
(DMA channel/stream x peripheral address).
The table below summarizes the different DMA requests and transfer directions versus the
STM32 series.
Table 14. DMA requests mapping and transfer directions versus STM32 series
Product(1)      | DMA1                 | DMA2                 | MDMA
STM32L4 Series  | Request 5, Channel 5 | Request 3, Channel 7 | NA
STM32F4 Series  | NA                   | Stream 7, Channel 3  | NA
STM32F7 Series  | NA                   | Stream 7, Channel 3  | NA
STM32H7 Series  | NA                   | NA                   | quadspi_ft_trg: channel X[0..15]/Stream22
STM32H7 Series  | NA                   | NA                   | quadspi_tc_trg: channel X[0..15]/Stream23
1. For applicable devices of each series embedding a QUADSPI.
The DMAEN bit has no effect in Memory-mapped mode: the transfer starts as soon as
the DMA accesses the QUADSPI address range (from 0x9000 0000 to 0x9FFF FFFF).
Once the configured DMA transfer is started by software, the DMA reads the data from the
Quad-SPI memory exactly as from an internal memory. The QUADSPI peripheral manages the
communication with the external memory and puts the read data in the FIFO.
The number of data items to be transferred is managed by the DMA, so the user should
configure the number of data items in the DMA register DMA_SxNDTR (or the DMA_CNDTRx
register for STM32L4x6xx). There is no need to configure the QUADSPI_DLR register as it
has no effect in the Memory-mapped mode, where the DMA is the flow controller.
Note: The DMA FIFO can be used, for example, if the DMA Burst mode is required to reduce the
transfer overhead on the bus matrix.
[Figure: QUADSPI and MDMA interconnection on the STM32H7 Series. Block diagram not reproduced.
The quadspi_ft_trg trigger is connected to MDMA channel X[0..15]/Stream22 and the quadspi_tc_trg trigger to MDMA channel X[0..15]/Stream23.
The QUADSPI registers are accessed over the 32-bit AHB and the memory-mapped region over the 64-bit AXI.]
4 QUADSPI configuration
This section describes all QUADSPI configuration steps required to perform either read,
write or erase operations.
If, after selecting one hardware configuration (as shown in Figure 17), the used GPIOs do
not match the memory connection on the board, the user can configure the alternate function
directly on the corresponding pins.
For more details on QUADSPI alternate functions availability versus GPIOs, refer to the
alternate function mapping table in the relevant datasheet.
The figure below shows how to manually configure the PF8 pin to the QUADSPI_BK1_IO0
alternate function.
The used pins are highlighted in green once the GPIOs of the Quad-SPI interface are
correctly configured.
In the STM32H7 Series devices the QUADSPI has two different source clocks:
• quadspi_ker_ck
It is the source clock used to generate the QUADSPI CLK, using the following relation:
QSPI_CLK = quadspi_ker_ck / (Prescaler + 1).
• quadspi_hclk (hclk3)
It is the source clock for the register interface. This clock has no impact on the
QUADSPI CLK.
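The clock relation above can be checked with a short calculation. The sketch below is illustrative: the function names are hypothetical, the prescaler is assumed to be an 8-bit field (0 to 255), and the maximum memory frequency passed to the second helper is an assumption to be taken from the memory datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* QSPI_CLK = source clock / (Prescaler + 1); prescaler assumed 8-bit. */
static uint32_t qspi_clk_hz(uint32_t ker_ck_hz, uint32_t prescaler)
{
    assert(prescaler <= 255U);
    return ker_ck_hz / (prescaler + 1U);
}

/* Hypothetical helper: smallest prescaler keeping QSPI_CLK <= max_hz. */
static uint32_t qspi_min_prescaler(uint32_t ker_ck_hz, uint32_t max_hz)
{
    assert(max_hz > 0U);
    /* Ceiling division gives the smallest divider that respects max_hz. */
    uint64_t div = ((uint64_t)ker_ck_hz + max_hz - 1U) / max_hz;
    if (div == 0U) {
        div = 1U;
    }
    assert(div <= 256U);
    return (uint32_t)(div - 1U);
}
```

For example, a 216 MHz source clock with a prescaler of 3 gives a 54 MHz QUADSPI CLK.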
STM32CubeMX permits the configuration of the quadspi_ker_ck source clock in the clock
configuration section.
The following figure shows the multiple source clocks for quadspi_ker_ck using
STM32CubeMX.
The source clock for quadspi_ker_ck can be selected by using the QUADSPI clock mux as
shown in the following figure.
4.3.4 CKMODE
The clock mode indicates the level that CLK takes between commands when nCS is high.
Two modes are supported when nCS is high: mode 0 where CLK stays low and mode 3
where CLK stays high.
Figure 27. Chip select high time: CSHT = two clock cycles
[Timing diagram not reproduced: nCS stays high for the CSHT duration between two commands sent on BK1_IO0.]
[Figure: SCLK level in mode 0 and mode 3 while nCS is high. Timing diagram not reproduced.]
This section describes how to program a Quad-SPI Flash memory in the following use
cases:
• For an end application development: in this case the Quad-SPI memory is
programmed during the development of the product with static data or code to be used
in the final product. A dedicated Flash memory loader is needed in order to place the
data or the code to be used in the application. The Flash loaders provided by ST can
be used for programming if the user is using one of the ST EVAL or Discovery boards;
otherwise users should develop their own Flash memory loader.
• On-the-fly while the application is running: in this case the Quad-SPI Flash memory is
used in a final product as an external mass-storage device, which allows the application
to store data whenever needed.
Note: For both cases the programming principle is the same. The only difference is that in the first
case, the programming operation is performed with a tool and a Flash memory loader during
the application’s development, while in the second case the programming operation is
performed during a running application in a final product. Only the Indirect mode should be
used for programming regardless if it is a writing or an erasing operation.
Depending on the Flash memory brand, different programming commands are
available, so it is up to the user to configure the desired command supported by the device.
The instruction, address and data phases can be sent on one, two or four lines,
depending on the device brand.
The 4-byte address mode can be used to program the Quad-SPI Flashes with sizes up to
4 Gbytes.
The Automatic-polling mode can be used for waiting while the programming operation is
ongoing; when the operation is completed an interrupt can be generated.
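As an illustration of how Automatic-polling mode decides that an operation has completed, the sketch below models the comparison between the status byte read from the memory, a mask and a match value. It assumes the AND and OR match modes described in the reference manuals; the function names are hypothetical, and the write-in-progress flag is assumed to be bit 0 of the status register, which must be checked in the memory datasheet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the polling comparison (software sketch, not a driver):
 *   AND mode: every unmasked bit of status equals the match value.
 *   OR mode:  at least one unmasked bit of status equals the match value. */
static bool qspi_poll_match(uint32_t status, uint32_t mask,
                            uint32_t match, bool or_mode)
{
    uint32_t equal_bits = ~(status ^ match) & mask;  /* masked bits that agree */
    return or_mode ? (equal_bits != 0U) : (equal_bits == mask);
}

/* Example: wait for a write-in-progress flag (assumed bit 0) to clear. */
static void qspi_poll_match_demo(void)
{
    const uint32_t wip_mask = 0x01U;
    assert(!qspi_poll_match(0x03U, wip_mask, 0x00U, false)); /* still busy */
    assert(qspi_poll_match(0x02U, wip_mask, 0x00U, false));  /* completed  */
}
```

In hardware, the same comparison is performed by the peripheral, so the CPU is only interrupted when the match occurs.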
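[Figure: programming the Quad-SPI Flash memory: the .BIN or .HEX file is sent through the ST-LINK debug interface to the STM32, which programs the Flash memory through the QUADSPI. Diagram not reproduced.]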
How to create a new Quad-SPI Flash memory loader and add it to the ST-LINK utility
For each hardware configuration and for each Quad-SPI Flash memory brand, a dedicated
Flash memory loader should be developed. Users have to develop their own dedicated
Flash memory loader (.stldr file) if the hardware used is not an ST board.
A project is provided in the ST-LINK utility install directory “STMicroelectronics\STM32 ST-
LINK Utility\ST-LINK Utility\ExternalLoader\N25Q256A_STM32L476G-EVAL_Cube”
allowing the user to develop an external loader for a N25Q256A Flash memory on the
STM32L476G-EVAL board. This project can be easily tailored to the user dedicated
hardware to generate the external loader.
For more details on how to develop an external Quad-SPI Flash memory loader for the
STM32 ST-LINK utility, refer to the user manual STM32 ST-LINK Utility software description
(UM0892), section “Developing custom loaders for external memory” available at
www.st.com.
Caution: The tool chain/compiler used to generate the HEX/BIN file to program the Quad-SPI
memory must be exactly the same as the one used for the application development.
The dedicated Flash-memory loader has to be added to the ST-LINK utility in order to be
able to program a Quad-SPI Flash memory.
Figure 30. STM32 ST-LINK utility: adding Quad-SPI Flash memory loader
A window appears where the user should select their device. See an example below:
Figure 31. STM32 ST-LINK utility: selecting Quad-SPI Flash memory loader
Note: Only one external loader can be added, otherwise the error message in the figure below
appears. The user can remove one external loader and replace it with another one if
needed.
3. The following window appears, allowing the user to browse to the data file to be stored
in the Flash memory, which can be a binary file, a HEX file or a Motorola S-record file
(.srec or .s19).
Figure 34. STM32 ST-LINK utility: selecting HEX file for programming
Figure 36. Adding Quad-SPI Flash memory loader to Keil MDK-ARM project
The Quad-SPI Flash memory loader is then added in the following window.
Figure 37. Adding Quad-SPI Flash memory loader to Keil MDK-ARM project
In the following window, select from the list the corresponding Flash memory loader:
Once the corresponding Flash memory loader is added it appears in the programming
algorithm list as shown in the figure below.
Once the Quad-SPI Flash memory loader is added, programming can be done by clicking
on the Load button or pressing the F8 key on the keyboard.
Note: The region to be programmed is defined by default in the external loader and can be
changed by changing the start address and the size fields.
Either the CPU with interrupts or the DMA can be used for programming the Quad-SPI
memory, as follows.
1. Programming Quad-SPI memory using Indirect-write mode
When using Indirect-write mode, all programming operations are handled by software by
writing directly to the QUADSPI_DR register. An interrupt is generated when the transfer
completes or when the FIFO threshold is reached.
2. Programming Quad-SPI memory using Indirect-write mode with DMA
It is generally recommended to use DMA to program the Quad-SPI memory using Indirect-
write mode since it offloads the CPU. Nevertheless, the final recommendation depends on
the user application. In some cases, where the amount of data to be written to the memory
is relatively small, there is no need to use DMA. Once the DMA is configured and the
programming operation has started, no intervention from the CPU is needed and the
operation ends autonomously. For more details on DMA usage, refer to Section 3.5.2: DMA
usage.
3. Usage of Status-polling mode
The user can use this mode to poll the memory status register. The figure below shows an
example of a status-register reading sequence.
Most Flash memory devices support sector erasing and full-chip erasing operations,
and some of them support additional erasing operations offering more flexibility to user
applications. Refer to the manufacturer’s datasheet for more details on the supported
erasing operations.
Note: If the used memory size is larger than 16 Mbytes, the 4-byte address mode has to be used. In
this case the user should choose the 4-byte command from the memory datasheet.
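The 16-Mbyte limit in the note above comes from the 24-bit address range (2^24 bytes). A minimal sketch of the corresponding address-size choice is shown below; the function name is hypothetical, and it assumes the ADSIZE[1:0] encoding 00 = 8-bit, 01 = 16-bit, 10 = 24-bit and 11 = 32-bit address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: smallest ADSIZE[1:0] encoding able to address
 * every byte of a memory of the given size. */
static uint32_t qspi_adsize_for(uint64_t memory_bytes)
{
    assert(memory_bytes > 0U && memory_bytes <= (1ULL << 32));

    if (memory_bytes <= (1ULL << 8)) {
        return 0U;  /* 8-bit address  */
    }
    if (memory_bytes <= (1ULL << 16)) {
        return 1U;  /* 16-bit address */
    }
    if (memory_bytes <= (1ULL << 24)) {
        return 2U;  /* 24-bit address: up to 16 Mbytes */
    }
    return 3U;      /* 32-bit address: up to 4 Gbytes  */
}
```

A 16-Mbyte memory still fits in a 3-byte address (encoding 2), while a 32-Mbyte memory requires the 4-byte address mode (encoding 3) to reach its upper half.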
Sector-erase sequence
To erase a sector on the memory, a sector-erase command and a starting sector address
should be sent.
Example: to perform a sector-erase operation on the MICRON N25Q512A memory, the
QUADSPI_CCR register should be configured as below:
QUADSPI->CCR = 0x000025D8; /* Instruction = 0xD8; IMODE = 0x01; ADMODE = 0x01; ADSIZE = 0x02 */
QUADSPI->AR = 0x00000000;  /* Address 0x00000000 is sent to erase the first sector */
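To show where the value 0x000025D8 comes from, the sketch below assembles it from its fields. The macro and function names are hypothetical (they are not the CMSIS definitions); the bit positions assumed are INSTRUCTION[7:0], IMODE[9:8], ADMODE[11:10] and ADSIZE[13:12].

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions in QUADSPI_CCR (illustrative names). */
#define CCR_INSTRUCTION_POS 0U
#define CCR_IMODE_POS       8U
#define CCR_ADMODE_POS      10U
#define CCR_ADSIZE_POS      12U

/* Hypothetical helper: CCR value for a command with an address phase only
 * (no alternate bytes, no dummy cycles, no data), such as a sector erase. */
static uint32_t qspi_ccr_cmd_addr(uint32_t instruction, uint32_t imode,
                                  uint32_t admode, uint32_t adsize)
{
    assert(instruction <= 0xFFU);
    assert(imode <= 3U && admode <= 3U && adsize <= 3U);

    return (instruction << CCR_INSTRUCTION_POS) |
           (imode << CCR_IMODE_POS) |
           (admode << CCR_ADMODE_POS) |
           (adsize << CCR_ADSIZE_POS);
}
```

Calling qspi_ccr_cmd_addr(0xD8, 0x01, 0x01, 0x02) yields 0x000025D8: the instruction and the address are each sent on one line, with a 24-bit address.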
See below an example of a sector-erasing sequence:
This section provides some typical QUADSPI use case examples showing how to use the
interface in Indirect-mode, Status-flag polling mode and Memory-mapped mode.
Some of these examples are provided in the STM32Cube firmware package while others
are based on other application notes also available on the ST website. Some hardware
implementation examples are provided as well at the end of this section.
This section describes the following use cases:
• Memory-mapped mode: reading data in a graphical application
• Memory-mapped mode: executing code from the Quad-SPI Flash memory
• Indirect mode: storing data on-the-fly during a running application
• Indirect mode: erasing data
• Hardware implementation example
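[Figure: graphical application: the LTDC drives the LCD-TFT over RGB from a frame buffer in SDRAM (FMC), while the Chrom-ART (DMA2D) reads from the 16 MB Quad-SPI Flash through the QUADSPI. Block diagram not reproduced.]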
Figure 45. DMA2D reading images from Quad-SPI to build frame buffer content
[Block diagram not reproduced. Data paths highlighted in the figure:
• DMA2D transfers data from Quad-SPI memory to SDRAM.
• LTDC fetches data from the frame buffer to display it on the LCD.
• In parallel, the CPU fetches instructions from Flash memory.]
1 Mac Ethernet, LCD-TFT and DMA2D are not available on STM32F72xxx and STM32F73xxx devices.
[Figure: LTDC reading an image from Quad-SPI memory. Block diagram not reproduced.]
Once the QUADSPI is configured, to display the image stored in the Quad-SPI memory, the
following API is called in the LCDConf.c file: HAL_LTDC_ConfigLayer(&hltdc_F,
&pLayerCfg, 0).
pLayerCfg is the pointer to a LTDC_LayerCfgTypeDef structure that contains the address of
the image in the Quad-SPI memory.
The figure below highlights the two project configurations in Keil MDK-ARM that are
described in this document.
Figure 47. Project configurations: executing code from Quad-SPI Flash memory
Projects configuration
For both project configurations, 6_1-QuadSPI_rwRAM-DTCM and 6_2-QuadSPI_rwRAM-
DTCM, the user can change the desired QUADSPI settings in the Options for Target box as
shown in the next figure.
Note that the system clock during system initialization is 16 MHz, so at this
point QSPI_CLK = fAHB/1 = 16 MHz (by default the prescaler = 0). Once the system
initialization is done, the CPU jumps to the main function (in the arm_fft_bin_example_f32.c
file) where the system clock configuration is performed. The system clock is configured to
run at 216 MHz.
Both project configurations have the following QUADSPI settings:
• QSPI_CLOCK_PRESCALER = 3
• System clock is 216 MHz => QSPI_CLK = 54 MHz
• QSPI_DDRMODE => DDR mode enabled
• QSPI_INSTRUCTION_1_LINE => instruction is issued in one line
• QSPI_XIP_MODE => execute in place with SIOO enabled.
GPIOs configuration
As shown in the following figure, the Quad-SPI Flash memory is connected in Quad I/O
mode, so six GPIOs have to be configured for the Quad-SPI interface.
/* Connect PF6 and PF7 (AF9) and PF8 and PF9 (AF10) to the Quad-SPI alternate function */
/* (the CLK and nCS pins are not shown in this excerpt) */
GPIOF->AFR[0] |= 0x99000000;
GPIOF->AFR[1] |= 0x000000AA;
/* Configure PF6 to PF9 in alternate function mode */
GPIOF->MODER |= 0x000AA000;
/* Configure PF6 to PF9 speed to 100 MHz */
GPIOF->OSPEEDR |= 0x000FF000;
/* Configure PF6 to PF9 output type to push-pull, leaving the other pins unchanged */
GPIOF->OTYPER &= ~0x000003C0;
/* No pull-up, no pull-down for PF6 to PF9, leaving the other pins unchanged */
GPIOF->PUPDR &= ~0x000FF000;
[Figure: 6_1-Quad-SPI_rwRAM-DTCM project configuration: code and data in Quad-SPI memory. Block diagram not reproduced.
• Quad-SPI Flash: application’s code + constant data (code execution from QUADSPI with L1-cache)
• Flash memory: all remaining project’s code + data]
To place code and constant data in the Quad-SPI memory a dedicated load region has to be
created as shown in the following Keil MDK-ARM scatter file:
; *************************************************************
; *** Scatter-Loading Description File generated by uVision ***
; *************************************************************
LR_IROM1 0x00200000 0x00100000 { ; load region size_region
  ER_IROM1 0x00200000 0x00100000 { ; load address = execution address
    *.o (RESET, +First)
    *(InRoot$$Sections)
    ; Place all remaining code and const data in Flash TCM.
    .ANY (+RO)
  }
}
Code placed in Quad-SPI memory while constant data in Flash memory ITCM:
6_2-Quad-SPI_rwRAM-DTCM
In this project configuration, the application code is placed in the Quad-SPI memory while its
related constant data is placed in the Flash ITCM. The Cortex®-M7 has to fetch code from
the Quad-SPI memory and data from the Flash ITCM.
All remaining project code, such as the peripheral drivers and the vector table, is placed in the
Flash memory ITCM. The figure below describes the 6_2-Quad-SPI_rwRAM-DTCM project
configuration.
Figure: 6_2-Quad-SPI_rwRAM-DTCM project configuration (application code in Quad-SPI memory;
application constant data and all remaining project code and data in Flash memory ITCM)
To place code and constant data in the Quad-SPI memory a dedicated load region has to be
created as shown in the following Keil MDK-ARM scatter file:
; *************************************************************
; *** Scatter-Loading Description File generated by uVision ***
; *************************************************************
Performance analysis
The results were obtained on the STM32756G-EVAL board with the CPU running at 216 MHz,
VDD = 3.3 V and seven wait states for accesses to the internal Flash memory. The QUADSPI
is configured in DDR 1-4-4 mode with SIOO enabled and QSPI_CLK = 54 MHz.
The table below shows the results obtained for the FFT demonstration with MDK-ARM in each
configuration.
Feature configuration                                      Project configuration      CPU cycles(1)
-                                                          5-RAMITCM_rwRAM-DTCM       112428
I-cache + D-cache ON (constant data in Quad-SPI memory)    6_1-Quad-SPI_rwRAM-DTCM    171056
I-cache + ART + ART-PF ON (constant data in Flash TCM)     6_2-Quad-SPI_rwRAM-DTCM    126900
1. The number of cycles may change from one version of the toolchain to another.
Comparing case one and case two of the “6-Quad SPI_rwRAM-DTCM” configuration shows a
significant difference in performance, because the demonstration uses a large amount of
constant data.
• In case one (6_1-Quad SPI_rwRAM-DTCM), the read-only data and the instructions are
both located in the Quad-SPI Flash memory, so latency is added by the concurrent
instruction fetches and read-only data accesses on the Quad-SPI interface.
• In case two (6_2-Quad SPI_rwRAM-DTCM), the read-only data and the code are
separated. The read-only data is located in Flash-TCM, so this concurrency is avoided:
the CPU can fetch instructions over AXI while the data is loaded from TCM at the same
time. This is why the second case performs clearly better than the first one.
Comparing the 6_2-Quad-SPI_rwRAM-DTCM case with the 5-RAMITCM_rwRAM-DTCM case
(which gives the best performance, at 112428 CPU cycles, as per the AN4667 document)
shows that the two are close in terms of performance.
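To put numbers on this comparison, the cycle counts reported above can be expressed relative to the 5-RAMITCM_rwRAM-DTCM baseline. The sketch below is plain integer arithmetic on those three measurements; the helper names are hypothetical. Converting cycles to time assumes that the CPU runs at a constant 216 MHz for the whole run.

```c
#include <assert.h>
#include <stdint.h>

/* Extra cycles relative to a baseline, in tenths of a percent (rounded down). */
static uint32_t overhead_permille(uint32_t cycles, uint32_t baseline)
{
    assert(baseline != 0u && cycles >= baseline);
    return (uint32_t)(((uint64_t)(cycles - baseline) * 1000u) / baseline);
}

/* Execution time in microseconds for a CPU clock given in MHz (rounded down). */
static uint32_t cycles_to_us(uint32_t cycles, uint32_t cpu_mhz)
{
    assert(cpu_mhz != 0u);
    return cycles / cpu_mhz;
}
```

With the table values, case 6_1 (171056 cycles) costs about 52 % more cycles than the baseline (112428 cycles), whereas case 6_2 (126900 cycles) costs about 13 % more. At 216 MHz this corresponds to roughly 791 µs and 587 µs against 520 µs.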
This example shows how important it is to take advantage of the STM32F7x5/F7x6 smart
architecture in order to improve the execution performance from the external Quad-SPI
memory. For more details on how to improve the execution performance from Quad-SPI
memory, refer to Section 7.1: How to get the best performances.
As described in the figure below, the DMA reads the data “aTxBuffer” from the SRAM and
writes it to the Quad-SPI memory. Meanwhile, the CPU can execute code from the
internal Flash memory.
Figure 52. Indirect write mode: programming Quad-SPI memory using DMA
Figure 53. Indirect write mode: programming Quad-SPI memory using interrupt
STM32L476G-EVAL board
The figure below shows an example of how to connect a MICRON Quad-SPI Flash memory in
Quad I/O mode on the STM32L476G-EVAL evaluation board.
This section presents some recommendations on how to improve performance and how to
decrease power consumption for applications using the Quad-SPI interface.
• Use SIOO feature (Continuous read mode) for random and non-sequential accesses
For random and non-sequential accesses, the command overhead increases. As described
in Figure 13, a command and an address are sent to the memory for every new read sequence.
In this case, the user should enable the SIOO feature in order to reduce the command
overhead (see Section 3.4.1: Send instruction only-once (SIOO)).
Note: Not all read commands support the Continuous read mode (Enhance performance
mode), so the user should take this into account when selecting the read command.
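The saving brought by SIOO can be estimated by counting the clock cycles spent before the first data bit of each read. The sketch below is a rough model under stated assumptions: a 1-4-4 read, an 8-bit instruction on one line that is always sent in single data rate, address and alternate byte on four lines, and a dummy-cycle count that depends on the selected memory and read command. The function names and the example values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Clock cycles needed to shift 'bits' over 'lines' data lines.
 * In DDR mode two bits per line are transferred on every clock cycle. */
static uint32_t phase_cycles(uint32_t bits, uint32_t lines, int ddr)
{
    uint32_t bits_per_cycle = lines * (ddr ? 2u : 1u);
    assert(lines == 1u || lines == 2u || lines == 4u);
    return (bits + bits_per_cycle - 1u) / bits_per_cycle;
}

/* Command overhead of a 1-4-4 read, in QSPI_CLK cycles.
 * send_instruction is 0 for the reads that follow the first one
 * when SIOO is enabled. */
static uint32_t read_overhead_cycles(uint32_t addr_bits, uint32_t alt_bits,
                                     uint32_t dummy_cycles, int ddr,
                                     int send_instruction)
{
    uint32_t cycles = send_instruction ? phase_cycles(8u, 1u, 0) : 0u;
    cycles += phase_cycles(addr_bits, 4u, ddr);
    cycles += phase_cycles(alt_bits, 4u, ddr);
    return cycles + dummy_cycles;
}
```

For instance, with a 24-bit address, one alternate byte and four dummy cycles in SDR mode, this model gives 20 cycles of overhead per read without SIOO and 12 cycles for each read after the first one with SIOO.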
Execution performance
To improve the execution performance from the Quad-SPI Flash memory, the user should
follow the previously described read performance recommendations.
Executing from the Quad-SPI memory is generally characterized by its random and non-
sequential accesses. As already mentioned, an important recommendation to boost
execution performance is enabling the SIOO feature.
As seen in Figure 50: 6_1-Quad-SPI_rwRAM-DTCM project configuration: code and data in
Quad-SPI memory, placing both the code and the read-only data in the Quad-SPI memory
leads to concurrency on the Quad-SPI interface during execution.
To avoid this concurrency, the user can separate the read-only data from the code.
For example, the read-only data can be located in the Quad-SPI memory while the code is
located in the Flash-TCM. The read-only data accesses and the instruction fetches then no
longer compete for the same interface: the CPU can fetch the instructions from Flash-TCM
while the data is loaded from the Quad-SPI memory at the same time.
When the application contains a large amount of constant data, the user can place the
constants and the code in dedicated sections. If the code section fits in the internal Flash
memory, the code can be loaded in the internal Flash memory while the constants are loaded
in the Quad-SPI memory.
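One possible way to put constants in a dedicated section is a section attribute on the constant tables, which the scatter file or linker script can then map to the Quad-SPI memory-mapped region. The sketch below is an illustration only. The section name ".qspi_rodata", the table and the function are hypothetical, and the mapping of that section to Quad-SPI memory has to be provided separately by the linker configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ".qspi_rodata" is a hypothetical section name. On a non-Arm host build
 * the attribute is dropped so that the example still compiles and runs. */
#if defined(__arm__)
#define QSPI_RODATA __attribute__((section(".qspi_rodata")))
#else
#define QSPI_RODATA
#endif

/* Large constant tables go to the dedicated section. */
static const uint16_t square_lut[8] QSPI_RODATA = {
    0u, 1u, 4u, 9u, 16u, 25u, 36u, 49u
};

/* The code stays in the default code section (internal Flash memory). */
uint16_t square_lookup(size_t index)
{
    assert(index < 8u);
    return square_lut[index];
}
```

The application code is unchanged by this placement: square_lookup(3) returns 9 wherever the table is located.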
For the STM32F7x5/F7x6, it is recommended to enable the Cortex®-M7 L1-Cache.
General recommendations
• Use DMA for data transfers in order to offload the CPU.
• Use Flag-status polling mode rather than software flag checking.
• Use Memory-mapped mode to permit any AHB master to access the Quad-SPI
memory without CPU intervention.
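In Flag-status polling mode the interface itself reads the memory status register periodically and compares it with an expected value, so the CPU does not need to loop on the flag. The following host-side model sketches that comparison under stated assumptions: a mask and a match value select the bits of interest, and the match is either an AND of all unmasked bits or an OR of any of them. The function names are hypothetical. The busy flag in bit 0 of the status register is typical of serial Flash memories but must be checked in the memory datasheet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the polling comparison: 'mask' selects the status bits to check,
 * 'match' holds their expected values. AND mode requires every selected bit
 * to match, OR mode requires at least one. */
static bool poll_match(uint32_t status, uint32_t mask, uint32_t match, bool or_mode)
{
    uint32_t equal_bits = ~(status ^ match) & mask;
    return or_mode ? (equal_bits != 0u) : (equal_bits == mask);
}

/* Usage example: wait until an assumed busy flag (bit 0) is cleared. */
static void poll_match_example(void)
{
    assert(!poll_match(0x03u, 0x01u, 0x00u, false)); /* busy flag set: keep polling */
    assert(poll_match(0x02u, 0x01u, 0x00u, false));  /* busy flag cleared: match */
}
```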
To exit DPD mode, the RELEASE FROM DEEP POWERDOWN command (0xAB) should
be sent. The figure below shows the RDP sequence:
Figure 57. Release from deep power-down (RDP) sequence (command AB)
Note: Not all serial Flash memories support the Deep power-down mode. If the selected
external serial memory does not support it, the STM32 can control an external power
switch through a GPIO to remove the power supply of the external Quad-SPI Flash memory
and eliminate its current consumption.
8 Supported devices
The STM32 Quad-SPI interface has a very flexible frame format that permits the following:
• Send up to five phases: instruction – address – alternate byte – dummy – data
• Skip any phase
• Send each phase in one, two or four lines
• Send address in one, two, three or four bytes
• Send one, two, three or four alternate bytes
• Send up to 31 dummy clock cycles.
In addition, STM32 Quad-SPI interface permits sending any command, so the user can
program the desired command in the QUADSPI_CCR register in the INSTRUCTION[7:0]
field.
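The frame format described above maps onto the fields of the QUADSPI_CCR register. The sketch below packs a frame description into a CCR value on the host. The bit positions are an assumption based on the STM32F7 reference manual layout and must be checked against the reference manual of the target device. The structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Line-mode fields: 0 = phase skipped, 1 = one line, 2 = two lines, 3 = four lines. */
typedef struct {
    uint8_t instruction;   /* INSTRUCTION[7:0] */
    uint8_t imode;         /* instruction line mode */
    uint8_t admode;        /* address line mode */
    uint8_t adsize;        /* number of address bytes minus one (0..3) */
    uint8_t abmode;        /* alternate-byte line mode */
    uint8_t absize;        /* number of alternate bytes minus one (0..3) */
    uint8_t dummy_cycles;  /* 0..31 */
    uint8_t dmode;         /* data line mode */
    uint8_t fmode;         /* 0 indirect write, 1 indirect read, 2 polling, 3 memory-mapped */
    uint8_t sioo;          /* 1 = send instruction only once */
    uint8_t ddrm;          /* 1 = DDR mode */
} qspi_frame_t;

/* Pack the frame into a CCR value (assumed bit positions, see lead-in). */
static uint32_t qspi_ccr_value(const qspi_frame_t *f)
{
    assert(f->imode <= 3u && f->admode <= 3u && f->abmode <= 3u && f->dmode <= 3u);
    assert(f->adsize <= 3u && f->absize <= 3u && f->fmode <= 3u);
    assert(f->dummy_cycles <= 31u && f->sioo <= 1u && f->ddrm <= 1u);
    return (uint32_t)f->instruction
         | ((uint32_t)f->imode        << 8)
         | ((uint32_t)f->admode       << 10)
         | ((uint32_t)f->adsize       << 12)
         | ((uint32_t)f->abmode       << 14)
         | ((uint32_t)f->absize       << 16)
         | ((uint32_t)f->dummy_cycles << 18)
         | ((uint32_t)f->dmode        << 24)
         | ((uint32_t)f->fmode        << 26)
         | ((uint32_t)f->sioo         << 28)
         | ((uint32_t)f->ddrm         << 31);
}
```

An instruction-only frame, such as the 0xAB command sent on one line, uses only the INSTRUCTION and instruction-mode fields and packs to 0x000001AB under this layout. All other phases are skipped by leaving their mode fields at 0.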
The STM32 Quad-SPI interface is fully configurable in terms of frame format and hardware
and it supports most Quad-SPI memories on the market.
There are several suppliers of QUADSPI compatible memories, such as Winbond,
Spansion, Macronix, MICRON (Numonyx), Microchip (SST) and others.
9 Conclusion
The STM32 devices provide a very flexible and useful Quad-SPI interface, which fits
memory-hungry applications at a lower development cost. The QUADSPI avoids the
design complexity of external parallel Flash memories by reducing the pin count while
offering better performance. This application note demonstrates the performance and
flexibility of the STM32 Quad-SPI interface, which allow lower development costs and a
faster time to market.
10 Revision history