Skip to content

rp2: RP235: PSRAM support #15620

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Conversation

Gadgetoid
Copy link
Contributor

@Gadgetoid Gadgetoid commented Aug 8, 2024

Depends upon RP2350 support: #15619

This is a rough draft gathering our PSRAM support code - with a dash of SparkFun's - and attempting to get it into the right shape for mainline.

TODO:

  • Potential: Remove init from main.c and add bindings to modrp2 so PSRAM can be enabled (or not) in boot.py
  • Potential: Support for PSRAM as either extending the GC heap (needs split heap) or storage/tmpfs!?
  • Clean up auto detect code and dispel magic numbers (Note: Have asked for SDK support)
  • Test test test test!

Known issues:

  • Possibly crashy, need a good test to reliably test the extent of the PSRAM
  • Needs to be reconfigured when machine.freq() is called, since a clock change breaks PSRAM timings
  • Currently no way to specify where MicroPython should store a particular object
  • Might break USB on startup or otherwise fail to start up
  • Support for PSRAM opens up RP2 to the class of bugs and pitfalls addressed here: ports/stm32: Add memory map with attributes for different MCU families. #16644

Improvements:

We're (Pimoroni) currently using yet more customisations on top of this PSRAM code for Presto, which stores the entire display buffer in SRAM and leaves basically none for others buffers. See: pimoroni@888f52f

This is a bit rough for inclusion into MicroPython - I always wince at a near duplicate linker script - but incredibly useful nonetheless.

Some kind of bindings for PSRAM rp2.PSRAM() in modrp2 might be nice, too. This would allow us to - theoretically - add a portion of PSRAM to the gc_heap and reserve the remainder for RAMFS, long-lived buffers, custom memory pools and other such nonsense. I did contemplate adding MICROPY_HW_PSRAM_RESERVE_BYTES for this, which would reserve N bytes from the detected PSRAM size for tomfoolery and shenanigans.

Copy link

codecov bot commented Aug 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.57%. Comparing base (17d8234) to head (42035d7).

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #15620   +/-   ##
=======================================
  Coverage   98.57%   98.57%           
=======================================
  Files         164      164           
  Lines       21336    21336           
=======================================
  Hits        21031    21031           
  Misses        305      305           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

github-actions bot commented Aug 8, 2024

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +0 +0.000% PYBV10
     mimxrt:    +0 +0.000% TEENSY40
        rp2:  +224 +0.024% RPI_PICO_W
       samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:    +0 +0.000% VIRT_RV32

@Gadgetoid Gadgetoid changed the title rp2: Add PSRAM support for new RP2350 MCU rp2: RP235: PSRAM support Aug 9, 2024
@Gadgetoid
Copy link
Contributor Author

I've retitled this PR to avoid it being confused with the RP2350 MCU support one 😆

@Gadgetoid
Copy link
Contributor Author

I seem to be having trouble with PSRAM when using split heap, a crude test which uses RAM until a garbage collection is forced will crash/hang on that collection when using split heap, but seems to work fine using exclusively PSRAM.

I'm going to refactor this code slightly to make split heap an option, rather than a requirement.

Copy link
Contributor

@MichaelBell MichaelBell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of small remarks (about my code!) while I'm on a train.

// - Max select assumes a sys clock speed >= 120MHz
// - Min deselect assumes a sys clock speed <= 138MHz
// - Clkdiv of 1 is OK up to 133MHz.
qmi_hw->m[1].timing = 1 << QMI_M1_TIMING_COOLDOWN_LSB |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This else branch could use a lower max select time, but given we're reprogramming this on every clock frequency change, let's just get it right for whatever the frequency is - I'll send a PR for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we ever correct this? I've lost the plot a little!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Gadgetoid Gadgetoid force-pushed the feature/psram branch 4 times, most recently from a732af7 to 7f1fcc7 Compare August 15, 2024 16:02
@Gadgetoid Gadgetoid force-pushed the feature/psram branch 2 times, most recently from 56abd12 to 870d858 Compare August 22, 2024 16:07
@Gadgetoid Gadgetoid force-pushed the feature/psram branch 3 times, most recently from 2fae3a9 to 3cd6781 Compare September 5, 2024 08:24
@Gadgetoid Gadgetoid force-pushed the feature/psram branch 5 times, most recently from e3ba430 to c98c139 Compare September 27, 2024 15:07
@Gadgetoid Gadgetoid force-pushed the feature/psram branch 3 times, most recently from 55880bf to 1585553 Compare October 23, 2024 10:01
@Gadgetoid
Copy link
Contributor Author

we've been making our own releases with the commits from this PR

Likewise! Let me know if you have any issues with the latest very aggressive rebase/refactor. A git diff suggested I didn’t mess anything up (outside perhaps of grouping the changes into the wrong commits) but the flash copy buffer is now 256 bytes instead of 4K.

I ran the test suite and flash benchmarks so I’m pretty sure it at least works. But I’m mostly building from feature branches mixing PSRAM and other things so I won’t be cutting releases from this branch until I rebase those.

@projectgus
Copy link
Contributor

We already have submitted multiple board definitions that make use of the existing PSRAM implementation:

Ah that's great, thanks @sfe-SparkFro. It hadn't dawned on me that the board definitions would be added already with the PSRAM support sitting dormant. 👍

I think we should probably build one of the PSRAM boards in CI to avoid surprise bitrot, but we can add that in a follow-up PR.

@projectgus
Copy link
Contributor

(Note to self: a graph would be nice)

Thanks @Gadgetoid. I dropped these into text files and ran the comparison pass (which isn't as nice as a graph, but it's something):

❯ ./run-perfbench.py -s ./pico2w.txt ./pico2wp_psram.txt 
diff of scores (higher is better)
N=133 M=100                ./pico2w.txt -> ./pico2w_psram.txt         diff      diff% (error%)
bm_chaos.py                    324.72 ->     213.81 :    -110.91 = -34.156% (+/-0.05%)
bm_fannkuch.py                  58.27 ->      72.60 :     +14.33 = +24.592% (+/-0.10%)
bm_fft.py                     2356.17 ->    3056.05 :    +699.88 = +29.704% (+/-0.03%)
bm_float.py                   4833.98 ->    4464.70 :    -369.28 =  -7.639% (+/-0.12%)
bm_hexiom.py                    43.40 ->      41.09 :      -2.31 =  -5.323% (+/-0.07%)
bm_nqueens.py                 2704.03 ->    2994.93 :    +290.90 = +10.758% (+/-0.06%)
bm_pidigits.py                 808.87 ->     572.44 :    -236.43 = -29.230% (+/-0.04%)
bm_wordcount.py                 72.39 ->      47.99 :     -24.40 = -33.706% (+/-0.03%)
core_import_mpy_multi.py       420.04 ->     376.57 :     -43.47 = -10.349% (+/-0.09%)
core_import_mpy_single.py       82.98 ->      75.29 :      -7.69 =  -9.267% (+/-0.49%)
core_locals.py                  59.06 ->      62.11 :      +3.05 =  +5.164% (+/-0.03%)
core_qstr.py                   152.09 ->     177.41 :     +25.32 = +16.648% (+/-0.12%)
core_str.py                     28.64 ->      26.87 :      -1.77 =  -6.180% (+/-0.06%)
core_yield_from.py             402.00 ->     385.85 :     -16.15 =  -4.017% (+/-0.04%)
misc_aes.py                    437.18 ->     548.87 :    +111.69 = +25.548% (+/-0.07%)
misc_mandel.py                2435.39 ->    3365.37 :    +929.98 = +38.186% (+/-0.07%)
misc_pystone.py               2069.15 ->    1932.38 :    -136.77 =  -6.610% (+/-0.05%)
misc_raytrace.py               345.46 ->     278.90 :     -66.56 = -19.267% (+/-0.04%)
viper_call0.py                 574.36 ->     569.95 :      -4.41 =  -0.768% (+/-0.03%)
viper_call1a.py                560.19 ->     556.04 :      -4.15 =  -0.741% (+/-0.02%)
viper_call1b.py                457.61 ->     454.76 :      -2.85 =  -0.623% (+/-0.03%)
viper_call1c.py                464.69 ->     461.78 :      -2.91 =  -0.626% (+/-0.02%)
viper_call2a.py                549.96 ->     545.87 :      -4.09 =  -0.744% (+/-0.02%)
viper_call2b.py                405.63 ->     403.41 :      -2.22 =  -0.547% (+/-0.02%)

The numbers are all over the place, which is to be expected with flash caching differences - just a different binary layout can be more or less cache-friendly for a particular build, and the cache behaviour is probably totally different with PSRAM. However, I think it's encouraging that nothing is massively slower with PSRAM enabled and there is a reasonable mix of faster and slower.

@projectgus
Copy link
Contributor

I seem to have hit some code formatting edge case where I vehemently disagree with its changes.

This is definitely one of the weirder set of uncrustify formatting choices that I've seen it make. 😆

I've pushed a fixup commit here that seems to keep it happy:
pimoroni/micropython@feature/psram...projectgus:micropython:feature/psram

Feel free to cherry-pick and squash into this PR.

@scruss
Copy link
Contributor

scruss commented Mar 25, 2025

I think we should probably build one of the PSRAM boards in CI to avoid surprise bitrot, but we can add that in a follow-up PR.

It's also in @mattytrentini's board def for the WeAct Studio RP2350B in the main repo:
micropython/ports/rp2/boards/WEACTSTUDIO_RP2350B_CORE/mpconfigboard.h:

// TODO: Split PSRAM option off as a variant
#define MICROPY_HW_PSRAM_CS_PIN (0)
#define MICROPY_HW_ENABLE_PSRAM (0)

@Gadgetoid Gadgetoid force-pushed the feature/psram branch 2 times, most recently from c8eb6c5 to eb9e0b3 Compare March 26, 2025 09:32
@Gadgetoid
Copy link
Contributor Author

WeAct Studio RP2350B

This could be a good candidate for covering a couple of bases at once. (B variant and PSRAM)

Feel free to cherry-pick and squash into this PR.

Done, thank you!

I have added COPY_BUFFER_SIZE_BYTES and set it to FLASH_PAGE_SIZE for now, making the copy buffer code a little more explicit about what it's doing and a little easier to change.

@dpgeorge
Copy link
Member

  • Potential: Support for PSRAM as either extending the GC heap (needs split heap) or storage/tmpfs!?
    ...
    Some kind of bindings for PSRAM rp2.PSRAM() in modrp2 might be nice, too. This would allow us to - theoretically - add a portion of PSRAM to the gc_heap and reserve the remainder for RAMFS, long-lived buffers, custom memory pools and other such nonsense. I did contemplate adding MICROPY_HW_PSRAM_RESERVE_BYTES for this, which would reserve N bytes from the detected PSRAM size for tomfoolery and shenanigans.

Yes, I agree that it would be good to give more control to the user as to how they use PSRAM. There are lots of possibilities here about how the API could look and what functionality it would have. And that could also be implemented on esp32 and other ports that may have large external (and slower) RAM.

But in the interests of getting this PR across the line and merged, I think what you have here is pretty good: just allowing a board to enable PSRAM or not, and using SRAM+PSRAM or just PSRAM for the GC heap. That's nice and simple, for now.

The only downside of doing it that way now is that if we change it in the future, it'll be a backwards incompatible change (because most likely we will change it so there is less RAM available by default on start up).

@Gadgetoid
Copy link
Contributor Author

Gadgetoid commented Mar 26, 2025

The only downside of doing it that way now is that if we change it in the future, it'll be a backwards incompatible change (because most likely we will change it so there is less RAM available by default on start up).

As someone relying heavily on this code in shipping products- that ship has already sailed 😆 at least for some of us, anyway. I think it could be reasoned that it'll be mostly vendors- not end users- who will see the fallout from such a compatibility break? Gotta keep us on our toes!

@Gadgetoid Gadgetoid force-pushed the feature/psram branch 6 times, most recently from 7441621 to bc2fb20 Compare March 28, 2025 12:57
@dpgeorge
Copy link
Member

dpgeorge commented Apr 7, 2025

I tested this on a Pimoroni Pico Plus2 RP2350 (thanks @Gadgetoid for it!), which has 8MiByte PSRAM.

I ran the standard MP test suite. And also an extensive RAM test that I have which does 8-, 16- and 32-bit write and then verification (using a PRNG), in sequential forward and sequential reverse mode (the entire RAM) and also in random access mode.

All tests passed.

Gadgetoid and others added 4 commits April 7, 2025 15:27
Add PSRAM support with auto detection.

Performs a best-effort attempt to detect attached PSRAM,
configure it and *add* it to the MicroPython heap.

If PSRAM is not present, should fall back to use internal
RAM.

Introduce two new port/board defines:

* MICROPY_HW_ENABLE_PSRAM to enable PSRAM.
* MICROPY_HW_PSRAM_CS_PIN to define the chip-select pin.

Changes:

ports/rp2/rp2_psram.c/h: Add new PSRAM module.
ports/rp2/main.c: Add optional PSRAM support.
ports/rp2/CMakeLists.txt: Include rp2_psram.c.
ports/rp2/mpconfigport.h: Add MICROPY_HW_ENABLE_PSRAM.
ports/rp2/modmachine.c: Reconfigure PSRAM on freq change.

Co-authored-by: Kirk Benell <kirk.benell@sparkfun.com>
Co-authored-by: Mike Bell <mike@mercuna.com>
Signed-off-by: Phil Howard <phil@gadgetoid.com>
PSRAM will be used exclusively if MICROPY_GC_SPLIT_HEAP == 0,
it will be added to RAM if MICROPY_GC_SPLIT_HEAP == 1,
and the system will fall back to RAM only if it's not detected.

Due to the size of PSRAM, GC stack was overflowing and causing
the GC to scan through the entire memory pool.

This caused noticable slowdowns during GC. Increase the stack
from 256 to 4096 bytes to avoid overflow and increase the
stack entry type size to accomodate 8MB+ PSRAM.

Changes:

ports/rp2/mpconfigport.h: Make split-heap optional and enable by default.
			  Increase GC stack entry type to uint32_t.
			  Raise GC stack size.

Co-authored-by: Kirk Benell <kirk.benell@sparkfun.com>
Signed-off-by: Phil Howard <github@gadgetoid.com>
Add a 256 byte (FLASH_PAGE_SIZE) SRAM copy buffer to allow copies
from PSRAM to flash.

This would otherwise hardfault since PSRAM is disabled to write flash.

Changes:

ports/rp2/rp2_flash.c: Add 256 byte (flash page size) SRAM copy buffer
                       for PSRAM to flash copies.
                       Invalidate the XIP cache to purge any PSRAM
                       data before critical flash operations.

Co-authored-by: Phil Howard <github@gadgetoid.com>
Co-authored-by: Angus Gratton <angus@redyak.com.au>
Signed-off-by: Phil Howard <github@gadgetoid.com>
Configure flash timings dynamically to match the system clock.

Reconfigure timings after flash writes.

Changes:

ports/rp2/main.c: Set default flash timings.
ports/rp2/modmachine.c: Configure optimal flash timings on freq change.
ports/rp2/rp2_flash.c: Reconfigure flash when leaving critical section.

Signed-off-by: Phil Howard <github@gadgetoid.com>
@dpgeorge
Copy link
Member

dpgeorge commented Apr 8, 2025

Merged in 11f057d through 91cff8e

Thanks for everyone's effort here, this is a good feature!

@dpgeorge dpgeorge closed this Apr 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants