
ESP32-S3 only sees approx. half the SPI-RAM in latest (ESP-IDF v5.x) builds #11853


Closed
ma261065 opened this issue Jun 23, 2023 · 23 comments · Fixed by #12141
Comments

@ma261065

I have an ESP32-S3 N16R8 device (i.e. 8MB SPI-RAM and 16MB Flash)

If I flash this build https://www.micropython.org/resources/firmware/GENERIC_S3_SPIRAM_OCT-20230623-unstable-v1.20.0-243-gd93316fe3.bin and then check free memory with gc.mem_free() I see 4254384

If I flash with this (older) build https://www.micropython.org/resources/firmware/GENERIC_S3_SPIRAM_OCT-20230621-unstable-v1.20.0-230-g41c91422f.bin and then check free memory with gc.mem_free() I see 8192976

I believe that the major difference between these builds is that the later one is built against ESP-IDF v 5.x
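
A minimal sketch for reproducing the comparison above. `gc.mem_free()` and `gc.mem_alloc()` exist only on MicroPython, so the helper returns `None` on other interpreters; the helper name `heap_report` is made up for illustration.

```python
import gc

def heap_report():
    """Return free/used/total GC heap bytes, or None outside MicroPython."""
    mem_free = getattr(gc, "mem_free", None)    # MicroPython only
    mem_alloc = getattr(gc, "mem_alloc", None)  # MicroPython only
    if mem_free is None or mem_alloc is None:
        return None
    gc.collect()  # collect garbage first so readings are comparable
    free = mem_free()
    used = mem_alloc()
    return {"free": free, "used": used, "total": free + used}

# Figures reported in this issue for the same N16R8 board.
NEW_BUILD_FREE = 4254384  # v1.20.0-243 build (believed to use ESP-IDF 5.x)
OLD_BUILD_FREE = 8192976  # v1.20.0-230 build

ratio = NEW_BUILD_FREE / OLD_BUILD_FREE
print(heap_report())
print("new/old free heap: %.3f" % ratio)
```

The ratio comes out slightly above one half, which matches the "approx. half" in the issue title.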

@ma261065 ma261065 added the bug label Jun 23, 2023
@VynDragon
Contributor

Can confirm the same issue here. It shows about 4 MB on my ESP32S3R8 with MicroPython master + ESP-IDF 5.1, and 8 MB when switching back to 1.20.0 and ESP-IDF 4.4.

@jimmo
Member

jimmo commented Jul 3, 2023

This is intentional (although obviously not desirable) and changed due to API differences in IDF 5.x. Is this causing problems for your application (i.e. are you using large frame buffers?).

We are planning an update to the way SPIRAM is managed on ESP32 and planning to make it grow-on-demand. Having it use the entire SPIRAM means that there is nothing left for the IDF heap (e.g. for SSL buffers) and also that the GC has to work a lot harder every collection.

@ma261065
Author

ma261065 commented Jul 3, 2023

Thanks for the answer Jim.

> Is this causing problems for your application (i.e. are you using large frame buffers?)

I think it will cause a problem, as I'm allocating large buffers for audio playback.

Given the upcoming change, do you know if there is an ETA for the update to allow the SPIRAM to auto-expand? Overall it sounds like a good solution for chips with larger memory.

> Having it use the entire SPIRAM means that there is nothing left for the IDF heap

Was this not a problem pre ESP-IDF 5.x on these devices? Why does it have to use all the SPI-RAM, and not just whatever is left over after some portion is reserved for the heap?

Could you please clarify - does this mean that currently there is no way to use approximately 4MB of the memory on a device with 8 MB of SPIRAM (until the auto-expand update happens)? What about from user C modules? Will malloc's there use the otherwise unusable memory?

@VynDragon
Contributor

As long as it's configurable it won't cause any issues for me.

@mattytrentini mattytrentini added this to the release-1.21.0 milestone Jul 11, 2023
@nspsck
Contributor

nspsck commented Jul 23, 2023

Hi, the issue is mainly caused by this piece of code, where the total heap will be cut in half. You can delete that / 2 and your code will probably work again like before.

Edit: I tried this... and it causes some random problems, maybe you will have luck.

@ma261065
Author

ma261065 commented Jul 23, 2023

@nspsck thanks for tracking that down.

From reading the code I think the change is caused by a desire to simplify the task heap allocation. There is a whole code block that was deleted prior to that line which used to make a bunch of decisions based on the size of available SPIRAM. Now the logic seems to be that we get half the RAM at most for the heap, regardless, which prior to this change used to only be the case for chips without SPIRAM. The side effect of this is that we are basically throwing away a big chunk of the memory on the chips with larger SPIRAM amounts.
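
A rough Python model of the sizing rule described above, based on the `MIN(heap_caps_get_largest_free_block(caps), heap_total / 2)` expression quoted later in this thread. The function name and the input figures are illustrative assumptions, not values read from a device.

```python
def mp_task_heap_size(largest_free_block, heap_total):
    """Model of MIN(largest_free_block, heap_total / 2) from the port's main.c."""
    return min(largest_free_block, heap_total // 2)

# Illustrative inputs only (assumed, not measured):
# a board with 8 MiB of SPIRAM where almost all of it is one free block.
with_spiram = mp_task_heap_size(8300000, 8388608)

# A board without SPIRAM, where the largest free block is the limiting factor.
without_spiram = mp_task_heap_size(100000, 320000)

print(with_spiram)     # capped at half of heap_total
print(without_spiram)  # capped by the largest free block
```

With a large SPIRAM the `heap_total / 2` term always wins, which is why roughly half of the memory ends up outside the MicroPython heap.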

Given that this issue is tagged for release 1.21, I presume that there will be some changes coming in this logic soon, probably incorporating some of the pending pull requests that deal with Garbage Collector optimisation.

Let's wait and see, as I imagine that the maintainers are flat-out-busy right now working through the side-effects of the big change to ESP-IDF 5.x. And for their hard work we are eternally grateful.

@shariltumin

$ diff main.c main.c-ORIG
109,110c109
< // void *mp_task_heap = heap_caps_malloc(mp_task_heap_size, caps);
< void *mp_task_heap = heap_caps_malloc(2000*1024, caps);
---
> void *mp_task_heap = heap_caps_malloc(mp_task_heap_size, caps);

I hardcoded the MP heap to 2000*1024 = 2048000 bytes, so we should get an MP heap of about 2000KB.

>>>
>>> gc.mem_free()
4273360

Still gc.mem_free() shows about half of the 8MB PSRAM free.

>>> esp32.idf_heap_info(esp32.HEAP_DATA)
[(32767, 32031, 31744, 32015), (8388608, 6338304, 6291456, 6338304), (265392, 230676, 229376, 230676), (22308, 64, 56, 28), (32768, 16864, 13312, 13496), (8132, 7752, 7680, 7752)]
>>>

But esp32.idf_heap_info() reports 'correctly': in (8388608, 6338304, 6291456, 6338304), the free figure of 6338304 is about 6MB, as expected with the MP heap set to 2000KB.
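
To make that reading explicit, here is a small sketch that unpacks the tuples pasted above. It assumes the documented field order of `esp32.idf_heap_info()` (total bytes, free bytes, largest free block, minimum free seen over time); the helper name `summarize` is made up for illustration.

```python
# Output of esp32.idf_heap_info(esp32.HEAP_DATA), copied from the REPL session above.
HEAP_INFO = [
    (32767, 32031, 31744, 32015),
    (8388608, 6338304, 6291456, 6338304),
    (265392, 230676, 229376, 230676),
    (22308, 64, 56, 28),
    (32768, 16864, 13312, 13496),
    (8132, 7752, 7680, 7752),
]

def summarize(regions):
    """Turn each (total, free, largest_free, min_free) tuple into a dict."""
    rows = []
    for total, free, largest_free, min_free in regions:
        rows.append({
            "total": total,
            "free": free,
            "allocated": total - free,
            "largest_free": largest_free,
            "min_free": min_free,
        })
    return rows

# The SPIRAM region is the one with the largest total size.
spiram = max(summarize(HEAP_INFO), key=lambda r: r["total"])
requested = 2000 * 1024  # size passed to heap_caps_malloc() in the diff

print("allocated in SPIRAM:", spiram["allocated"])
print("beyond the requested block:", spiram["allocated"] - requested)
```

The allocated figure sits just above the requested 2048000 bytes, so the IDF side really did hand out a block of about 2000KB.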

gc.mem_free() uses py/gc.c gc_info(), maybe gc.c is broken?

>>> import micropython as mp
>>> mp.mem_info()
stack: 704 out of 15360
GC: total: 4274752, used: 2688, free: 4272064
 No. of 1-blocks: 52, 2-blocks: 14, max blk sz: 18, max free sz: 266968

mem_info() also reports a total of 4274752, but it should be 2048000.

@pillo79
Contributor

pillo79 commented Jul 25, 2023

$ diff main.c main.c-ORIG
109,110c109
< // void *mp_task_heap = heap_caps_malloc(mp_task_heap_size, caps);
< void *mp_task_heap = heap_caps_malloc(2000*1024, caps);
---
> void *mp_task_heap = heap_caps_malloc(mp_task_heap_size, caps);

@shariltumin this only affects the memory allocation on the OS side, not its use by MP. The GC is still initialized a few lines after that with the value in mp_task_heap_size, causing lots of trouble 😉
To have consistent results, you can instead force the value of that variable to your desired number (leave some free room for OS needs though!).
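
A small sketch of the mismatch described here, using the figures from the earlier REPL session. It treats the `GC: total` value printed by `micropython.mem_info()` as a stand-in for the size the GC was initialised with, which is an assumption (the real `mp_task_heap_size` would be slightly larger); the function name is hypothetical.

```python
def gc_overrun(allocated_bytes, gc_init_bytes):
    """Bytes the GC believes it owns beyond the block that was actually allocated."""
    return max(0, gc_init_bytes - allocated_bytes)

ALLOCATED = 2000 * 1024  # hard-coded heap_caps_malloc() size from the diff
GC_TOTAL = 4274752       # "GC: total" reported by micropython.mem_info()

overrun = gc_overrun(ALLOCATED, GC_TOTAL)
print("GC region extends past the allocation by", overrun, "bytes")

# Changing mp_task_heap_size itself keeps both sides in agreement.
print("consistent case:", gc_overrun(ALLOCATED, ALLOCATED))
```

Any non-zero overrun means the GC would hand out memory it does not own, which fits the "lots of trouble" warning.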

@shariltumin

Yes, @pillo79 you are right. @dpgeorge pointed out my mistake.
I tested two MP/IDF heap ratios for the 8MB PSRAM esp32s3 board.

  1. 2/6:
$ diff main.c main.c-ORIG 
108,109c108
< // size_t mp_task_heap_size = MIN(heap_caps_get_largest_free_block(caps), heap_total / 2);
< size_t mp_task_heap_size = 2000*1024;
---
> size_t mp_task_heap_size = MIN(heap_caps_get_largest_free_block(caps), heap_total / 2);
  2. 6/2:
$ diff main.c main.c-ORIG 
108,109c108
< // size_t mp_task_heap_size = MIN(heap_caps_get_largest_free_block(caps), heap_total / 2);
< size_t mp_task_heap_size = 6000*1024;
---
> size_t mp_task_heap_size = MIN(heap_caps_get_largest_free_block(caps), heap_total / 2);

Despite the esp-idf warning (up to 4MB), I managed to allocate a memory block larger than 4MB. I do not know the impact on system stability; perhaps someone more knowledgeable can explain.
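
For reference, a quick calculation of what each of the two settings leaves for the IDF heap. It assumes the 8388608-byte SPIRAM region reported by `esp32.idf_heap_info()` earlier in the thread and ignores allocator overhead; `split_spiram` is a made-up helper name.

```python
SPIRAM_TOTAL = 8388608  # SPIRAM region size seen in esp32.idf_heap_info() output

def split_spiram(mp_heap_bytes, total=SPIRAM_TOTAL):
    """Return (MicroPython heap, remainder left for the IDF heap) in bytes."""
    if mp_heap_bytes > total:
        raise ValueError("MicroPython heap larger than SPIRAM")
    return mp_heap_bytes, total - mp_heap_bytes

for label, size in (("2/6", 2000 * 1024), ("6/2", 6000 * 1024)):
    mp_heap, idf_left = split_spiram(size)
    print(label, "MP heap:", mp_heap, "left for IDF:", idf_left)
```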

See issue #12075 for a full discussion.

@Tangerino

This makes my application break :(

@dpgeorge
Member

@Tangerino can you please describe how your application uses RAM and why it breaks? Just so we can better understand how to fix the problem.

@Tangerino

Tangerino commented Jul 28, 2023

Sure.
Device drivers and network topologies are loaded in RAM; we monitor large solar farms with plenty of devices and electrical variables.
The firmware loads user code to perform custom control.
The firmware loads custom alarm definitions.
The device supports WiFi, BLE, ETH and 4G all together.
The device is fully remotely managed.
We support a remote shell and many other features.
And finally we have a RAM disk, so that we can try to upload telemetry to the cloud and commit to flash only if publishing fails.

All this uses a lot of RAM.

Using V1.19 (around 1200 devices in production as of today) I can monitor the RAM usage as shown here:
[Screenshot: CleanShot 2023-07-28 at 00 15 43@2x]

Now using V1.20 with very little functionality:
[Screenshot: CleanShot 2023-07-28 at 00 16 47@2x]

So, to be 'fair' the application works. But we made a choice to use a device with a bigger RAM in order to do more stuff. Arbitrarily taking half of it has an impact on the overall device capabilities.

Can we optimize the app? Yes, we can, but I was thinking of doubling the RAM. And the more RAM I add, the more RAM I lose.

I'm very surprised by the V1.20 performance! Above I read about 'grow-on-demand'. I think this is the way.

Thank you.

@glancia

glancia commented Jul 30, 2023

I work with @Tangerino and I can tell that performance on 1.20 has been multiplied by a factor of 5 to 10 times. Great job guys!

@jimmo
Member

jimmo commented Aug 1, 2023

> I work with @Tangerino and I can tell that performance on 1.20 has been multiplied by a factor of 5 to 10 times. Great job guys!

@glancia @Tangerino that's very interesting, but quite surprising. Could you clarify whether you mean the v1.20 release or the current nightly build? I'm guessing the latter, as the v1.20 release didn't have the spiram allocation size change.

How are you measuring this performance improvement?

@ma261065
Author

ma261065 commented Aug 1, 2023

> the v1.20 release didn't have the spiram allocation size change

The current dailies still use the same SPIRAM allocation as 1.20, don't they?

@jimmo
Member

jimmo commented Aug 1, 2023

> > the v1.20 release didn't have the spiram allocation size change
>
> The current dailies still use the same SPIRAM allocation as 1.20, don't they?

@ma261065 No. That's what this issue is all about -- in the move to IDF 5.x (which was done a few weeks ago), the allocation strategy was changed. v1.20 (and nightlies until a few weeks ago) use the old way, and recent nightlies use the new way.

@ma261065
Author

ma261065 commented Aug 1, 2023

Ahh, sorry - I had remembered that the change happened with 1.20. My bad.

@glancia

glancia commented Aug 1, 2023

@jimmo , it was 1.20 release, still cutting RAM by 2. The execution is not breaking because we're not stressing too much, but RAM free came as low as 30kb.

We're not doing any scientific measurements just some operations performance comparisons, running the same application version and same application size:

  • system boot -> 6s on 1.20 vs 18s on 1.19 vs 30-40s on 1.18
  • a simple operation -> 0.5s on 1.20 vs 2.5s on 1.19

We also feel the application running much more fluidly.
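
For more repeatable numbers than a stopwatch comparison, a timing helper along these lines could be used. This is only a sketch: `time_ms` is a made-up name, it uses `time.ticks_us()`/`time.ticks_diff()` where available (MicroPython) and falls back to `time.perf_counter()` elsewhere.

```python
import time

def time_ms(fn, *args, repeat=5):
    """Return the fastest of `repeat` runs of fn(*args), in milliseconds."""
    best = None
    for _ in range(repeat):
        if hasattr(time, "ticks_us"):
            # MicroPython: wrap-safe microsecond ticks
            start = time.ticks_us()
            fn(*args)
            elapsed = time.ticks_diff(time.ticks_us(), start) / 1000
        else:
            # CPython fallback
            start = time.perf_counter()
            fn(*args)
            elapsed = (time.perf_counter() - start) * 1000
        if best is None or elapsed < best:
            best = elapsed
    return best

# Example workload: summing a range. Replace with the operation being compared.
print("sum(range(10000)): %.3f ms" % time_ms(sum, range(10000)))
```

Passing `gc.collect` as `fn` gives a rough garbage-collection time, which is the figure most likely to change with the size of the MicroPython heap.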

@dpgeorge
Member

dpgeorge commented Aug 2, 2023

> • system boot -> 6s on 1.20 vs 18s on 1.19 vs 30-40s on 1.18

This could be due to recent improvements with u-module naming, weak links, and reducing the number of times the filesystem is accessed when importing a module.

@jimmo
Member

jimmo commented Aug 2, 2023

> @jimmo , it was 1.20 release, still cutting RAM by 2. The execution is not breaking because we're not stressing too much, but RAM free came as low as 30kb.

@glancia Can you confirm for sure that it's the v1.20 release? The problem is that the half-spiram change (and the improvements to import which would improve startup time) were not merged until after the v1.20 release. So have you backported the RAM changes to v1.20?

Also @Tangerino asked in a different issue about the removal of zlib, but that was also post-v1.20.

(The slightly confusing thing is that when you build from source, or use a nightly build, the version number printed in the REPL is the previous release -- i.e. we don't increment the version number until the next release -- see #12127).
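
One way to tell a release from a nightly is the suffix after the release tag, as in the firmware names at the top of this issue (`v1.20.0-243-gd93316fe3`). A minimal sketch of splitting such a string, assuming the usual `git describe` layout of tag, commit count, and `g`-prefixed hash; `parse_build_tag` is a hypothetical helper and does not handle other naming schemes.

```python
def parse_build_tag(tag):
    """Split 'v1.20.0-243-gd93316fe3' into (release, commits_since, short_hash)."""
    parts = tag.split("-")
    if len(parts) < 3:
        # A plain release tag such as 'v1.20.0' has no suffix.
        return parts[0], 0, None
    release = parts[0]
    commits_since = int(parts[1])
    short_hash = parts[2]
    if short_hash.startswith("g"):
        short_hash = short_hash[1:]  # git describe prefixes the hash with 'g'
    return release, commits_since, short_hash

for tag in ("v1.20.0", "v1.20.0-230-g41c91422f", "v1.20.0-243-gd93316fe3"):
    release, commits_since, short_hash = parse_build_tag(tag)
    kind = "release" if commits_since == 0 else "nightly/dev build"
    print(tag, "->", release, commits_since, short_hash, kind)
```

A non-zero commit count means the build is newer than the release whose number it prints.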

> We're not doing any scientific measurements just some operations performance comparisons, running the same application version and same application size:
>
> * system boot -> 6s on 1.20 vs 18s on 1.19 vs 30-40s on 1.18

As Damien pointed out in the previous comment, this could definitely be explained with the recent import changes.

> * a simple operation -> 0.5s on 1.20 vs 2.5s on 1.19

This is very interesting. Great news! But yeah, the last major series of performance improvements were merged just before the v1.18 release.

@ma261065
Author

ma261065 commented Aug 2, 2023

> The slightly confusing thing is that when you build from source, or use a nightly build, the version number printed in the REPL is the previous release -- i.e. we don't increment the version number until the next release -- see #12127

You are quite correct. v1.20 was released on 27th April 2023, but the change to use IDF 5.x came with commit 6a9db52 on 23rd June 2023

@glancia

glancia commented Aug 2, 2023

@Tangerino please confirm which version you used

@Tangerino

I'm using the nightly source code.
The boot time is much faster and the overall performance is much better.
I did not make any measurements, but GC time went from 0.250 msec to 0.12 msec.

10 participants