py/gc: Support multiple heaps (version 2). #3580
Enable the addition of heap space at runtime. Advantages:
- The ESP32 has a fragmented heap, so to use all of it the heap must be split.
- Support a dynamic heap while running on an OS, adding more heap when necessary.
I see that even this introduces some extra code when multiheap is disabled (in the minimal port with CROSS=1). I'll look further into how this can be avoided. |
Really nice work, thank you! I did have this feature on my to-do list but it was low priority so I didn't make any progress on it. From a brief look it looks ok. It would be nice if it didn't make such big changes to gc.c but that's probably unavoidable given the feature that it's adding.
It could be because some |
Would you mind adding a snippet explaining how you got the esp32 to 200kb of heap?
edit: added snippet
From 99e9f5ece72858f0ebfb655bb198ec3ac1f9be96 Mon Sep 17 00:00:00 2001
From: Andreas Valder <nd@serioese.gmbh>
Date: Sun, 4 Feb 2018 15:53:56 +0100
Subject: [PATCH] feat(memory): use multiple heaps for micropython to handle
the esp32 fragmented memory
---
ports/esp32/main.c | 10 ++++++++++
ports/esp32/mpconfigport.h | 1 +
2 files changed, 11 insertions(+)
diff --git a/ports/esp32/main.c b/ports/esp32/main.c
index 93423e1c..971b6a99 100644
--- a/ports/esp32/main.c
+++ b/ports/esp32/main.c
@@ -60,6 +60,9 @@
# define MP_TASK_HEAP_SIZE (96 * 1024)
#endif
+#define HEAP_CHUNK_SIZE (8*1024)
+#define HEAP_CHUNK_COUNT (8)
+
STATIC StaticTask_t mp_task_tcb;
STATIC StackType_t mp_task_stack[MP_TASK_STACK_LEN] __attribute__((aligned (8)));
STATIC uint8_t mp_task_heap[MP_TASK_HEAP_SIZE];
@@ -81,6 +84,13 @@ soft_reset:
mp_stack_set_top((void *)sp);
mp_stack_set_limit(MP_TASK_STACK_SIZE - 1024);
gc_init(mp_task_heap, mp_task_heap + sizeof(mp_task_heap));
+ void* p;
+ for (int i =0; i< HEAP_CHUNK_COUNT; i++) {
+ p = malloc(HEAP_CHUNK_SIZE);
+ if (p == NULL)
+ break;
+ gc_add(p, p+HEAP_CHUNK_SIZE);
+ }
mp_init();
mp_obj_list_init(mp_sys_path, 0);
mp_obj_list_append(mp_sys_path, MP_OBJ_NEW_QSTR(MP_QSTR_));
diff --git a/ports/esp32/mpconfigport.h b/ports/esp32/mpconfigport.h
index 4b4e08df..6e4d8842 100644
--- a/ports/esp32/mpconfigport.h
+++ b/ports/esp32/mpconfigport.h
@@ -54,6 +54,7 @@
#define MICROPY_SCHEDULER_DEPTH (8)
#define MICROPY_VFS (1)
#define MICROPY_VFS_FAT (1)
+#define MICROPY_GC_MULTIHEAP (1)
// control over Python builtins
#define MICROPY_PY_FUNCTION_ATTRS (1)
--
2.16.1 |
A better way would be this patch, which only adds one extra heap area instead of many like in your example. Every new heap area added slows down every heap operation in O(n) at the moment, so you'll want as few as possible. |
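To illustrate why each extra heap area adds cost, here is a minimal sketch of a linear area lookup. The `heap_area_t` struct and `find_area` function are hypothetical names made up for this example and are not taken from the PR; the sketch only assumes that the collector has to work out which area a pointer belongs to by walking a list of areas.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one heap area (not the PR's actual struct). */
typedef struct {
    uint8_t *start; /* first byte of the area */
    uint8_t *end;   /* one past the last byte of the area */
} heap_area_t;

/* Return the index of the area that contains ptr, or -1 if no area does.
 * The loop may visit every area, so the cost grows linearly with n_areas. */
int find_area(const heap_area_t *areas, size_t n_areas, const void *ptr) {
    uintptr_t p = (uintptr_t)ptr;
    for (size_t i = 0; i < n_areas; i++) {
        if (p >= (uintptr_t)areas[i].start && p < (uintptr_t)areas[i].end) {
            return (int)i;
        }
    }
    return -1;
}
```

With eight 8 kB chunks plus the main heap, a lookup like this may check nine ranges; with one extra area it checks at most two.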
Can an allocation span multiple heap blocks? |
Heap blocks are fixed in size (typically 16 bytes) so any allocation which exceeds 16 bytes will span multiple blocks. When you look at a heap dump (i.e.
Each character corresponds to a heap block. The ones with a letter followed by ='s are single objects that span multiple blocks. For example (immediately after the above):
You can see the 4000 block allocation crossing many heap blocks. Any single allocation is contiguous, so it needs contiguous heap blocks. |
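The block arithmetic described above can be sketched as follows. This is an illustrative model, not MicroPython's gc.c: `BLOCK_SIZE`, `blocks_needed` and `find_free_run` are hypothetical names, and the 16-byte block size is the "typical" value mentioned in the comment.

```c
#include <stddef.h>

#define BLOCK_SIZE 16 /* bytes per heap block; assumed typical value */

/* Number of heap blocks needed to hold n_bytes, rounded up. */
size_t blocks_needed(size_t n_bytes) {
    return (n_bytes + BLOCK_SIZE - 1) / BLOCK_SIZE;
}

/* First-fit search over a block map: used[i] != 0 means block i is taken.
 * Returns the index of the first run of `need` consecutive free blocks,
 * or -1 if no such run exists. */
long find_free_run(const unsigned char *used, size_t n_blocks, size_t need) {
    size_t run = 0;
    if (need == 0) {
        return -1;
    }
    for (size_t i = 0; i < n_blocks; i++) {
        run = used[i] ? 0 : run + 1; /* a used block resets the run */
        if (run == need) {
            return (long)(i + 1 - need);
        }
    }
    return -1;
}
```

Under these assumptions a 17-byte request needs two blocks, and it can only be placed where two neighbouring blocks are both free.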
Not what I meant (but thanks for that detailed explanation). This new multi-heap increases the heap size. If the current heap has X bytes left but I want to allocate Y > X bytes, it would be nice if I could take X bytes from current heap and only allocate (Y-X) bytes from adjacent heap. This would be really useful if Y is really big and X is almost as big as Y. Basically I’m asking if you could treat the individual heaps as one contiguous region for the purpose of allocation. |
Since individual allocations need to be contiguous, it wouldn't be possible to have a single object span multiple heap areas. |
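A small sketch of what that means in practice, assuming each area is summarised by its largest contiguous free run in bytes. The `pick_area` function and the `largest_free` array are hypothetical and only model the constraint; they are not part of the PR.

```c
#include <stddef.h>

/* largest_free[i] is the largest contiguous free run, in bytes, in area i
 * (a hypothetical summary used only for this example).
 * Returns the first area that can hold `request` bytes on its own, or -1.
 * Free space is never summed across areas, because a single object
 * cannot span two areas. */
int pick_area(const size_t *largest_free, size_t n_areas, size_t request) {
    for (size_t i = 0; i < n_areas; i++) {
        if (largest_free[i] >= request) {
            return (int)i;
        }
    }
    return -1;
}
```

With areas holding 3000 and 2000 free bytes, a 4000-byte request fails in this model even though 5000 bytes are free in total.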
In particular allocations for objects supporting the Python buffer protocol. |
I was looking into implementing exactly this, and was super happy to find it has already been done! The discussion seems to have slumbered a bit, but this feature is tremendously useful on microcontrollers with non-contiguous memory regions. Can this change be mainlined? It would be an amazing addition. |
Any plans to merge this or something alike? Would make the available memory on something like the esp32 much larger. |
See #5543 (comment) for some more context, but at the moment the areas of memory not used for the MicroPython heap are still used by the IDF for things like SSL buffers. So in order to make this change work, we'd also have to make the IDF and libraries (e.g. mbedtls) use the MicroPython heap instead of malloc directly. Which is starting to get a bit messy. |
On our uPy fork, I can add at least 14kB from the DRAM block at 0x3FFE0440 without any issues. (The TLS outbound buffer was increased from 4 to 8k as suggested in #5543, to fix TLS with large keys.) We get this block specifically by first taking the 111kB free block at 0x3FFE4350, then requesting the largest free block (which allocates 0x3FFE0440), and finally freeing the 111kB again for WiFi. This results in a ~17.5% larger uPy heap, which is helpful for our purpose.
|
FWIW, if you're not planning on using WiFi or Bluetooth, the same greedy allocation strategy as above can be used to allocate all free DRAM blocks, which nets you a ~200kB uPy heap. |
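The greedy strategy described above can be modelled with a toy region table. This is a simulation under stated assumptions, not the code from the fork: `region_t`, `largest_free` and `claim_second_largest` are hypothetical names, and on real hardware the steps would go through the IDF allocator rather than an array.

```c
#include <stddef.h>

/* Toy model of one free DRAM region (hypothetical, for illustration). */
typedef struct {
    size_t size; /* region size in bytes */
    int in_use;  /* non-zero once the region has been handed out */
} region_t;

/* Index of the largest region that is not in use, or -1 if none is free. */
int largest_free(const region_t *r, size_t n) {
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (!r[i].in_use && (best < 0 || r[i].size > r[best].size)) {
            best = (int)i;
        }
    }
    return best;
}

/* Temporarily take the largest region, claim the next largest one for the
 * heap, then release the first so it stays available (e.g. for WiFi).
 * Returns the index of the claimed region, or -1 if nothing could be claimed. */
int claim_second_largest(region_t *r, size_t n) {
    int placeholder = largest_free(r, n);
    if (placeholder < 0) {
        return -1;
    }
    r[placeholder].in_use = 1;
    int claimed = largest_free(r, n);
    if (claimed >= 0) {
        r[claimed].in_use = 1;
    }
    r[placeholder].in_use = 0; /* give the largest region back */
    return claimed;
}
```

Skipping the final release and repeating the claim until nothing is left would correspond to the "allocate all free DRAM blocks" variant for builds without WiFi or Bluetooth.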
@tjclement, |
@formigarafa of course, you can see here where I added this in our codebase: badgeteam/ESP32-platform-firmware@3ed58b4#diff-325ac9f54774c3d5d2604108c9759715 |
@tjclement have you seen e600810? It may be more efficient to use that instead as it only works with a single block (and thus the GC can work faster). |
I've rebased and optimised this PR in #8526. |
Closing because development continues in #8526. |
Rewritten PR of #3533. The biggest difference is that multiple heaps support can now be disabled (and is disabled by default) to reduce code size. I hope it is also more stable, as I made the changes after looking at how the memory manager actually works.
With this code, I managed to extend the MicroPython heap to ~200kB on the ESP32:
This is necessary because by default, the esp32 does not have a contiguous memory area:
I have tested using tests/run-tests and haven't seen a regression (both on unix and esp32).
Image size changes:
I have tried to keep the image sizes unchanged, but some changed anyway for some reason. Maybe the optimizer is less effective on some ports than on others. Most of these ports (with the exception of bare-arm) jumped around a lot during development, so I suspect it's mostly just an inconsistent optimizer.