Skip to content

stm32: add sdram support #3940

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Conversation

andrewleech
Copy link
Contributor

This PR adds stm32 support for using external SDRAM chip for the micropython GC heap.

On the (included) STM32F429DISC target this means you now have 8MB of ram for python to use!

I expect this PR to need a little further work before merging however. While I believe it's fully operational and stable, I'm pretty sure some of my changes to alternate pin function handling could be done better.

It includes also a rather long list of pins needing to be defined in multiple locations. This is the first time I've worked with the alternate function handling, not sure if I've understood it correctly/efficiently?

In particular I had problems with the FMC perpheral configuration as the existing functionality wants to set it to unit 0 but the register definition is just FMC, hence my change in stm32f4xx_prefix.c. I don't know that this change is safe for all stm32 targets.

Even then I still needed a define in mpconfigboard.h to set the active FMC register #define FMC FMC_SDRAM_DEVICE
This limits the FMC peripheral in alternate function handling to one of the possible memory types, and sets a fairly non-specific define FMC that could conflict elsewhere.

I based the initial driver on an old (unused?) driver from openmv, as such I've referenced where it came from in the initial commit and left their header intact.
I havne't contacted them about this yet, I hope this is ok? I've modified the file fairly significantly from the initial version.

@dpgeorge
Copy link
Member

Thanks for the contribution, it's a pretty nice feature to have!

BTW: did you do any speed tests for the SDRAM? Would be interesting to know how much of a speed penalty there is.

I'm pretty sure some of my changes to alternate pin function handling could be done better.

Yes, it need some further review to understand these changes. One simple way to do it is just use a fixed AF number for the FMC configuration, eg mp_hal_pin_config(pin, MP_HAL_PIN_MODE_ALT, MP_HAL_PIN_PULL_NONE, 12). On the F429 it looks like that's the only option. On F7xx MCUs there are two options for the AF number, and to support that case there could be a compile-time config macro per pin, like MICROPY_HW_FMC_A0_ALT. Yes, that'll double the number of config values for the pins in mpconfigboard.h. But that's just compile-time stuff and doesn't increase the firmware size. OTOH, adding all the AF definitions for the FMC pins like it's currently done in this PR adds a lot of bytes to the firmware. These pins would never (?) be configured dynamically at runtime so there's no real need to have the AF list available to the Python code.

This is the first time I've worked with the alternate function handling, not sure if I've understood it correctly/efficiently?

To get dynamic pin assignment and selection of different AF numbers at runtime, what you did is correct and as efficient as it could be. But as said above, I don't think this needs to be configured at runtime and could instead just be a fixed AF number, or set by mpconfigboard.h for all pins.

In particular I had problems with the FMC perpheral configuration as the existing functionality wants to set it to unit 0 but the register definition is just FMC, hence my change in stm32f4xx_prefix.c. I don't know that this change is safe for all stm32 targets.

It's an interesting change and may help with other peripherals that only have one unit, eg ETH. But it would be better not to need such a change if possible.

I based the initial driver on an old (unused?) driver from openmv, as such I've referenced where it came from in the initial commit and left their header intact.
I havne't contacted them about this yet, I hope this is ok? I've modified the file fairly significantly from the initial version.

Since it's MIT licensed it is OK.

@iabdalkader do you currently use the SDRAM driver you wrote?

@andrewleech
Copy link
Contributor Author

I haven't measured any of the timings yet, though creating a multi-megabyte bytearray through tuple comprehension takes a number of seconds.... this is more likely the comprehension speed than the ram speed though.
I should try timing a copy of a large buffer to get a better idea of the round-trip of just ram.

On a related note, if/once #3580 is done we could still have the internal ram operational and perhaps longer term look at splitting it up based on block sizes? hint the gc to try to assign small objects in the faster internal, larger objects in external.

My primary development platform is a proprietary F765 board so different part configuration support is important to me, but yes I'd also prefer to not add flash bloat for dynamic pin handling if it can be cleanly implemented statically. The disco boards are ideal to use as dev platforms for contributing here though!

I do really like what you've done with the alternate fn handling so the build system can automatically lookup the correct af number for the pin. I'll have a think about a clean way to extend the make-pins script to be able to statically #define these at compile where needed rather than having to manually lookup and set somewhere.

@iabdalkader
Copy link
Contributor

iabdalkader commented Jul 13, 2018

@iabdalkader do you currently use the SDRAM driver you wrote?

@dpgeorge No it's an outdated driver for an OMV board with SDRAM, feel free to use it. Note the timings are for a specific SDRAM part.

@dpgeorge
Copy link
Member

On a related note, if/once #3580 is done we could still have the internal ram operational and perhaps longer term look at splitting it up based on block sizes? hint the gc to try to assign small objects in the faster internal, larger objects in external.

Yes, it would be useful to eventually support multiple heaps, in internal and external RAM.

I'll have a think about a clean way to extend the make-pins script to be able to statically #define these at compile where needed rather than having to manually lookup and set somewhere.

That's a great idea. Something like it's currently done dynamically, but statically instead. Although, I'm not quite sure how it would be done. Eg if you have #define MICROPY_HW_SDMMC_D0 (pin_C8) then you'd want a macro (or at least compile-time constant) that evaluates to the alt func number for this pin, eg PIN_ALT_FUNC_SDMMC1(MICROPY_HW_SDMMC_D0) should evaluate to 12.

If it's not possible to do at compile time then it could be done at runtime by doing a look-up in a table (or search), but it shouldn't need to generate all the interned strings like it does at the moment.

@andrewleech
Copy link
Contributor Author

So, I've had some fun with auto-generated defines and macros, removing the hacks previously needed to make it work.

There's a couple of new macros in micropython/ports/stm32/mphalport.h which may be more global than desired, let me know, but I think the alternate pin config I've got in place now should be quite efficient and still just as easy to use.

Other than that there's a few other cleanup and a critical fix for F7's to avoid hard faults when non-aligned memory accesses are made to the sdram.
I don't know if this will be required for other cores, the reference I listed in the commit only mentioned it needed for F7. Without the MPU settings, the ram test would pass and often the first few python operations work fine before running something random causes a hard fault.

On the topic of speed:

import utime
import gc    
gc.collect()
really_big_buffer = bytearray(1*1024*1024)
start = utime.ticks_ms()
_ = really_big_buffer[:]
end = utime.ticks_ms()
print("Copied in %d ms" % (end-start))

Reports about 80ms to copy a 1MB buffer on the STM32F429DISC board.
When I switched to 2MB buffer the first copy too 120ms, then re-running takes ~470ms.
I presume this is the extra time to gc the heap and find a large enough block of space.

@dpgeorge
Copy link
Member

Other than that there's a few other cleanup and a critical fix for F7's to avoid hard faults when non-aligned memory accesses are made to the sdram.
I don't know if this will be required for other cores, the reference I listed in the commit only mentioned it needed for F7. Without the MPU settings, the ram test would pass and often the first few python operations work fine before running something random causes a hard fault.

Hmm, that's a bit worrying that it needs the MPU. It would be good to avoid the use of the MPU because it's only possible to set the MPU up once and it may be needed for more "important" things. Eg the Ethernet in #3808 currently uses MPU but it should be possible to eliminate that need.

If it's just a non-aligned access then that should be fixed. Are you able to find the location of that fault?

Otherwise the crashes could be due to cache coherence, which would be more difficult to fix.

@andrewleech
Copy link
Contributor Author

andrewleech commented Jul 17, 2018

This is the reference I used for the alignment / mpu issue: http://www.keil.com/support/docs/3777.htm

If the MPU is an issue I could try the memory range remapping as suggested, but that would limit other concurrent FMC uses in future.

#define MICROPY_HW_SDRAM_WRITE_PROTECTION (0)

// These should be declared as just the pin name as written in pins.csv
#define MICROPY_HW_FMC_SDCKE1 PB5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a shame these need to be defined as raw names like this, rather than (pin_B5). So it'll end up being different to other pin definitions in this config file. I don't think there's any way around this because (pin_B5) will expand to (&pin_B5_obj).

But, I think these names here should follow the CPU name (second column of pins.csv) and be simply B5, without the P.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I tried to figure out a way to leave it as (pin_PB5) but I couldn't see any way to have it all map through in the preprocessor rather than needing a runtime function and lookup table.

I did go back and forth about using the pin defined name or the cpu name, left it as defined name in case people wanted to name the fmc/smc pins in the csv (like the usb pins are named).
That's then more in keeping with the other pins in the file with their defined names, even though the format of them is different.

I can switch it back to pin cpu name if you think that'll be less confusing in future?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I tried to figure out a way to leave it as (pin_PB5) but I couldn't see any way to have it all map through in the preprocessor rather than needing a runtime function and lookup table.

It might be possible by the following kind of pattern:

#define STATIC_AF(pin_obj, fn, type) \
    pin_obj == &pin_A0_obj && fn == TIM2 && type == CH1 ? 1 : \
    pin_obj == &pin_A0_obj && fn == USART2 && type == CTS ? 7 : \
    ... all possible pins and AFs! ...
    {} /* something that will give a compile error if no alt func found */

Yes that's a bit crazy but it should work, and would be generated, not written by hand. It could be simplified by combining fn and type into one arg to the macro, like STATIC_AF(pin_obj, fn_type), so call it as STATIC_AF(pin_A0, TIM2_CH1).

It could also be split into a macro per function, eg:

#define STATIC_AF_TIM2_CH1(pin_obj) \
    pin_obj == &pin_A0_obj ? 1 : \
    pin_obj == &pin_A5_obj ? 1 : \
    pin_obj == &pin_A15_obj ? 1 : \
    {}

But that might not be as flexible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not bad..... I've got no issue with a long generated header file for the compiler to parse through, I can't imagine it would slow compilation down too much.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be wise to put this macro beast in a separate header, to be included only when needed. Because at the moment mphalport.h is included in a lot of source files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't as simple as I first thought.

fn == TIM2 && type == CH1 doesn't just work as TIM2 and CH1 are undefined. I've worked around this by joining them into one larger token and then making an enum of all these tokens so they've got something to match to.

I haven't been able to make this work in a macro: pin_obj == &pin_A0_obj
it complains about trying to use & as an operator; the preprocessor can't do pointers.

constexpr would be perfect but we don't have access to that being C. I'll keep trying

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't been able to make this work in a macro: pin_obj == &pin_A0_obj
it complains about trying to use & as an operator; the preprocessor can't do pointers.

That's true. It would need to be something the compiler can optimise to a single constant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out is wasn't as hard as I thought there for a bit, see latest commit!

@@ -17,6 +18,7 @@
'SPI' : ['NSS', 'SCK', 'MISO', 'MOSI'],
'SDMMC' : ['CK', 'CMD', 'D0', 'D1', 'D2', 'D3'],
'CAN' : ['TX', 'RX'],
'FMC' : [],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I did... in my new generated header file I have it excluding peripherals not listed here, but that was from an earlier implementation where I didn't have the pin type in the define as well. Now that I do have that listed in the #define as well it all works no worries without the SUPPORTED_FN filter. I'll fix it.

for af in named_pin.pin().alt_fn:
func = "%s%d" % (af.func, af.fn_num) if af.fn_num else af.func
if af.func not in SUPPORTED_FN:
print("//", end='', file=af_defs_file)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no reason to have these commented out. Having the additional defines for all pins (even if not used) doesn't really hurt.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah as per the last comment, I did need this previously, don't now.

@@ -75,3 +76,9 @@ void mp_hal_gpio_clock_enable(GPIO_TypeDef *gpio);
void mp_hal_pin_config(mp_hal_pin_obj_t pin, uint32_t mode, uint32_t pull, uint32_t alt);
bool mp_hal_pin_config_alt(mp_hal_pin_obj_t pin, uint32_t mode, uint32_t pull, uint8_t fn, uint8_t unit);
void mp_hal_pin_config_speed(mp_hal_pin_obj_t pin_obj, uint32_t speed);

#define __JOIN( a, b ) a ## _ ## b
#define __STATIC_AF( name, fn, type ) PIN_ ## name ## _AF_ ## fn ## _ ## type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this needs to stay, so would be better to name something that won't clash like MP_HAL_PIN_STATIC_AF.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me.

#define __STATIC_AF( name, fn, type ) PIN_ ## name ## _AF_ ## fn ## _ ## type

#define mp_hal_pin_config_alt_static(pin_name, mode, pull, fn, type) \
mp_hal_pin_config(__JOIN(pyb_pin, pin_name), mode, pull, __STATIC_AF(pin_name, fn, type))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be MP_HAL_PIN_STATIC_NAME(pin_name) then use:

#define MP_HAL_PIN_STATIC_NAME(name) pin_ ## name

to match with raw CPU names, not pyb-level names.

@andrewleech andrewleech force-pushed the stm_sdram branch 2 times, most recently from f44e385 to c0c6541 Compare July 17, 2018 08:19
@dpgeorge
Copy link
Member

Nice way to implement the static alt func lookup!

To prepare this for eventual merge, it would be best if the changes to support static alt func settings were cleanly separated from the addition of the SDRAM driver. And ideally the static alt func commits would come before the SDRAM commits, to keep the history clean and logical (to make it easier to review the changes and find possible bugs in the future with git bisect). @andrewleech would you be able to make such changes to the PR? It might be easier if you open a separate PR for the alt func settings first.

@andrewleech
Copy link
Contributor Author

Cheers, yep I'll be able to split them out no worries in the morning.

@andrewleech
Copy link
Contributor Author

This has been rebased/restructured on top of #3952.

Hopefully this set of commits make more sense as a feature and will make it clearer for to configure new targets in future.

@andrewleech andrewleech force-pushed the stm_sdram branch 2 times, most recently from c09c833 to 2401151 Compare July 18, 2018 04:05
@andrewleech
Copy link
Contributor Author

Rebased onto master now that #3952 is closed

@dpgeorge
Copy link
Member

Travis is failing because it won't build with make BOARD=STM32L476DISC, since these boards don't have SDRAM files in the HAL.

@andrewleech
Copy link
Contributor Author

Sorted, that file is now only included in make for f4, f7, h7

@dpgeorge
Copy link
Member

I was just going through this to merge it and noticed it would be cleaner in sdram.c if the check #ifdef FMC_SDRAM_BANK was used just once and wrapped all functions, instead of doing the check individually in each function. That would then lead to a link error (missing sdram_init) if you try to enable SDRAM but don't define one of the SDCKE and SDNE pins. It would also eliminate a few the false/NULL returns in the sdram.c function which would likely never be used (if you call any code in sdram.c you'll most likely have SDRAM enabled).

If you agree, I can make such a change during the merge, or feel free to make the change here in this PR.

@andrewleech
Copy link
Contributor Author

That does sound much cleaner, I must admit I added those ifdef's rather ad-hoc at the last minute to deal with the compile failures, should have thought about it a bit clearer.

If you don't mind fixing in merge that'd be great, my working tree is a bit of a mess right now (I'm trying build a new camera driver and merge some of your mprepl functionality into the jupyter notebook mp kernel I use: https://github.com/goatchurchprime/jupyter_micropython_kernel)

@dpgeorge
Copy link
Member

Ok, now merged in 7ae053a through 434975d.

Thanks for this!

@dpgeorge dpgeorge closed this Jul 23, 2018
@andrewleech
Copy link
Contributor Author

Thanks for the reviews and testing to help make it better!

tannewt added a commit to tannewt/circuitpython that referenced this pull request Jan 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants