
rp2/modmachine: Fix lightsleep returning early. #16412


Closed
23 changes: 20 additions & 3 deletions ports/rp2/modmachine.c
@@ -197,6 +197,7 @@ static void mp_machine_lightsleep(size_t n_args, const mp_obj_t *args) {
xosc_dormant();
} else {
bool timer3_enabled = irq_is_enabled(3);
uint64_t endtime_us_64 = timer_time_us_64(timer_hw) + delay_ms * 1000;

const uint32_t alarm_num = 3;
const uint32_t irq_num = TIMER_ALARM_IRQ_NUM(timer_hw, alarm_num);
@@ -216,7 +217,7 @@ static void mp_machine_lightsleep(size_t n_args, const mp_obj_t *args) {
#error Unknown processor
#endif
timer_hw->intr = 1u << alarm_num; // clear any IRQ
-timer_hw->alarm[alarm_num] = timer_hw->timerawl + delay_ms * 1000;
+timer_hw->alarm[alarm_num] = ((uint32_t)((endtime_us_64) & 0xffffffff));
} else {
// TODO: Use RTC alarm to wake.
clocks_hw->sleep_en0 = 0x0;
@@ -245,8 +246,24 @@ static void mp_machine_lightsleep(size_t n_args, const mp_obj_t *args) {
#endif
#endif

// Go into low-power mode.
__wfi();

// A timer other than #3 could fire and wake the CPU.
// Repeatedly go into low-power mode until timer 3 is no longer armed or
// there are less than 50 microseconds left to go.
//
// This means lightsleep can return up to 50 microseconds early.
// That is still close, even for a 1 millisecond sleep.
// But this masks a race condition where an interrupt occurs after the armed bit
// is checked but before __wfi() is called.
// Note: 50 microseconds is a magic number.
// Given that all interrupts are masked except the interrupt for timer_hw,
// this should be more than enough to mask the race condition.

uint64_t currenttime_us_64 = timer_time_us_64(timer_hw);
while (timer_hw->armed & (1u << alarm_num) && (endtime_us_64 - currenttime_us_64) > 50) {
__wfi();
Member

I'm pretty sure there's a race condition in here: between the check of the timer and the wfi the interrupt may go off, in which case the wfi will sit there waiting "forever" because the interrupt already fired.

Contributor Author

Agreed.
Needs to be fixed, but it won't wait forever. Timer 2 is going off about every 100 ms which will cause the __wfi() to return.

@dpgeorge, it is not obvious to me how to resolve the race condition.
I am still a relative neophyte on ARM and MicroPython internals.
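
The race described above, and why it stalls for a bounded time rather than forever, can be sketched with a toy host-side model. This is only an illustration under stated assumptions: `sim_t`, `sim_advance`, `sim_wfi` and `sim_sleep_overshoot` are hypothetical names, time is a plain counter, and the "other timer" is modelled as a fixed periodic tick. It is a plain check-then-wait loop, not the code in this pull request, and it does not touch the RP2 hardware.

```c
#include <stdbool.h>
#include <stdint.h>

// Toy model (hypothetical, not the RP2 hardware): one one-shot alarm
// plus one unrelated periodic interrupt, with time in microseconds.
typedef struct {
    uint64_t now_us;         // current simulated time
    uint64_t alarm_us;       // deadline of the one-shot alarm
    bool armed;              // cleared when the alarm fires
    uint64_t tick_period_us; // period of the unrelated periodic interrupt
} sim_t;

// Advance simulated time; the alarm fires (disarms) once the deadline passes.
static void sim_advance(sim_t *s, uint64_t us) {
    s->now_us += us;
    if (s->armed && s->now_us >= s->alarm_us) {
        s->armed = false;
    }
}

// Model of wfi: block until the next interrupt that occurs after "now".
// An alarm that has already fired cannot wake this call.
static void sim_wfi(sim_t *s) {
    uint64_t wake = (s->now_us / s->tick_period_us + 1) * s->tick_period_us;
    if (s->armed && s->alarm_us < wake) {
        wake = s->alarm_us;
    }
    sim_advance(s, wake - s->now_us);
}

// Check-then-wait loop with gap_us of simulated time between the check
// and the wait. Returns how late the loop exits relative to the deadline.
static uint64_t sim_sleep_overshoot(sim_t *s, uint64_t gap_us) {
    while (s->armed) {
        sim_advance(s, gap_us); // window in which the alarm can fire unseen
        sim_wfi(s);
    }
    return s->now_us - s->alarm_us;
}
```

In this model, if the alarm fires inside the gap, the wait only ends at the next periodic tick, so the overshoot is bounded by the tick period (about 100 ms in the discussion above).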

Member

@peterharperuk tried to fix this before but there's not really any way to do it.

A proper fix needs the pico-sdk update (which fixes issues with the alarm pool and wfe), but the latest version of that has a new alarm pool bug...

Contributor

peterharperuk, Dec 16, 2024

We've fixed the latest issue and have a tentative plan to do a small sdk update in the new year.
Also, would this change prevent wakeup for USB interrupts?

Contributor Author

@peterharperuk, it does not prevent wakeup for USB interrupts; it just delays it, potentially by a lot.

Without my change the single __wfi() wakes up every 60 to 100 milliseconds due to the change in 74fb42a which added code using hardware timer 2.
USB interrupts will be masked for that time.

With the while loop I added, USB interrupts will be masked for the entire delay time.
But they will get serviced eventually.

I did notice while testing that if I did a lightsleep(10000) and hit the Thonny stop button, the prompt would not come back until the sleep expired.
With longer delays like lightsleep(30000), Thonny would time out if I hit the stop button.
But once the lightsleep completed (after 30 seconds), hitting the stop button again would bring it back.

I suggest that someone calling lightsleep() for a long time (e.g. several minutes) needs to be aware that Thonny will time out.

Contributor Author

@projectgus has a pull request open for the same issue.

Likely I will close this pull request without a merge and his fix will go in.

It is the end of the day where I live. I will get back to this tomorrow.

currenttime_us_64 = timer_time_us_64(timer_hw);
}

if (!timer3_enabled) {
irq_set_enabled(irq_num, false);
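
The deadline arithmetic in this diff can be checked on a host machine. The following is a minimal sketch, assuming nothing about the hardware: `alarm_target` and `should_sleep_again` are hypothetical helpers that only mirror the two expressions in the diff (the truncation written to the alarm register and the loop condition), with the armed bit passed in as a plain flag.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper: value written to a 32-bit alarm register for a
// 64-bit deadline. Only the low 32 bits of the deadline are kept.
static uint32_t alarm_target(uint64_t endtime_us_64) {
    return (uint32_t)(endtime_us_64 & 0xffffffffu);
}

// Hypothetical helper mirroring the loop condition in the diff: keep
// sleeping while the alarm is armed and more than 50 us remain.
static bool should_sleep_again(bool armed, uint64_t endtime_us_64,
                               uint64_t now_us_64) {
    return armed && (endtime_us_64 - now_us_64) > 50;
}
```

Two things are visible in this sketch. Truncating the 64-bit deadline gives the same low 32 bits as a wrapping 32-bit addition would. The subtraction in the loop condition is unsigned, so if the current time is already past the deadline the difference wraps to a very large value, and only the armed flag ends the loop.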