Skip to content

flush filesystem before resetting heap #1649

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 17, 2019

Conversation

dhalbert
Copy link
Collaborator

I think this fixes #1643.

There's a RAM filesystem cache on the heap, in some circumstances, and it was not flushed and discarded before the heap was reset.

Also fixed minor problem in display background task, setting reload exception after it was already set.

@dhalbert dhalbert requested a review from tannewt March 16, 2019 20:51
@jerryneedell
Copy link
Collaborator

tried on feather_nrf52840 - same results -- reboots to safe mode -- FS erased

@dhalbert
Copy link
Collaborator Author

@jerryneedell Thanks for testing. Your test is not running code.py; I'm thinking I need to do the same filesystem_flush() on the user-initiated soft reload. Will check on that later tonight.

@dhalbert
Copy link
Collaborator Author

@jerryneedell Added another flush, and I believe I tested your scenario successfully.

@makermelissa
Copy link
Collaborator

I just tried your commit @dhalbert and it already is appearing to function better (i.e. no crashes yet) on the Feather M4 Express.

@kevinjwalters
Copy link

Would this have affected 3.x too?

@jerryneedell
Copy link
Collaborator

jerryneedell commented Mar 17, 2019

works great on the feather_nrf52840!

works on pyportal as well!!
Nice job!

@dhalbert
Copy link
Collaborator Author

@kevinjwalters it's possible it was a latent issue in 3.x, but it might have been ameliorated by different timing. There is a ram flash cache in 3.x as well.

@dhalbert dhalbert changed the title after code.py runs, flush filesystem before resetting heap flush filesystem before resetting heap Mar 17, 2019
Copy link
Collaborator

@makermelissa makermelissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been playing with the Feather M4 Express for a while with this change and it's working much better. Looks like I don't have write access, so my approval doesn't mean much.

@uhrheber
Copy link

That didn't do it.
I merged your pull request, compiled and flashed it to a pca10059, then I copied my BLE colourpicker code + libs to the drive.
The code started to run, but stopped by itself after about 20 seconds, without me doing anything.
After unplugging and replugging, the drive was wiped.

@jerryneedell
Copy link
Collaborator

jerryneedell commented Mar 17, 2019

I tried this PR on my PCA10059 - It has been awhile since I used it and I don't recall the state I left it in. After loading the new image, I found that my FS was corrupted, but then after erasing and reloading the FS (using storage.erase_filesystem()), it has been worked normally -- with a BLE Uart test and Eddystone beacon test.

@jerryneedell
Copy link
Collaborator

jerryneedell commented Mar 17, 2019

Hmmm -- Update -- PCA10059 wiped itself -- not sure when or why but I first found that I could not run a script I had just loaded, and os.listdir() just showed a lot of 00's. I did a hard RESET and it booted OK, but all the files were gone.

added:
I have not been able to reproduce this behavior on the feather_nrf52840 -- running eddystone beacon - copying files to FS and running other scripts.

Is there a difference in how the PCA10059 handles this since it does not have an SPI Flash for the FS?

@jerryneedell
Copy link
Collaborator

This PR is a huge improvement over Beta4 in terms of stability for the FS for most boards. While the pca10059 appears to be an exception there may be a separate issue going on there.

In my experience, the PyPortal and feather_nrf5840 are working well and were extremely frustrating with Beta4.
Even if not perfect, I think a release of this update would be helpful.

@uhrheber
Copy link

I tested the pca10059 with a simple main.py, that doesn't use any external libs.
It runs stable. I can even torture the drive with f3write. No errors. The problems seem to be linked to bluetooth here.

With the pca10056, it's completely different.
It runs the bluetooth code without an error.
When I write test data to the drive with f3, while the code is running, the test data shows errors afterwards.
When I write test data with stopped code, I get no errors.

It seems that there are still some timing problems.

@dhalbert
Copy link
Collaborator Author

The difference here is that the PCA10059 uses internal flash for CIRCUITPY, and the PCA10056 uses external. So there's probably a different bug related to that. There's code to use the SoftDevice internal flash writing routines when the SoftDevice is enabled, but clearly there's still some issue. I will look at that.

@dhalbert
Copy link
Collaborator Author

I think I'm going to release a beta 5 immediately with just this fix, since beta 4 is too broken for most people to use.

Copy link
Collaborator

@jerryneedell jerryneedell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tested on pyportal, feather_nrf52840 and intsybitsy_m4
Recommend releasing even with remaining issues on pca10059/pca10056

@kevinjwalters
Copy link

Does this post need an edit or a reply on when beta 4 should or should not be used? https://forums.adafruit.com/viewtopic.php?f=60&t=149034

@dhalbert dhalbert merged commit 60f7694 into adafruit:master Mar 17, 2019
@dhalbert dhalbert deleted the reload-flush-filesystem branch March 17, 2019 13:17
@uhrheber
Copy link

If the bug affects the pca10056, then it should also affect the feather_nrf52840, as both are nRF52840 with external flash. Unless it's a SPI timing issue, and the flash chips behave differently.

@jerryneedell
Copy link
Collaborator

so far, I have found the feather_nrf52840 and a particle_argon to both behave well with this PR. I am just doing "normal" use -- running code and copying files to the FS then issuing soft reboot. I have not had any problems.

One related observation - not new but still occasionally occurs.
I have had seen few "odd" behaviors were a newly copied file does not run on the first attempt, but does run after a reboot or recopying - I don't have a clean example of this. This is not a new thing and I have always attributed it to the delayed file writes - I seem to avoid it by always syncing the FS and theOS after a copy.

@uhrheber
Copy link

@jerryneedell Would you mind trying this code?
It needs the libraries adafruit_ble and adafruit_bluefruit_connect. It's written for the pca10059, so you'll have to adapt the LED port pins.

from adafruit_ble.uart import UARTServer
import board, digitalio, pulseio
from adafruit_bluefruit_connect.packet import Packet
from adafruit_bluefruit_connect.color_packet import ColorPacket

ledr = pulseio.PWMOut(board.LED2_R, frequency=5000, duty_cycle=65535)
ledg = pulseio.PWMOut(board.LED2_G, frequency=5000, duty_cycle=65535)
ledb = pulseio.PWMOut(board.LED2_B, frequency=5000, duty_cycle=65535)

ledy = digitalio.DigitalInOut(board.LED1)
ledy.direction = digitalio.Direction.OUTPUT
ledy.value = True

uart_server = UARTServer()

def rgbled(colors):
    ledr.duty_cycle = (255 - colors[0]) << 8
    ledg.duty_cycle = (255 - colors[1]) << 8
    ledb.duty_cycle = (255 - colors[2]) << 8

while True:
    # Advertise when not connected.
    uart_server.start_advertising()
    ledy.value = False
    while not uart_server.connected:
        pass

    while uart_server.connected:
        ledy.value = True
        packet = Packet.from_stream(uart_server)
        if isinstance(packet, ColorPacket):
            print(packet.color) 
            rgbled(packet.color) 

This code wipes itself on the pca10059, but runs on the pca10056, so I guess it'll also run on the feather_nrf52840.

@jerryneedell
Copy link
Collaborator

@uhrheber I made a modified version since the feather_nrf52840 does not have the same LEDs - I just use 1 LED
I have run it on a particle argon - so far without issue.
Does this test it adequately?

from adafruit_ble.uart import UARTServer
import board, digitalio, pulseio
from adafruit_bluefruit_connect.packet import Packet
from adafruit_bluefruit_connect.color_packet import ColorPacket

ledr = pulseio.PWMOut(board.D7, frequency=5000, duty_cycle=65535)


uart_server = UARTServer()

def rgbled(colors):
    ledr.duty_cycle = (255 - colors[0]) << 8

while True:
    # Advertise when not connected.
    uart_server.start_advertising()
    while not uart_server.connected:
        pass

    while uart_server.connected:
        packet = Packet.from_stream(uart_server)
        if isinstance(packet, ColorPacket):
            print(packet.color) 
            rgbled(packet.color) 

@jerryneedell
Copy link
Collaborator

it is also running OK on a feather_nrf52840

@dhalbert
Copy link
Collaborator Author

@uhrheber and @jerryneedell could you open a new issue for the PCA10059 issue, so we can track it? Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

File system problems and soft reboot hanging
5 participants