-
-
Notifications
You must be signed in to change notification settings - Fork 8.4k
extmod: add ROMFS filesystem and VfsRom driver #16446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Code size report:
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #16446 +/- ##
==========================================
+ Coverage 98.57% 98.59% +0.01%
==========================================
Files 164 167 +3
Lines 21349 21617 +268
==========================================
+ Hits 21045 21313 +268
Misses 304 304 ☔ View full report in Codecov by Sentry. |
Ah, there's still two lines of code that are uncovered by the test. Will fix. |
817a342
to
86d37c9
Compare
I tried to just add board specific files to this PR. But that did not work to get ROMFS. vfs.VfsRom is there, but at least the vfs.rom_ioctl() method is missing. And of course the changed mpremote. |
This commit is missing: |
86d37c9
to
6e86086
Compare
Thanks @robert-hh for looking at this. As mentioned in the top comment, I left out the integration with the bare-metal ports. The aim with this PR is to add the So this PR is a self-contained piece of work, to finalise the ROMFS filesystem itself and how it works with the memory mapping feature. Once this is merged, #8381 will be rebased and adds the final pieces of the puzzle to get it working on the ports, and with mpremote. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good to me! It's nice and clean, and I like how it's possible to construct a RomFs in RAM inside MicroPython and do tests from that. The unit tests are great example of what's possible with the new unittest
support, as well.
While I was reading the code I was trying to think of pathological cases that might perform poorly, but couldn't think of any really. Obviously directories with a huge number of entries will be pretty slow, but there's unlikely to be any of them. Plus I like how you've made the format extensible so additional cache data or something else can be added in, later.
extmod/vfs_rom.c
Outdated
const uint8_t *filesystem_top; | ||
}; | ||
|
||
static mp_uint_t extract_record(const uint8_t **fs, const uint8_t **fs_next) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't mean much because of C's type system, but it is nice for the reader of code to see "this variable is a record kind" or "this functions returns a record_kind". So I'd suggest typedef-ing the enum as a record_kind_t
and flagging the places an mp_uint_t
is expected to be a record_kind_t
, if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure it's possible to use the enum, because the FS should support arbitrary record kinds, eg they might be 32-bit numbers. If the C compiler sees that it's an enum with only a handful of possibilities, it may narrow the enum type to 8 bits, and hence a large and unknown record type might be truncated to 8 bits and alias a known one.
Instead, maybe it's better to change these constants to simple #define's, and then just have a typedef for record_kind_t
to be mp_uint_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a typedef for record_kind_t
and changed the enum to #define's (because it's part of the filesystem spec, IMO better to explicitly list the values for the records).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For once our old friend the C standard has good news! C99 6.4.4.3 Enumeration constants: "An identifier declared as an enumeration constant has type int." The storage size should be the same as mp_uint_t.
There is a gcc option, -fshort-enums
that allows the non-standard behaviour you're describing. I just noticed that it is currently turned on for the nrf port (only), but I suggest we disable it there rather than accommodate it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our replies crossed, but that seems fine to me too. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The storage size should be the same as mp_uint_t.
With nanboxing, mp_uint_t
is uint64_t
, which is bigger than int
.
There is a gcc option,
-fshort-enums
that allows the non-standard behaviour you're describing. I just noticed that it is currently turned on for the nrf port (only), but I suggest we disable it there rather than accommodate it.
IMO this is a useful compiler option to reduce code size and RAM usage (more likely a benefit for RAM usage) and it would be best not to have to disallow the use of this option just for this bit of code. Eg third party ports may want to use this option, and I don't know how easy it is for us to warn against using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our replies crossed, but that seems fine to me too. 😄
Haha, OK!
If you think it's fine, let's leave it as a typedef mp_uint_t record_kind_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how easy it is for us to warn against using it.
MP_STATIC_ASSERT(sizeof(some_enum) == sizeof(int))
would catch it, but I understand your reasoning. No worries.
break; | ||
case 't': | ||
type = &mp_type_vfs_rom_textio; | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn't noticed MicroPython treats invalid mode strings as 'r'
before, but I see it's the case for all filesystems. I don't see it documented as a CPython difference, should we document it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that's interesting.
Yes we should document this, or maybe fix it (raise an exception for unknown codes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made a note to look into this in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, great.
Maybe it's worth making a helper function for all filesystems to use, which implements this switch/logic. Then at least the issue at hand is confined to a single location.
) | ||
|
||
|
||
class VfsRomWriter: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a comment for this PR, but did you have any thoughts about how to unify this class and the one that will eventually be added in tools/mpremote/mpremote/romfs.py
? Maybe it's a good excuse to have some metadata in the test file that tells run-tests to mount a directory from the host when running it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to think of a good way to do this... but in the end the code in the test will only be the minimal code needed to create the test ROMFS. The code in mpremote's ROMFS implementation will eventually be more sophisticated, eg to support alignment. And probably we should have separate tests for that mpremote code, which tests the ROMFS writer.
Here, the aim is to test the VfsRom
driver.
The file lookup code will skip entire directories if the file/directory that's being searched for is not in that directory. So there is some amount of lookup-optimisation here already. But, yes, if a directory contains a lot of files and you want a file in that directory, it will be linear (number of files in the directory) in the search for the file. |
This commit defines a new ROMFS filesystem for storing read-only files that can be memory mapped, and a new VfsRom driver. Files opened from this filesystem support the buffer protocol. This allows naturally getting the memory-mapped address of the file using: - memoryview(file) - uctypes.addressof(file) Furthermore, if these files are .mpy files then their content can be referenced in-place when importing. Such imports take up a lot less RAM than importing from a normal filesystem. This is essentially dynamically frozen .mpy files, building on the revamped v6 .mpy file format. Signed-off-by: Damien George <damien@micropython.org>
Allows an interned string to reference static/ROM data, instead of allocating it on the GC heap. Signed-off-by: Damien George <damien@micropython.org>
This allows accessing data directly in ROM if the reader supports it. Signed-off-by: Damien George <damien@micropython.org>
If the file can be memory mapped (because it responds to the buffer protocol) then return a memory-reader that directly references the ROM data of the file. Signed-off-by: Damien George <damien@micropython.org>
This adds an optimisation for loading .mpy files from a reader that points to ROM. In such a case qstr, str and bytes data, along with bytecode, are all referenced in-place in ROM. Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
Provides full coverage of the VfsRom driver. Signed-off-by: Damien George <damien@micropython.org>
b5b1ca0
to
c732041
Compare
Summary
This is split out from #8381. This PR includes just the minimal number of commits from #8381 to get ROMFS support added:
What's missing is integration into each port. That's left for #8381.
Testing
This PR includes a comprehensive unittest for VfsRom that runs under unix and qemu ports, and gets full coverage of the VfsRom driver. It's also been tested as part of #8381.