Skip to content

Conversation

dpgeorge
Copy link
Member

Summary

This is split out from #8381. This PR includes just the minimal number of commits from #8381 to get ROMFS support added:

  • the core ROMFS filesystem definition
  • the corresponding VfsRom driver
  • ability to memory map files
  • ability to import .mpy files in-place from the ROMFS
  • enabled on unix and qemu ports
  • a unittest-based test for the functionality

What's missing is integration into each port. That's left for #8381.

Testing

This PR includes a comprehensive unittest for VfsRom that runs under unix and qemu ports, and gets full coverage of the VfsRom driver. It's also been tested as part of #8381.

@dpgeorge dpgeorge added the extmod Relates to extmod/ directory in source label Dec 19, 2024
@dpgeorge dpgeorge requested a review from projectgus December 19, 2024 00:29
Copy link

github-actions bot commented Dec 19, 2024

Code size report:

   bare-arm:    +4 +0.007% 
minimal x86:    +6 +0.003% 
   unix x64: +5240 +0.623% standard[incl +736(data)]
      stm32:    +8 +0.002% PYBV10
     mimxrt:    +0 +0.000% TEENSY40
        rp2:    +0 +0.000% RPI_PICO_W
       samd:    +8 +0.003% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32: +2210 +0.554% VIRT_RV32

Copy link

codecov bot commented Dec 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.59%. Comparing base (4729a89) to head (c732041).
Report is 8 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #16446      +/-   ##
==========================================
+ Coverage   98.57%   98.59%   +0.01%     
==========================================
  Files         164      167       +3     
  Lines       21349    21617     +268     
==========================================
+ Hits        21045    21313     +268     
  Misses        304      304              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@dpgeorge
Copy link
Member Author

Ah, there's still two lines of code that are uncovered by the test. Will fix.

@dpgeorge dpgeorge added this to the release-1.25.0 milestone Dec 19, 2024
@robert-hh
Copy link
Contributor

I tried to just add board specific files to this PR. But that did not work to get ROMFS. vfs.VfsRom is there, but at least the vfs.rom_ioctl() method is missing. And of course the changed mpremote.

@robert-hh
Copy link
Contributor

robert-hh commented Dec 19, 2024

This commit is missing:
5349f9207 2024-11-15 extmod/modvfs: Add rom_ioctl function.
Adding it reveals vfs_rom_ioctl(), and adding suitable board files then provide a ROMFS.
Compiling for boards w/o romfs support still works.

@dpgeorge
Copy link
Member Author

Thanks @robert-hh for looking at this. As mentioned in the top comment, I left out the integration with the bare-metal ports. The aim with this PR is to add the VfsRom driver and integration with the import machinery so that .mpy files can be imported in-place. That's all tested with the new test added here, and enabled on unix and qemu for testing.

So this PR is a self-contained piece of work, to finalise the ROMFS filesystem itself and how it works with the memory mapping feature.

Once this is merged, #8381 will be rebased and adds the final pieces of the puzzle to get it working on the ports, and with mpremote.

Copy link
Contributor

@projectgus projectgus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good to me! It's nice and clean, and I like how it's possible to construct a RomFs in RAM inside MicroPython and do tests from that. The unit tests are great example of what's possible with the new unittest support, as well.

While I was reading the code I was trying to think of pathological cases that might perform poorly, but couldn't think of any really. Obviously directories with a huge number of entries will be pretty slow, but there's unlikely to be any of them. Plus I like how you've made the format extensible so additional cache data or something else can be added in, later.

extmod/vfs_rom.c Outdated
const uint8_t *filesystem_top;
};

static mp_uint_t extract_record(const uint8_t **fs, const uint8_t **fs_next) {
Copy link
Contributor

@projectgus projectgus Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't mean much because of C's type system, but it is nice for the reader of code to see "this variable is a record kind" or "this functions returns a record_kind". So I'd suggest typedef-ing the enum as a record_kind_t and flagging the places an mp_uint_t is expected to be a record_kind_t, if possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's possible to use the enum, because the FS should support arbitrary record kinds, eg they might be 32-bit numbers. If the C compiler sees that it's an enum with only a handful of possibilities, it may narrow the enum type to 8 bits, and hence a large and unknown record type might be truncated to 8 bits and alias a known one.

Instead, maybe it's better to change these constants to simple #define's, and then just have a typedef for record_kind_t to be mp_uint_t.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a typedef for record_kind_t and changed the enum to #define's (because it's part of the filesystem spec, IMO better to explicitly list the values for the records).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For once our old friend the C standard has good news! C99 6.4.4.3 Enumeration constants: "An identifier declared as an enumeration constant has type int." The storage size should be the same as mp_uint_t.

There is a gcc option, -fshort-enums that allows the non-standard behaviour you're describing. I just noticed that it is currently turned on for the nrf port (only), but I suggest we disable it there rather than accommodate it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our replies crossed, but that seems fine to me too. 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The storage size should be the same as mp_uint_t.

With nanboxing, mp_uint_t is uint64_t, which is bigger than int.

There is a gcc option, -fshort-enums that allows the non-standard behaviour you're describing. I just noticed that it is currently turned on for the nrf port (only), but I suggest we disable it there rather than accommodate it.

IMO this is a useful compiler option to reduce code size and RAM usage (more likely a benefit for RAM usage) and it would be best not to have to disallow the use of this option just for this bit of code. Eg third party ports may want to use this option, and I don't know how easy it is for us to warn against using it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our replies crossed, but that seems fine to me too. 😄

Haha, OK!

If you think it's fine, let's leave it as a typedef mp_uint_t record_kind_t.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how easy it is for us to warn against using it.

MP_STATIC_ASSERT(sizeof(some_enum) == sizeof(int)) would catch it, but I understand your reasoning. No worries.

break;
case 't':
type = &mp_type_vfs_rom_textio;
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't noticed MicroPython treats invalid mode strings as 'r' before, but I see it's the case for all filesystems. I don't see it documented as a CPython difference, should we document it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's interesting.

Yes we should document this, or maybe fix it (raise an exception for unknown codes).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made a note to look into this in the future

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, great.

Maybe it's worth making a helper function for all filesystems to use, which implements this switch/logic. Then at least the issue at hand is confined to a single location.

)


class VfsRomWriter:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a comment for this PR, but did you have any thoughts about how to unify this class and the one that will eventually be added in tools/mpremote/mpremote/romfs.py? Maybe it's a good excuse to have some metadata in the test file that tells run-tests to mount a directory from the host when running it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to think of a good way to do this... but in the end the code in the test will only be the minimal code needed to create the test ROMFS. The code in mpremote's ROMFS implementation will eventually be more sophisticated, eg to support alignment. And probably we should have separate tests for that mpremote code, which tests the ROMFS writer.

Here, the aim is to test the VfsRom driver.

@dpgeorge
Copy link
Member Author

Obviously directories with a huge number of entries will be pretty slow, but there's unlikely to be any of them.

The file lookup code will skip entire directories if the file/directory that's being searched for is not in that directory. So there is some amount of lookup-optimisation here already. But, yes, if a directory contains a lot of files and you want a file in that directory, it will be linear (number of files in the directory) in the search for the file.

This commit defines a new ROMFS filesystem for storing read-only files that
can be memory mapped, and a new VfsRom driver.  Files opened from this
filesystem support the buffer protocol.  This allows naturally getting the
memory-mapped address of the file using:
- memoryview(file)
- uctypes.addressof(file)

Furthermore, if these files are .mpy files then their content can be
referenced in-place when importing.  Such imports take up a lot less RAM
than importing from a normal filesystem.  This is essentially dynamically
frozen .mpy files, building on the revamped v6 .mpy file format.

Signed-off-by: Damien George <damien@micropython.org>
Allows an interned string to reference static/ROM data, instead of
allocating it on the GC heap.

Signed-off-by: Damien George <damien@micropython.org>
This allows accessing data directly in ROM if the reader supports it.

Signed-off-by: Damien George <damien@micropython.org>
If the file can be memory mapped (because it responds to the buffer
protocol) then return a memory-reader that directly references the ROM data
of the file.

Signed-off-by: Damien George <damien@micropython.org>
This adds an optimisation for loading .mpy files from a reader that points
to ROM.  In such a case qstr, str and bytes data, along with bytecode, are
all referenced in-place in ROM.

Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
Provides full coverage of the VfsRom driver.

Signed-off-by: Damien George <damien@micropython.org>
@dpgeorge dpgeorge merged commit c732041 into micropython:master Dec 23, 2024
65 checks passed
@dpgeorge dpgeorge deleted the extmod-add-vfs-rom branch December 23, 2024 02:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extmod Relates to extmod/ directory in source
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants