Skip to content

py/parse: Zero dangling parse tree pointer to fix possible leak. #14136

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

projectgus
Copy link
Contributor

This fixes a bug where a random Python object may become un-garbage-collectable until an enclosing Python file (compiled on device) finishes executing:

Details

The mp_parse_tree_t structure is stored on the stack in top-level functions such as parse_compile_execute() in pyexec.c (and others).

Although it quickly falls out of scope in these functions, it is usually still in the current stack frame when the compiled code executes. (Compiler dependent, but usually it's one stack push per function.)

This means if any Python object happens to allocate at the same address as the (freed) root parse tree chunk, it's un-garbage-collectable as there's a (dangling) pointer up the stack referencing this same address.

As reported by @GitHubsSilverBullet here:
https://github.com/orgs/micropython/discussions/14116#discussioncomment-8837214

This work was funded through GitHub Sponsors.

@projectgus projectgus added the py-core Relates to py/ directory in source label Mar 20, 2024
Copy link

codecov bot commented Mar 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.39%. Comparing base (9d27183) to head (a6973ef).

❗ Current head a6973ef differs from pull request most recent head 71044a4. Consider uploading reports for the commit 71044a4 to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #14136   +/-   ##
=======================================
  Coverage   98.39%   98.39%           
=======================================
  Files         161      161           
  Lines       21199    21200    +1     
=======================================
+ Hits        20859    20860    +1     
  Misses        340      340           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

Code size report:

   bare-arm:    +4 +0.007% 
minimal x86:    +8 +0.004% 
   unix x64:   +24 +0.003% standard
      stm32:    +8 +0.002% PYBV10
     mimxrt:    +8 +0.002% TEENSY40
        rp2:    +0 +0.000% RPI_PICO
       samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS

@projectgus
Copy link
Contributor Author

I did a quick audit of other functions calling m_del in similar context and didn't find any obvious ones, however I think it's possible for something similar to happen with any normal C automatic variables:

  1. Function A allocates a 16 byte stack frame, uses 3 words of it.
  2. The highest word is a pointer to the heap which is freed.
  3. Function A returns.
  4. Function B is called, allocates a 16 byte stack frame (same address on the stack as A), only uses 2 words of it, calls into child function C.
  5. The pointer from step 2 is still present in valid stack memory from now until function B returns.

I think the only way around this reliably would be a m_free_and_null(p) function that nulls out the argument pointer. May be worth considering?

@projectgus
Copy link
Contributor Author

This means if any Python object happens to allocate at the same address

It's not even "at" same address I realise, it only has to overlap the dangling pointer address.

This fixes a bug where a random Python object may become
un-garbage-collectable until an enclosing Python file (compiled on device)
finishes executing.

Details:

The mp_parse_tree_t structure is stored on the stack in top-level functions
such as parse_compile_execute() in pyexec.c (and others).

Although it quickly falls out of scope in these functions, it is usually
still in the current stack frame when the compiled code executes. (Compiler
dependent, but usually it's one stack push per function.)

This means if any Python object happens to allocate at the same address as
the (freed) root parse tree chunk, it's un-garbage-collectable as there's a
(dangling) pointer up the stack referencing this same address.

As reported by @GitHubsSilverBullet here:
https://github.com/orgs/micropython/discussions/14116#discussioncomment-8837214

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
@dpgeorge dpgeorge force-pushed the bugfix/dangling_parse_pointer branch from a6973ef to 71044a4 Compare March 21, 2024 23:49
@dpgeorge dpgeorge merged commit 71044a4 into micropython:master Mar 21, 2024
58 checks passed
@dpgeorge
Copy link
Member

Thank you!

I think the only way around this reliably would be a m_free_and_null(p) function that nulls out the argument pointer. May be worth considering?

Maybe...

@dpgeorge
Copy link
Member

I think the only way around this reliably would be a m_free_and_null(p) function that nulls out the argument pointer. May be worth considering?

That wouldn't have helped this case if it were written as:

    m_free_and_null(&chunk);

because chunk is a local variable. Instead it would have needed some more sophisticated code for the un-link list logic.

@projectgus
Copy link
Contributor Author

@dpgeorge good point, so yeah maybe it's OK to do it piecemeal if/when these things come up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
py-core Relates to py/ directory in source
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants