unix: Add aflplusplus variant for fuzzing. #17814

Open: wants to merge 3 commits into base: master

jepler (Contributor) commented Aug 2, 2025

This also includes #17813

Summary

Recently I reported a number of core issues found by the AFLplusplus fuzzer. This PR introduces a variant that is useful for running under AFLplusplus, and provides a small overview of the process of actually running the fuzzer.

Testing

I've run the fuzzing build locally for a few dozen CPU hours and reported most of the interesting findings.

Trade-offs and Alternatives

There are other fuzzers out there. It would probably be infeasible to support them all. AFL is the one I started with, and today its successor AFLplusplus is an active project.

codecov bot commented Aug 2, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (master@255d74b). Learn more about missing BASE report.

Additional details and impacted files
@@            Coverage Diff            @@
##             master   #17814   +/-   ##
=========================================
  Coverage          ?   98.38%           
=========================================
  Files             ?      171           
  Lines             ?    22283           
  Branches          ?        0           
=========================================
  Hits              ?    21924           
  Misses            ?      359           
  Partials          ?        0           

github-actions bot commented Aug 2, 2025

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +0 +0.000% PYBV10
     mimxrt:    +0 +0.000% TEENSY40
        rp2:    +0 +0.000% RPI_PICO_W
       samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:    +0 +0.000% VIRT_RV32

jepler (Contributor Author) commented Aug 2, 2025

Are crashes in @micropython.native decorated functions of interest? If not, I'll disable it in this variant.

This crashes, but would be an UnboundLocalError in standard Python:

@micropython.native
def f1(n):
    for i in range(i):
        print(i)


f1(4)

@micropython.viper looks like another license to crash, so I'll disable it. Here's one that looks more interesting than accessing arbitrary memory, though:

@micropython.viper
def f(dest_in):
    6(dest_in)

This hits an assertion error:

micropython: ../../py/emitnative.c:2846: void emit_native_call_function(emit_t *, mp_uint_t, mp_uint_t, mp_uint_t): Assertion `vtype_fun == VTYPE_PYOBJ' failed.

There are more assertions of that type to hit:

micropython: ../../py/emitnative.c:2747: void emit_native_unpack_sequence(emit_t *, mp_uint_t): Assertion `vtype_base == VTYPE_PYOBJ' failed.
micropython: ../../py/emitnative.c:1497: void emit_native_load_attr(emit_t *, qstr): Assertion `vtype_base == VTYPE_PYOBJ' failed.
micropython: ../../py/emitnative.c:1497: void emit_native_load_attr(emit_t *, qstr): Assertion `vtype_base == VTYPE_PYOBJ' failed.


1. Install AFLplusplus so that the program `afl-cc` is on $PATH
1. `cd ports/unix && make VARIANT=aflplusplus -j$(nproc)`
1. Gather your inputs (e.g., from test cases in `tests`)

dlech (Contributor) commented Aug 5, 2025

How do you gather inputs?

Maybe this is obvious to anyone who has used a fuzzer before, in which case we don't need any more details here.

Since I saw all of the bugs you have been posting recently, I was curious what your process was since I've never done anything like that before. So nice to see this PR to shed a little light on how to at least get started.

jepler (Contributor Author) replied:

That's a fine question to ask. What I wrote above was pretty terse!

As you probably know by now, a fuzzer creates new test cases largely by making modifications to existing test cases according to various strategies. Then, it runs the test case in the hopes of seeing new behavior or especially a crash.
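
To make that loop concrete, here is a toy sketch in Python. It is not how AFLplusplus is implemented: the three mutation strategies, the `run_target` callback, and the idea of summarizing "behavior" as a single hashable value are all assumptions made for illustration. A real fuzzer tracks coverage inside the instrumented binary and has many more strategies.

```python
import random


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random byte-level change (illustrative strategies only)."""
    buf = bytearray(data)
    choice = rng.choice(("flip", "insert", "delete"))
    if choice == "flip" and buf:
        pos = rng.randrange(len(buf))
        buf[pos] ^= 1 << rng.randrange(8)  # flip a single bit
    elif choice == "insert":
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif choice == "delete" and buf:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(seeds, run_target, rounds, rng):
    """Mutate corpus entries; keep inputs with new behavior, record crashes.

    run_target is a hypothetical callback: it returns a hashable summary of
    what the input did and raises an exception to signal a crash.
    """
    corpus = list(seeds)
    seen = {run_target(seed) for seed in corpus}  # assumes seeds do not crash
    crashes = []
    for _ in range(rounds):
        candidate = mutate(rng.choice(corpus), rng)
        try:
            behavior = run_target(candidate)
        except Exception:
            crashes.append(candidate)
            continue
        if behavior not in seen:  # new behavior: worth mutating further
            seen.add(behavior)
            corpus.append(candidate)
    return corpus, crashes
```

The point of the sketch is the feedback step: an input that does something new joins the corpus and becomes a starting point for later mutations, which is why a varied set of starting inputs matters.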

The Python files in MicroPython's tests directory are a great starting point, because they give the fuzzer a set of inputs that can reach a large variety of spots in the MicroPython source: around 98% of source code lines, according to the coverage report.

On the other hand, some tests are not good for fuzzing. For instance, it's highly desirable that the fuzzing tests are quick. Like, millisecond quick. Any test that calls "sleep" with a non-zero argument, for instance, is probably slow.

In various runs I've done things like use all tests in basics/, all tests in extmod/, etc. I've used grep with a pattern like 'sleep.*[1-9]' to find and remove scripts that sleep. Lengthy scripts and "stress" scripts are also good candidates for removal. I've also used afl-cmin according to AFLplusplus directions to remove tests that do not add any "coverage" (e.g., if test3.py doesn't reach any paths in the C code that test1.py and test2.py don't already reach, then remove test3.py entirely).
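
The filtering described above could be scripted roughly as follows. This is a sketch, not the procedure actually used: the 200-line cutoff, the "stress" filename check, the output naming scheme, and the `inputs` directory name in the usage note are assumptions. Only the `sleep.*[1-9]` pattern comes from the description above.

```python
import re
import shutil
from pathlib import Path

SLEEP_RE = re.compile(r"sleep.*[1-9]")  # same pattern as the grep above
MAX_LINES = 200  # hypothetical cutoff for "lengthy" scripts


def is_good_seed(path: Path) -> bool:
    """Reject stress tests, long scripts, and scripts that sleep."""
    if "stress" in path.name:
        return False
    lines = path.read_text(errors="replace").splitlines()
    if len(lines) > MAX_LINES:
        return False
    return not any(SLEEP_RE.search(line) for line in lines)


def gather_seeds(test_dirs, out_dir):
    """Copy acceptable *.py tests from each directory into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    copied = []
    for test_dir in test_dirs:
        for path in sorted(Path(test_dir).glob("*.py")):
            if is_good_seed(path):
                # Prefix with the directory name to avoid filename clashes.
                dest = out / f"{Path(test_dir).name}_{path.name}"
                shutil.copyfile(path, dest)
                copied.append(dest)
    return copied
```

For example, `gather_seeds(["tests/basics", "tests/extmod"], "inputs")` would fill a directory that can then be pruned further with afl-cmin.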

Usually an example of a crash from AFLplusplus is much bigger than it strictly needs to be. And the fuzzer might discover many variations on the same theme. For instance, in the recent run that turned up #17815 there were probably 50+ variations and one of them was 144 lines of code, where the human-minimized version was just 2 lines. The afl-tmin program from AFLplusplus can also minimize a test case, but I haven't been using it lately.
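
As an illustration of what minimizing means here, below is a greedy line-removal sketch in Python. It is neither afl-tmin's algorithm nor the manual process used for these reports. The `still_crashes` predicate is hypothetical: in practice it would run the fuzzing build on the candidate script and report whether the same failure still occurs.

```python
def minimize_lines(source: str, still_crashes) -> str:
    """Drop lines one at a time for as long as the crash still reproduces."""
    lines = source.splitlines()
    changed = True
    while changed:  # repeat until a full pass removes nothing
        changed = False
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + 1:]
            if still_crashes("\n".join(candidate)):
                lines = candidate  # line i was not needed; stay at index i
                changed = True
            else:
                i += 1  # line i is required; move on
    return "\n".join(lines)
```

A line-based pass like this can shrink a long crashing script considerably, but it cannot simplify within a line, so a human pass is usually still worthwhile.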

jepler added 3 commits August 5, 2025 15:16
This is useful to turn off when fuzzing, as improper use of these
typecodes can crash MicroPython.

Signed-off-by: Jeff Epler <jepler@gmail.com>
Under `afl-cc` (acting as a wrapper for clang), the following
diagnostic occurs (wrapped for clarity):
```
../../py/objint_longlong.c:232:32: error:
    comparison of integers of different signs:
    'long long' and 'unsigned long' [-Werror,-Wsign-compare]
```

Add a cast to silence it. The value is known statically to fit inside
`long long`.

Signed-off-by: Jeff Epler <jepler@gmail.com>
Signed-off-by: Jeff Epler <jepler@gmail.com>