-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
Submit the re, json, csv, & struct modules to oss-fuzz testing #73691
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
For reference, read https://github.com/google/oss-fuzz. We should investigate creating fuzz targets for the Python re module (_sre.c) at a minimum. There are probably other good targets as well such as _json.c and _csv.c. pickle and marshal are not intended to be secure so ignore those. |
It does not appear to me that targets have to be security critical, though that is certainly a good place to start. The Chrome tests found 100s of "security vulnerabilities and stability bugs". The important thing is that there be someone willing to receive and act on reports. Would 'make public after 90 days' ever be a problem? AFAIK, most Python security issues are already public here on the tracker from day 1. |
Aha, I found an existing issue! For adding to oss-fuzz, is there a contact email we can use that is connected to a google account? I am tempted to just put gregory.p.smith on there if not. :) I can volunteer to fuzz some interesting subset of the stdlib. The list I've come up with (by counting uses in my code) is: the XML parser (which seems to be written in C) I'd also suggest the ast module, since people do use ast.literal_eval on untrusted strings, but I probably won't do that one myself. I wrote a fuzz test for json via upstream simplejson, but the bug on github is getting stale: simplejson/simplejson#163 Should I add it to CPython instead?
If we prioritize based on security risk, I'd argue that this is lower priority than things like json's speedup extension module, because people should generally not pass untrusted strings to the re module: it's very easy to DOS a service with regexes unless you're using RE2 or similar -- which is fuzzed. In contrast, json is supposed to accept untrusted input and people do that very often. (OTOH, I would be willing to bet that fuzzing re will yield more bugs than fuzzing json.) |
you can list me as a oss-fuzz contact. use my work email address. simplejson is worthy but as both it and the python standard library ship separately people use both so they both ultimately deserve fuzzing and fixing on their own so I'd add it to CPython as well. |
google/oss-fuzz#583 is the PR to oss-fuzz to add the project. I'm working on actual tests to be submitted here. |
As I read 583, they are planning to fuzz 3.6. Why not branch master? I think it more likely that we accidentally add a vulnerability to master then that we accidentally close one. |
I think they misspoke, it's normal with fuzzing to test against master. The current draft of the code runs this git pull before building/launching any tests:
Speaking of which, I forgot to update this bug thread with the followup PR to actually run CPython's fuzz tests (when they exist): google/oss-fuzz#731. That's where I grabbed the git clone statement from. I think that will be merged after some version of PR 2878 lands in CPython (still in code review / broken). For Python 2 I guess it's different, and we will test against the 2.7 branch, right? |
alright, with that in, feel free to figure out the oss-fuzz configuration side and fire things up Devin. :) |
GCC complains about the patch: /home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c: In function ‘LLVMFuzzerTestOneInput’:
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:109:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
#if _Py_FUZZ_YES(fuzz_builtin_float)
^~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:109:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:112:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
#if _Py_FUZZ_YES(fuzz_builtin_int)
^~~~~~~~~~~~~~~~~~~~~~~~~
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:112:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:115:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
#if _Py_FUZZ_YES(fuzz_builtin_unicode)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:115:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined] |
Huh. I would not have predicted that. https://gcc.gnu.org/onlinedocs/cpp/Defined.html I'll send a fix. |
So here's an interesting issue: oss-fuzz requires that the built location be movable. IOW, we build Python into $OUT, and then the $OUT directory gets moved somewhere else and the fuzz test gets run from there. This causes problems because Python can no longer find where the modules it needs are (encodings for example). First thought: wouldn't it be nice if we could make a prepackaged and hermetic executable that we can move around freely? Second thought: isn't that "Hermetic Python", as used within Google? Third thought: doesn't Google have an internal fuzz testing environment we can use, instead of oss-fuzz? So unless someone says this is a bad idea, I'd propose we not run these in oss-fuzz and instead run them in Google proper. The alternative is if there's a way to make it easy to move Python around -- is there a way to build it s.t. the import path is relative and so on? |
i'd rather make this work in oss-fuzz on cpython. can you point me to how oss-fuzz works and what it wants to do so i can better understand what it needs? it it has an expectation that the thing being fuzzed is a single binary with no data or directory tree layout dependencies that can just be plopped somewhere else is not a great assumption to make. but environment variables _should_ be able to be set to point the python binary at what it needs if it must work that way. |
I don't have any details except for what's in the PR to oss-fuzz (google/oss-fuzz#731) My understanding is matches what you've said so far: Python is built to one directory (/out/), but then needs to be run from another directory (/out/ is renamed to /foo/bar/baz/out/). We need python to still work. I have no idea how to do this. The only suggestion on #python-dev IRC was to statically link a libpython.a, but this doesn't avoid needing to import libraries like "encodings" dynamically, so they still need to be locatable on disk. Is there a way to build python so that it doesn't use absolute paths to everything, and so that the install can be moved at will? Or is there a way to tell it that it was moved at runtime? (I am unconvinced PYTHONPATH is a maintainable solution, if it works at all...) oss-fuzz is not going to change away from its model (I asked if they could, they said no), so we're stuck with making Python compatible with it one way or another. This is why I am so drawn to running the test internally on Google's infrastructure anyway: we already _did_ all this work already, via hermetic python. Doing it a second time, but worse, seems annoying. |
kcc strongly disagrees though. Copying latest comment: """ We must figure out how to get this to build and run on the external oss-fuzz infrastructure |
misquote. that was me objecting to running it internally. :) i believe this is solvable, i haven't had time to spend on this part yet. |
Oops, so it is. I can't read apparently. I'll spend my time on making more fuzz tests in the meantime. |
Seems like it ought to be possible to use the same hooks that venv uses to make this work, but I haven't looked at the details of how those work. |
Hi, I've built a generic Python fuzzer and submitted it to OSS-Fuzz. It works by implementing a "def FuzzerRunOne(FuzzerInput):" function in Python in which some arbitrary code is run based on FuzzerInput, which is a bytes object. This is a more versatile solution than the current re, json, csv fuzzers as it requires no custom C code and adding more fuzzing targets is as easy as writing a new harness in Python and adding a build rule. Code coverage is measured at both the CPython level (.c) and the Python level (.py). CPython is compiled with AddressSanitizer. What this means is that both CPython memory bugs and Python library bugs (excessive memory consumption, hangs, slowdowns, unexpected exceptions) are expected to transpire. You can see my current set of fuzzers here: https://github.com/guidovranken/python-library-fuzzers The PR to OSS-Fuzz is google/oss-fuzz#2567 Currently, the only Python maintainer who will be receiving automated bug reports is gpshead. Are there any other developers who normally process Python security bug reports and would like to receive notifications? Feel free to respond directly in the OSS-Fuzz PR thread. |
All the modules prescribed in the original bug are now actively being fuzzed, thank you to Devin for the original work on this and Gregory for reviewing all these changes. I'm closing this now, feel free to make a new bug and nosy me in if there are any other C-based modules commonly exposed to user input that should be fuzzed. |
Now that our int<->str conversions are size limited and we have the _pylong module handling larger integers, we don't need to limit everything just to avoid wasting time in the quadratic time DoS-like case while fuzzing. We can tweak these further after seeing how this goes.
In anybody has a clone of the, now former https://github.com/guidovranken/python-library-fuzzers repo. Please let us know. The author has deleted their project. |
@gpshead has anyone emailed Guido? If not, I'm happy to. |
I emailed them earlier today. But we've since turned up copies so I think we're all good now to figure out if we want to keep running that. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: