A few of the most significant or interesting bugs found by Atheris. If you found a bug that you believe should be included here, feel free to send a PR.
Pillow is the most popular Python image processing library.
CVE-2020-35653 is a heap buffer overflow that could occur when decoding a malicious PCX-format image, because the decoder used certain "stride" size information from the image header, without verifying that the stride didn't result in reading data outside the image bounds.
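The missing check can be illustrated with a minimal sketch. This is not Pillow's code: `read_rows` and its parameters are hypothetical names chosen for illustration, and Python slices are already bounds-safe, so the explicit validation only stands in for what a C decoder would have to do before trusting a stride taken from a file header.

```python
def read_rows(data: bytes, height: int, row_bytes: int, stride: int):
    """Return `height` rows of `row_bytes` bytes, spaced `stride` bytes apart.

    Hypothetical helper: validates the header-supplied stride against the
    actual buffer size before any row is read.
    """
    if height <= 0 or row_bytes <= 0:
        raise ValueError("image dimensions must be positive")
    if stride < row_bytes:
        raise ValueError("stride is smaller than one row of pixels")
    # Offset of the last byte touched by the last row, plus one.
    needed = (height - 1) * stride + row_bytes
    if needed > len(data):
        raise ValueError("stride would read past the end of the image data")
    return [data[y * stride : y * stride + row_bytes] for y in range(height)]
```

With an 8-byte buffer, two 3-byte rows at a stride of 4 are accepted, while a stride of 6 is rejected because the second row would end at offset 9.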
Ultrajson is a fast, drop-in replacement for Python's built-in JSON parsing library.
Atheris has found a number of overflow vulnerabilities in Ultrajson, including an overflow of a 64k stack buffer in objToJSON(), an overflow of a 32k heap buffer in JSON_EncodeObject(), and several other memory corruption bugs.
Under certain circumstances, passing invalid unicode to the CPython interpreter's PyUnicode_DecodeUTF8Stateful() would cause a malloc() of incorrect size rather than returning an error.
A number of differential fuzzers have been written for Python, which can often find parsing bugs when two libraries are designed for parsing the same grammar.
When given numbers that are too big to fit in a 64-bit integer, Ultrajson raises an exception (whereas Python's built-in library can parse them). This is actually permitted by the JSON standard. However, some too-big numbers are decoded, but to the wrong values.
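A differential check for this kind of bug might look like the sketch below. It uses only the standard library: `toy_fast_loads` is a hypothetical stand-in that deliberately wraps integers to signed 64 bits to imitate the class of bug, and is not Ultrajson's actual behavior. `differential_check` is likewise an illustrative name.

```python
import json


def _wrap_int64(value):
    """Recursively wrap every integer to signed 64-bit, like a naive C decoder."""
    if isinstance(value, bool):
        return value  # bool is a subclass of int; leave it alone
    if isinstance(value, int):
        return (value + 2**63) % 2**64 - 2**63
    if isinstance(value, list):
        return [_wrap_int64(v) for v in value]
    if isinstance(value, dict):
        return {k: _wrap_int64(v) for k, v in value.items()}
    return value


def toy_fast_loads(text):
    """Hypothetical buggy decoder, used only to give the check something to find."""
    return _wrap_int64(json.loads(text))


def differential_check(data: bytes):
    """Return None if both decoders agree, else (reference_result, other_result)."""
    try:
        text = data.decode("utf-8")
        expected = json.loads(text)
    except (ValueError, RecursionError):
        return None  # the reference parser rejects the input; nothing to compare
    try:
        actual = toy_fast_loads(text)
    except ValueError:
        return None  # raising is acceptable; silently wrong values are not
    # Compare re-serialized forms so that NaN does not count as a mismatch.
    if json.dumps(actual, sort_keys=True) != json.dumps(expected, sort_keys=True):
        return (expected, actual)
    return None
```

In a real fuzzer, a function like this would be called from the entry point passed to `atheris.Setup()`, raising an exception whenever a mismatch is returned. The key design choice is to treat an exception from the second decoder as acceptable and only flag inputs that both decoders accept but decode differently.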
The Python idna package and the native libidn2 library are both used for converting Internationalized Domain Names (containing Unicode characters) into the "Punycode" ASCII format actually used by DNS. Because Python has such good Unicode support, the Python idna package does this entirely correctly; however, libidn2 relies on outdated Unicode metadata tables: it supports Unicode 11 but uses Unicode 9 tables. This mismatch causes it to decode some internationalized domain names incorrectly, such as İ᷹.com. As a result, a domain name could resolve to a different website depending on who accesses it, and, if a legitimate website ever uses such characters, someone could impersonate that website to any tool using libidn2.
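To make the conversion and the comparison concrete, here is a small sketch that uses only Python's standard library. Neither encoder below is the idna package or libidn2: `to_ascii_stdlib` uses the built-in `idna` codec (which implements the older IDNA 2003 rules), and `naive_to_ascii` is a hypothetical encoder that skips normalization, included only so the two have something to disagree about.

```python
def to_ascii_stdlib(domain):
    """Encode a domain with Python's built-in IDNA 2003 codec."""
    return domain.encode("idna").decode("ascii")


def naive_to_ascii(domain):
    """Hypothetical encoder: Punycode each non-ASCII label without normalizing."""
    labels = []
    for label in domain.split("."):
        if label.isascii():
            labels.append(label)
        else:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


ENCODERS = {"stdlib": to_ascii_stdlib, "naive": naive_to_ascii}


def compare_encoders(domain, encoders):
    """Return {name: result} if the encoders disagree on `domain`, else None."""
    results = {}
    for name, encode in encoders.items():
        try:
            results[name] = encode(domain)
        except UnicodeError as exc:
            results[name] = "error: " + type(exc).__name__
    return results if len(set(results.values())) > 1 else None
```

For example, `to_ascii_stdlib("bücher.com")` yields `xn--bcher-kva.com`, and both encoders agree on that input. They disagree on `Bücher.com`, because only the standard-library codec case-folds the label before encoding it.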
Python offers libraries that can safely parse Python code without executing it. However, input containing too many curly braces, while totally invalid, causes an exponentially increasing slowdown (a DoS) in Python 3.10.
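The text does not name the libraries involved; assuming they include the standard `ast` module, the sketch below shows what parsing without executing looks like with `ast.literal_eval()`. The wrapper name `try_parse_literal` is hypothetical.

```python
import ast


def try_parse_literal(source):
    """Parse a Python literal without executing code.

    Returns (True, value) on success and (False, None) if the input is
    rejected. The exception list follows the ast.literal_eval() docs.
    """
    try:
        return True, ast.literal_eval(source)
    except (ValueError, TypeError, SyntaxError, MemoryError, RecursionError):
        return False, None
```

A literal such as `"{'a': [1, 2]}"` is parsed into a dictionary, while an expression such as `"__import__('os').getcwd()"` is rejected instead of being run. Note that catching exceptions does not protect against slow inputs like the one described above, so a fuzz harness or a service would still need a timeout.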
Certain malicious inputs can cause infinite recursion in Pygments' get_tokens_unprocessed() function.
These bugs were found by OSS-Fuzz in Python projects via Atheris. (Access to some bugs may be restricted.)