Improving auto-generated dictionary of Cmplog #2493
Background: We observed that on FuzzBench, Honggfuzz performs better than AFL++ on the proj4 benchmark (for example, here). Over the past month, we have been investigating the reasons and attempting to further improve AFL++.
For proj4, we noticed that Honggfuzz’s auto-generated dictionary is of much higher quality than AFL++’s. Honggfuzz employs a simple strategy when collecting string constants for its dictionary: during `memcmp`/`strcmp` operations, it checks whether one of the pointers originates from the ELF file’s mapped memory (typically writable or non-writable data sections), excluding pointers from the heap. For example, in a comparison against a string constant, the data being compared usually lives on the heap, while the string constant pointer points into the ELF’s read-only data section. When porting this to AFL++, we marked this information in an unused field of the cmplog `rtn` entry.

We initially ported Honggfuzz’s strategy (Version 1, `aflplusplus_hfdictv1`), but the generated dictionary still contained many low-quality bytes. We modified AFL++ to log auto-dictionary additions per code location (Here), revealing that `location3` and `location4` (link) added numerous suboptimal dictionary entries, typically 32 bytes in size.

AFL++’s cmplog also instruments functions whose signatures merely resemble `memcmp`/`strcmp`, ignoring the size passed as the third argument (see this code). When it encounters such a function, it directly records up to 32 bytes from the memory pointed to by each of the first two arguments as a cmplog `rtn` entry. The related condition checks (link) are quite loose, and over 90% of auto-dictionary entries originate there. After further filtering these entries, we achieved better results (Version 2, `aflplusplus_hfdictv2`).

Local FuzzBench Results
Fuzzers:
- `aflplusplus_hfdictv1`: First version with Honggfuzz’s auto-dictionary logic.
- `aflplusplus_hfdictv2`: Filtered out 32-byte dictionary entries from cmplog `rtn` entries.
- `aflplusplus_proj4dict`: Auto-dictionary entries extracted from a 23-hour Honggfuzz instance (converted and fed to AFL++).
- `honggfuzz_orig`: Original Honggfuzz from FuzzBench.
- `aflplusplus_recent`: Recent stable AFL++ version.

On the `proj4_proj_crs_to_crs_fuzzer` benchmark, AFL++ now performs as well as Honggfuzz: hfdict-base-aflpphfdictv2-23h.zip
Testing across 4 other benchmarks also showed improvements on some of them:
report-5-benchmarks.zip
We are also requesting public FuzzBench experiments to further validate the results across more benchmarks (experiments request PR link).
`unused` field in `cmpfn_operands`
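
The pointer-origin check and the 32-byte filter described in the background can be sketched roughly as follows. This is a minimal illustration assuming Linux with glibc's `dl_iterate_phdr`; it is not the code from this PR or from Honggfuzz, and the names `addr_kind_t`, `classify_ptr`, `keep_candidate`, and `RTN_MAX_LEN` are hypothetical.

```c
#define _GNU_SOURCE /* needed for dl_iterate_phdr in <link.h> */
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of where a compared pointer comes from. */
typedef enum {
  ADDR_OTHER = 0,  /* heap, stack, anonymous mappings, ... */
  ADDR_ELF_RO = 1, /* loaded ELF segment without write permission */
  ADDR_ELF_RW = 2  /* loaded ELF segment with write permission */
} addr_kind_t;

/* Assumed capture limit per operand (the description mentions 32 bytes). */
#define RTN_MAX_LEN 32

struct lookup {
  uintptr_t addr;
  addr_kind_t kind;
};

/* Called once per loaded ELF object; scans its PT_LOAD segments. */
static int phdr_cb(struct dl_phdr_info *info, size_t size, void *data) {
  struct lookup *q = data;
  (void)size;
  for (int i = 0; i < info->dlpi_phnum; i++) {
    const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
    if (ph->p_type != PT_LOAD) continue;
    uintptr_t start = (uintptr_t)info->dlpi_addr + ph->p_vaddr;
    if (q->addr >= start && q->addr < start + ph->p_memsz) {
      q->kind = (ph->p_flags & PF_W) ? ADDR_ELF_RW : ADDR_ELF_RO;
      return 1; /* non-zero return stops the iteration */
    }
  }
  return 0;
}

/* Returns whether p points into memory mapped from an ELF object. */
addr_kind_t classify_ptr(const void *p) {
  struct lookup q = {(uintptr_t)p, ADDR_OTHER};
  dl_iterate_phdr(phdr_cb, &q);
  return q.kind;
}

/* Sketch of the Version 2 idea: keep an operand as a dictionary candidate
   only if it comes from ELF-mapped memory and is shorter than a full
   32-byte capture. */
int keep_candidate(addr_kind_t kind, size_t len) {
  return kind != ADDR_OTHER && len > 0 && len < RTN_MAX_LEN;
}
```

In a cmplog-style hook, the result of `classify_ptr` for each operand would be stored alongside the captured bytes (for instance in a spare field of the entry), so that the fuzzer side can apply a filter such as `keep_candidate` before adding an auto-dictionary token.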