* add definitions for is_number and tie (by lemire)
* add fuzzer for element
* update fuzz documentation
* fix UB in creating an empty padded string
* don't bother null terminating padded_string, it is done by the std::memset already
* refactor fuzz data splitting into a separate class
fuzz/Fuzzing.md: 43 additions & 44 deletions
@@ -9,40 +9,58 @@

The simdjson library tries to follow [fuzzing best practices](https://google.github.io/oss-fuzz/advanced-topics/ideal-integration/#summary).

-The simdjson library is continuously fuzzed on [oss-fuzz](https://github.com/google/oss-fuzz). In case a bug is found, the offending input is minimized and tested for reproducibility. A report with the details is automatically filed, and the contact persons at simdjson are notified via email. An issue is opened at the oss-fuzz bugtracker with restricted view access. When the bug is fixed, the issue is automatically closed.
+There are both "normal" fuzzers, which simply feed the API with fuzz data, and **differential** fuzzers. The differential fuzzers feed the same data to multiple implementations (haswell, westmere and fallback) and check that they all produce the same result. This ensures the user always gets the same answer regardless of which implementation is in use.
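
The differential idea can be illustrated with a minimal sketch. It does not use the simdjson API: `count_structurals_reference`, `count_structurals_memchr` and `implementations_agree` are hypothetical stand-ins for "the same operation implemented twice", and only the `LLVMFuzzerTestOneInput` entry point is the standard libFuzzer interface. The target aborts when the two implementations disagree, which the fuzzer reports as a crash.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical implementation A: count JSON structural characters with a switch.
static std::size_t count_structurals_reference(const std::uint8_t* data, std::size_t size) {
  std::size_t n = 0;
  for (std::size_t i = 0; i < size; ++i) {
    switch (data[i]) {
      case '{': case '}': case '[': case ']': case ':': case ',':
        ++n;
        break;
      default:
        break;
    }
  }
  return n;
}

// Hypothetical implementation B: same operation, written as a set lookup.
static std::size_t count_structurals_memchr(const std::uint8_t* data, std::size_t size) {
  static const char kStructurals[] = "{}[]:,";
  std::size_t n = 0;
  for (std::size_t i = 0; i < size; ++i) {
    // sizeof - 1 excludes the terminating NUL, so a zero byte never matches.
    if (std::memchr(kStructurals, data[i], sizeof(kStructurals) - 1) != nullptr) {
      ++n;
    }
  }
  return n;
}

// The differential check: both implementations must give the same answer.
static bool implementations_agree(const std::uint8_t* data, std::size_t size) {
  return count_structurals_reference(data, size) == count_structurals_memchr(data, size);
}

// Standard libFuzzer entry point; a disagreement is turned into a crash.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
  if (!implementations_agree(data, size)) {
    std::abort();
  }
  return 0;
}
```

With clang, a target of this shape is typically built with `-fsanitize=fuzzer`, which supplies the `main` function that drives the entry point.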

-Bugs are automatically made visible to the public after a period of time. An example of a bug that was found, fixed and closed can be seen here: [oss-fuzz 18714](https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18714).
+The fuzzers are used in several ways:

+* local fuzzing - for developers testing their changes before pushing and/or during development of the fuzzers themselves.
+* CI fuzzing - for weeding out easy-to-find bugs in pull requests before they are merged.
+* oss-fuzz - heavy-duty 24/7 fuzzing provided by the Google-driven oss-fuzz project.

-## Currently open bugs
+## Local fuzzing
+Invoke fuzz/quick_check.sh. It downloads the latest corpus from bintray (kept up to date by the CI fuzzers) and runs the fuzzers for a short time. To run the fuzzers for longer, modify the timeout value in the script or invoke the fuzzer directly.

+This requires Linux with clang and cmake installed (recent Debian and Ubuntu are known to work fine).

-You can find the currently opened bugs, if any at [bugs.chromium.org](https://bugs.chromium.org/p/oss-fuzz/issues/list?sort=-opened&q=proj%3Asimdjson&can=2): make sure not to miss the "Open Issues" selector. Bugs that are fixed by follow-up commits are automatically closed.
+It is also possible to run the full oss-fuzz setup by following [these oss-fuzz instructions](https://google.github.io/oss-fuzz/getting-started/new-project-guide/#testing-locally) with PROJECT_NAME set to simdjson. You will need rights to run docker.

-## Integration with oss-fuzz
+## Fuzzing as a CI job - x64

-Changes to the integration with oss-fuzz are made by making pull requests against the oss-fuzz github repo. An example can be seen at [oss-fuzz pull request 3013](https://github.com/google/oss-fuzz/pull/3013).
+There is a CI job which builds and runs the fuzzers. It aims to catch the "easy to fuzz" bugs quickly, without having to wait until pull requests are merged and eventually built and run by oss-fuzz.

-As little code as possible is kept at oss-fuzz since it is inconvenient to change. The [oss-fuzz build script](https://github.com/google/oss-fuzz/blob/b96dd54183f727a5d90c786e0fb01ec986c74d30/projects/simdjson/build.sh#L18) invokes [the script from the simdjson repo](https://github.com/simdjson/simdjson/blob/master/fuzz/ossfuzz.sh).
+The CI job does the following:
+- builds a fast fuzzer, with full optimization but fewer checks, which is good at rapidly exploring the input space
+- builds a heavily sanitized fuzzer, which is good at detecting errors
+- downloads the stored corpus
+- runs the fast fuzzer build for a while, to grow the corpus
+- runs the sanitizer fuzzer for a while, using the input found by the fast fuzzer
+- using a reproduce build (uninstrumented), executes a subset of the test cases in the corpus through valgrind
+- minimizes the corpus and uploads it (if on the master branch)
+- stores the corpus and valgrind output as artifacts
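
The "reproduce build (uninstrumented)" step works because a libFuzzer-style target is an ordinary function taking a byte buffer, so it can be driven from stored files without the fuzzing engine. The following is a minimal sketch of such a driver, not necessarily how the simdjson build wires it up: `read_file` and `run_testcases` are hypothetical helper names, and the `LLVMFuzzerTestOneInput` body is a placeholder standing in for a real fuzz target.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

// Placeholder target so the sketch links on its own. In a real reproduce
// build, the actual fuzz target would be linked in instead of this stub.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
  (void)data;
  (void)size;
  return 0;
}

// Hypothetical helper: read a whole test case file into memory.
static bool read_file(const char* path, std::vector<std::uint8_t>& out) {
  std::ifstream in(path, std::ios::binary);
  if (!in) {
    return false;
  }
  out.assign(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
  return true;
}

// Hypothetical helper: run the target once per file named in argv[1..].
// Returns 0 if every file could be read, 1 otherwise.
static int run_testcases(int argc, char** argv) {
  int unreadable = 0;
  for (int i = 1; i < argc; ++i) {
    std::vector<std::uint8_t> input;
    if (!read_file(argv[i], input)) {
      std::cerr << "cannot read " << argv[i] << '\n';
      ++unreadable;
      continue;
    }
    LLVMFuzzerTestOneInput(input.data(), input.size());
    std::cout << "executed " << argv[i] << " (" << input.size() << " bytes)\n";
  }
  return unreadable == 0 ? 0 : 1;
}
```

Calling `run_testcases(argc, argv)` from `main` gives a plain executable that a tool such as valgrind can run over each corpus file, since no sanitizer or fuzzer instrumentation is involved.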

+The job is available under the actions tab; here is a [direct link](https://github.com/simdjson/simdjson/actions?query=workflow%3A%22Fuzz+and+run+valgrind%22).

+The corpus will grow over time, and easy-to-find bugs will be detected as early as the pull request stage. It will also keep the fuzzer builds from bit rot.

-## Fuzzing as a CI job
+## Fuzzing as a CI job - arm64
+There is also a job running the fuzzers on arm64 (see .drone.yml) to make sure the arm-specific parts are fuzzed as well. It does not update the corpus; it just reuses what the x64 job finds.

-There is a CI job which builds and runs the fuzzers. This is aimed to catch the "easy to fuzz" bugs quickly, without having to wait until pull requests are merged and eventually built and run by oss-fuzz.
+## Fuzzing on oss-fuzz
+The simdjson library is continuously fuzzed on [oss-fuzz](https://github.com/google/oss-fuzz). In case a bug is found, the offending input is minimized and tested for reproducibility. A report with the details is automatically filed, and the contact persons at simdjson are notified via email. An issue is opened at the oss-fuzz bugtracker with restricted view access. When the bug is fixed, the issue is automatically closed.

-The CI job does the following
-- builds several variants (with/without avx, with/without sanitizers, a fast fuzzer)
-- downloads the stored corpus
-- runs the fastest fuzzer build for 30 seconds, to grow the corpus
-- runs each build variant for 10 seconds on each fuzzer
-- using a reproduce build (uninstrumented), executes all the test cases in the corpus through valgrind
-- minimizes the corpus and upload it (if on the master branch)
-- store the corpus and valgrind output as artifacts
+Bugs are automatically made visible to the public after a period of time. An example of a bug that was found, fixed and closed can be seen here: [oss-fuzz 18714](https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18714).

-The job is available under the actions tab, here is a [direct link](https://github.com/simdjson/simdjson/actions?query=workflow%3A%22Run+fuzzers+on+stored+corpus+and+test+it+with+valgrind%22).

-The corpus will grow over time and easy to find bugs will be detected already during the pull request stage. Also, it will keep the fuzzer builds from bit rot.
+## Currently open bugs
+
+You can find the currently open bugs (if any) at [bugs.chromium.org](https://bugs.chromium.org/p/oss-fuzz/issues/list?sort=-opened&q=proj%3Asimdjson&can=2): make sure not to miss the "Open Issues" selector. Bugs that are fixed by follow-up commits are automatically closed.
+
+## Integration with oss-fuzz
+
+Changes to the integration with oss-fuzz are made by making pull requests against the oss-fuzz github repo. An example can be seen at [oss-fuzz pull request 3013](https://github.com/google/oss-fuzz/pull/3013).
+
+As little code as possible is kept at oss-fuzz since it is inconvenient to change. The [oss-fuzz build script](https://github.com/google/oss-fuzz/blob/b96dd54183f727a5d90c786e0fb01ec986c74d30/projects/simdjson/build.sh#L18) invokes [the script from the simdjson repo](https://github.com/simdjson/simdjson/blob/master/fuzz/ossfuzz.sh).
+

## Corpus
@@ -55,32 +73,13 @@ One can also grab the corpus as an artifact from the github actions job. Pick a
The code coverage from fuzzing is most easily viewed on the [oss-fuzz status panel](https://oss-fuzz.com/fuzzer-stats). Viewing the coverage does not require login, but the direct link is not easy to find. Substitute the date in the URL to get a more recent link:
Keeping the coverage up is a never ending job. See [issue 368](https://github.com/simdjson/simdjson/issues/368)

-## Running the fuzzers locally
-
-This has only been tested on Linux (Debian and Ubuntu are known to work).
-
-Make sure you have clang and cmake installed.
-The easiest way to get started is to run the following, standing in the root of the checked out repo:
-```
-fuzz/build_like_ossfuzz.sh
-```
-
-Then invoke a fuzzer as shown by the following example:
-```
-mkdir -p out/parser
-build/fuzz/fuzz_parser out/parser/
-```
-
-You can also use the more extensive fuzzer build script to get a variation of builds by using
-```
+## Reproducing
+To reproduce a test case, follow the local build instructions. Then invoke the fuzzer (fuzz_parser is shown as an example below) with the test case as a command line argument:
In case this does not reproduce the bug, you may want to proceed with reproducing using the oss-fuzz tools. See the instructions [here](https://google.github.io/oss-fuzz/advanced-topics/reproducing/).

-It is also possible to run the full oss-fuzz setup by following [these oss-fuzz instructions](https://google.github.io/oss-fuzz/getting-started/new-project-guide/#testing-locally) with PROJECT_NAME set to simdjson. You will need rights to run docker.
-
-## Reproducing
-To reproduce a test case, build the fuzzers, then invoke it with the testcase as a command line argument: