The OWASP Benchmark is a test suite designed to evaluate the speed, coverage, and accuracy of automated vulnerability detection tools. Without the ability to measure these tools, it is difficult to understand their value or interpret vendor claims. The Benchmark contains over 20,000 test cases that are fully runnable and exploitable. This scorecard shows how each tool performs against the Command Injection test cases in version 1.1 of the Benchmark, that is, how well it finds true positives and avoids false positives for that type of vulnerability.
For more information, please visit the OWASP Benchmark Project Site.
| Term | Meaning |
|---|---|
| True Positive (TP) | Tests with real vulnerabilities that were correctly reported as vulnerable by the tool |
| False Negative (FN) | Tests with real vulnerabilities that the tool failed to report as vulnerable |
| True Negative (TN) | Tests with fake vulnerabilities that were correctly not reported as vulnerable by the tool |
| False Positive (FP) | Tests with fake vulnerabilities that were incorrectly reported as vulnerable by the tool |
| True Positive Rate (TPR) = TP / ( TP + FN ) | The rate at which the tool correctly reports real vulnerabilities |
| False Positive Rate (FPR) = FP / ( FP + TN ) | The rate at which the tool incorrectly reports fake vulnerabilities as real |
| Score = TPR - FPR | Normalized distance from the random guess line |
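
The formulas in the table can be checked with a short calculation. The following is a minimal sketch, not the Benchmark's own scoring code: the `Counts` class, the function names, and the example numbers are all hypothetical and chosen only to illustrate the arithmetic. Returning 0.0 when a denominator is zero is also an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Hypothetical container for one tool's results in one category."""
    tp: int  # real vulnerabilities reported
    fn: int  # real vulnerabilities missed
    tn: int  # fake vulnerabilities correctly ignored
    fp: int  # fake vulnerabilities reported


def true_positive_rate(c: Counts) -> float:
    # TPR = TP / (TP + FN); 0.0 for an empty category is an assumption here.
    total = c.tp + c.fn
    return c.tp / total if total else 0.0


def false_positive_rate(c: Counts) -> float:
    # FPR = FP / (FP + TN); 0.0 for an empty category is an assumption here.
    total = c.fp + c.tn
    return c.fp / total if total else 0.0


def score(c: Counts) -> float:
    # Score = TPR - FPR
    return true_positive_rate(c) - false_positive_rate(c)


# Made-up counts for illustration only, not real Benchmark results.
example = Counts(tp=75, fn=25, tn=75, fp=25)
print(f"TPR={true_positive_rate(example):.2f} "
      f"FPR={false_positive_rate(example):.2f} "
      f"score={score(example):.2f}")
# prints: TPR=0.75 FPR=0.25 score=0.50
```

Under these definitions, a tool that reports every test case as vulnerable has TPR = 1 and FPR = 1, so its score is 0, the same as a random guess. A tool that finds every real vulnerability with no false positives scores 1.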