Skip to content

Rust: Adjust the taint reach metric for better stability. #19718

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 11, 2025

Conversation

geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Jun 10, 2025

Adjust the taint reach metric for better stability.

When models are added the number of flow summary nodes generated often increases, and we've recently found this tends to cause the taint reach metric to wobble (even when the amount of actual flow and regular nodes associated with the database remains constant). We now exclude flow summary nodes from both the numerator and denominator of the taint reach calculation. I have locally verified this makes the taint reach number stable under deletion (and therefore also addition) of irrelevant summary models.

I expect to see [hopefully quite small] wobbles in the taint reach calculation in the DCA run for this change.

@Copilot Copilot AI review requested due to automatic review settings June 10, 2025 16:38
@geoffw0 geoffw0 added the Rust Pull requests that update Rust code label Jun 10, 2025
@geoffw0 geoffw0 requested a review from a team as a code owner June 10, 2025 16:38
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refines the taint reach metric by excluding flow summary nodes from both the numerator and denominator, stabilizing the calculation when models change.

  • Import FlowSummaryNode to identify summary nodes.
  • Update getTaintedNodesCount and getTaintReach to filter out summary nodes and introduce getTotalNodesCount.
  • Expose the new “total non-summary nodes” statistic in taintStats.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
rust/ql/src/queries/summary/TaintReach.qll Import FlowSummaryNode, filter it out in tainted/total counts.
rust/ql/src/queries/summary/Stats.qll Add new stat key for total non-summary nodes in taintStats.
Comments suppressed due to low confidence (3)

rust/ql/src/queries/summary/TaintReach.qll:39

  • There aren’t any existing tests covering getTotalNodesCount or the new stat; consider adding or updating tests to validate its behavior and catch regressions.
int getTotalNodesCount() { result = count(DataFlow::Node n | not n instanceof FlowSummaryNode) }

rust/ql/src/queries/summary/Stats.qll:192

  • [nitpick] The label "total non-summary nodes" doesn’t exactly match the method name getTotalNodesCount; consider renaming it to something like "total nodes (excluding summary)" for clarity and consistency.
  key = "Taint reach - total non-summary nodes" and value = getTotalNodesCount()

rust/ql/src/queries/summary/Stats.qll:192

  • Indentation for this new entry should align with the existing 4-space indent on other key = ... lines to maintain consistent formatting.
  key = "Taint reach - total non-summary nodes" and value = getTotalNodesCount()

@geoffw0
Copy link
Contributor Author

geoffw0 commented Jun 11, 2025

DCA looks as expected (apart from one spurious build failure)

@geoffw0 geoffw0 merged commit bd21a03 into github:main Jun 11, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Rust Pull requests that update Rust code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants