Skip to content

Ruby: generate overlay discard predicates #19719

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: nickrolfe/ruby-overlay-extraction
Choose a base branch
from

Conversation

nickrolfe
Copy link
Contributor

Stacked on top of #19684

@github-actions github-actions bot added the Ruby label Jun 10, 2025
@nickrolfe nickrolfe force-pushed the nickrolfe/ruby-overlay-extraction branch from fb89f22 to 576f0d2 Compare June 11, 2025 10:27
@nickrolfe nickrolfe force-pushed the nickrolfe/ruby-discard-predicates branch from b74138e to 8a33cb4 Compare June 11, 2025 10:29
@nickrolfe nickrolfe marked this pull request as ready for review June 11, 2025 10:31
@Copilot Copilot AI review requested due to automatic review settings June 11, 2025 10:31
@nickrolfe nickrolfe requested a review from a team as a code owner June 11, 2025 10:31
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for overlay-based discard predicates in the Ruby Tree-sitter extractor, enabling selective filtering of AST nodes and files when running on overlay databases.

  • Introduces OverlayAnnotation and PragmaAnnotation in the QL generator to emit overlay/pragma directives.
  • Extends ql_gen to produce new predicates (isOverlay, getRawFile, discardFile, discardableAstNode, discardAstNode).
  • Updates the code generator to conditionally include overlay predicates and adds static definitions in the Ruby QLL.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
shared/tree-sitter-extractor/src/generator/ql_gen.rs Added functions to generate overlay discard predicates.
shared/tree-sitter-extractor/src/generator/ql.rs Added TopLevel::Predicate, PragmaAnnotation, OverlayAnnotation, and Expression::Negation.
shared/tree-sitter-extractor/src/generator/mod.rs Renamed add_metadata_relation to overlay_support and wired in overlay predicates.
ruby/ql/lib/codeql/ruby/ast/internal/TreeSitter.qll Added overlay/pragma annotations and new predicates for Ruby and ERB modules.
Comments suppressed due to low confidence (2)

shared/tree-sitter-extractor/src/generator/ql.rs:271

  • Add a space or newline after the closing doc comment for readability, e.g.: write!(f, "/** {} */ ", qldoc)?; so that overlay[...] is clearly separated.
write!(f, "/** {} */", qldoc)?;

shared/tree-sitter-extractor/src/generator/mod.rs:63

  • Consider adding unit or integration tests to verify that overlay_support correctly causes the metadata relation and overlay predicates to be written.
if overlay_support {

@@ -48,6 +53,38 @@ module Ruby {
final override string getAPrimaryQlClass() { result = "ReservedWord" }
}

/** Gets the file path of the given `node`. */
Copy link
Preview

Copilot AI Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The overlay predicate blocks for Ruby and ERB are nearly identical; consider extracting common definitions (e.g., via a shared include) to reduce duplication.

Copilot uses AI. Check for mistakes.

Copy link
Contributor

@jbj jbj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have only some cosmetic comments

/** Holds if `node` should be discarded, because it is part of the overlay base and is in a file that was also extracted as part of the overlay database. */
overlay[discard_entity]
pragma[nomagic]
predicate discardAstNode(@ruby_ast_node node) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all the new predicates should be private since they're not intended to be called from the outside.

/** Gets the file path of the given `node`. */
overlay[local]
pragma[nomagic]
string getRawFile(@ruby_ast_node node) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC the name getRawFile comes from the Java discard predicates and makes sense there because Java has a translation layer in QL where we wanted to get file names on the raw/input side of it. It doesn't necessarily make sense for Ruby.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the right local annotations elsewhere, we could remove this predicate entirely, and instead make calls like node.(Ruby::AstNode).getLocation().getFile(), but I'd like to minimise my changes for now. I'll change its name to getNodeFile.

/** Gets the file path of the given `node`. */
overlay[local]
pragma[nomagic]
string getRawFile(@ruby_ast_node node) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use @file instead of string in these predicates? AFAICT it would make the code simpler and also clearer because the types will be more precise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I suppose I was just cargo-culting the use of string.

@nickrolfe nickrolfe force-pushed the nickrolfe/ruby-discard-predicates branch from 8a33cb4 to acf0fe3 Compare June 11, 2025 11:39
@nickrolfe nickrolfe force-pushed the nickrolfe/ruby-overlay-extraction branch from 576f0d2 to 4029b89 Compare June 11, 2025 11:45
@nickrolfe nickrolfe requested review from a team as code owners June 11, 2025 11:45
@nickrolfe nickrolfe force-pushed the nickrolfe/ruby-discard-predicates branch from acf0fe3 to 4875dcb Compare June 11, 2025 11:45
@nickrolfe
Copy link
Contributor Author

@aibaars if you assume the generated QL is correct, would you be able to review the generator changes?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants