-
-
Notifications
You must be signed in to change notification settings - Fork 253
poc: use oxc as alternative parser #1970
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThis update introduces a new Rust crate for integrating Oxc (a JavaScript/TypeScript parser) with the ast-grep framework. It adds a Cargo manifest, implements Oxc language bindings, provides AST node and document abstractions, and supplies a command-line tool for searching AST patterns in source files. Additionally, default trait method implementations are added to the core SgNode trait. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI (main.rs)
participant OxcDoc
participant OxcLang
participant AstGrep
participant Pattern
User->>CLI (main.rs): Provide file path & pattern
CLI (main.rs)->>OxcLang: Create language from file extension
CLI (main.rs)->>OxcDoc: Parse source text to AST document
OxcDoc->>OxcLang: Use parser & semantic analysis
CLI (main.rs)->>AstGrep: Create from OxcDoc
CLI (main.rs)->>Pattern: Compile pattern string
CLI (main.rs)->>AstGrep: Search root node with pattern
AstGrep->>CLI (main.rs): Return matches
CLI (main.rs)->>User: Print matched texts
Poem
Note ⚡️ AI Code Reviews for VS Code, Cursor, WindsurfCodeRabbit now has a plugin for VS Code, Cursor and Windsurf. This brings AI code reviews directly in the code editor. Each commit is reviewed immediately, finding bugs before the PR is raised. Seamless context handoff to your AI code agent ensures that you can easily incorporate review feedback. ✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1970 +/- ##
==========================================
- Coverage 86.64% 84.86% -1.79%
==========================================
Files 100 102 +2
Lines 16046 16383 +337
==========================================
Hits 13903 13903
- Misses 2143 2480 +337 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (4)
crates/oxc/src/main.rs (3)
380-383
:field_to_id
returnsNone
– limits query expressiveness
Language::field_to_id
is currently unimplemented, which means$foo.bar
style field queries in patterns cannot work. Even a partial mapping for the common fields ("body"
,"argument"
, …) would greatly improve usability.Consider wiring
oxc_ast::ast::FIELD_MAP
(or similar) here, or transparently fall back to a slow string comparison.
205-376
: Gigantic match table can be generated – keep the source maintainableHard-coding 170-plus
match
arms is error-prone and hard to maintain.
Instead, generate the mapping once:lazy_static::lazy_static! { static ref KIND_TO_ID: HashMap<&'static str, u16> = IDS.iter().enumerate().map(|(i, k)| (*k, i as u16)).collect(); } fn kind_to_id(&self, kind: &str) -> u16 { *KIND_TO_ID.get(kind).unwrap_or_else(|| panic!("Unknown kind: {kind}")) }This avoids duplicating the data, shrinks the binary a bit, and removes the need to update two places when Oxc adds a new node kind.
548-550
:do_edit
not implemented –AstGrep::replace
& incremental features will breakThe trait requires an implementation; leaving it as
todo!()
prevents consumers from performing edits and makes the doc non-compliant withDoc
expectations. Even a pessimistic “full re-parse” implementation is preferable.Would you like assistance in adding a minimal, safe implementation?
crates/oxc/test.ts (1)
1-3
: Remove leftover commented-out console logThe commented line adds noise to the sample; if it’s not needed, delete it.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lock
is excluded by!**/*.lock
📒 Files selected for processing (4)
crates/core/src/source.rs
(1 hunks)crates/oxc/Cargo.toml
(1 hunks)crates/oxc/src/main.rs
(1 hunks)crates/oxc/test.ts
(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
crates/core/src/source.rs (2)
crates/core/src/node.rs (3)
field
(241-247)field_children
(257-263)child_by_field_id
(249-255)crates/napi/src/sg_node.rs (2)
field
(309-317)field_children
(321-329)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: coverage
🔇 Additional comments (1)
crates/oxc/Cargo.toml (1)
13-20
: Pinning Oxc crates to an exact patch version may hinder workspace upgradesAll Oxc dependencies are fixed to
0.68.1
. When Oxc releases patch fixes,cargo update
will not pick them up. Consider using a caret range ("0.68"
), or add a comment explaining the rationale for strict pinning.Also, double-check that
oxc_index = "3"
aligns with0.68.x
– mismatched versions occasionally breakIdx
casts.
#[allow(unused_variables)] | ||
fn field(&self, name: &str) -> Option<Self> { | ||
None | ||
} | ||
#[allow(unused_variables)] | ||
fn field_children(&self, field_id: Option<u16>) -> impl Iterator<Item = Self> { | ||
std::iter::empty() | ||
} | ||
#[allow(unused_variables)] | ||
fn child_by_field_id(&self, field_id: u16) -> Option<Self> { | ||
None | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Breaking API change: field_children
signature no longer aligns with callers
field_children
now accepts Option<u16>
instead of &str
, but all existing call-sites (e.g. crates/napi/src/sg_node.rs
and crates/core/src/node.rs
shown in the snippets) still pass a field name (&str
).
Unless every downstream crate is updated simultaneously this will be a hard compile error.
- fn field_children(&self, field_id: Option<u16>) -> impl Iterator<Item = Self>;
+ fn field_children(&self, name: &str) -> impl Iterator<Item = Self>;
Please either:
- Revert the signature to the previous, name-based API, or
- Update every caller and provide a convenience helper that converts names to ids using
lang.field_to_id
, to avoid scatteringOption<u16>
everywhere.
Leaving it as-is will break the public trait and downstream crates.
Committable suggestion skipped: line range outside the PR's diff.
crates/oxc/src/main.rs
Outdated
self.inner.span.start as usize..self.inner.span.end as usize | ||
} | ||
fn start_pos(&self) -> ast_grep_core::Position { | ||
todo!("not implemented") | ||
} | ||
fn end_pos(&self) -> ast_grep_core::Position { | ||
todo!("not implemented") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
start_pos
/ end_pos
are todo!()
– any call will panic at runtime
These trait methods are part of the public SgNode
contract. Leaving them unimplemented will cause a panic as soon as a consumer calls node.start_pos()
(e.g. error reporting, code actions).
Implement them using Span::start_utf16
/ end_utf16
(or UTF-8) depending on the expected encoding, or return a sensible default and document limitations.
crates/oxc/src/main.rs
Outdated
fn convert(nodes: &AstNodes) -> HashMap<NodeId, InnerOxc> { | ||
let mut pool = HashMap::default(); | ||
for node in nodes.iter() { | ||
let id = node.id(); | ||
let span = node.span(); | ||
let parent_id = nodes.parent_id(node.id()); | ||
let kind_id = node.kind().ty() as u16; | ||
pool.insert( | ||
id, | ||
InnerOxc { | ||
id, | ||
kind_id, | ||
parent_id, | ||
span, | ||
children: vec![], | ||
}, | ||
); | ||
if let Some(parent_id) = parent_id { | ||
if let Some(parent) = pool.get_mut(&parent_id) { | ||
parent.children.push(id); | ||
} | ||
} | ||
} | ||
pool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Children list may be incomplete when parent is visited after child
convert
pushes a child id into its parent’s children
vector only if the parent has already been inserted into pool
.
AST traversals do not guarantee this ordering, so parents that appear after their children will silently miss the link.
A safe two-pass approach (first insert, then populate children) avoids the race:
fn convert(nodes: &AstNodes) -> HashMap<NodeId, InnerOxc> {
- let mut pool = HashMap::default();
- for node in nodes.iter() {
- // insert child, maybe update parent (if present)
- ...
- if let Some(parent_id) = parent_id {
- if let Some(parent) = pool.get_mut(&parent_id) {
- parent.children.push(id);
- }
- }
- }
+ let mut pool: HashMap<NodeId, InnerOxc> = nodes
+ .iter()
+ .map(|n| {
+ let id = n.id();
+ (
+ id,
+ InnerOxc {
+ id,
+ kind_id: n.kind().ty() as u16,
+ parent_id: nodes.parent_id(id),
+ span: n.span(),
+ children: Vec::new(),
+ },
+ )
+ })
+ .collect();
+
+ // second pass – populate children
+ for inner in pool.values() {
+ if let Some(pid) = inner.parent_id {
+ if let Some(parent) = pool.get_mut(&pid) {
+ parent.children.push(inner.id);
+ }
+ }
+ }
pool
}
Without this fix, pattern matching that navigates downward may return incomplete results.
Committable suggestion skipped: line range outside the PR's diff.
71f5d56
to
d1fb3ae
Compare
d1fb3ae
to
26cffdd
Compare
Question: Would we expose the underlying rust types, similar to how we do with the tree-sitter nodes? It's quite handy to be able to access the underlying nodes (if needed) in the Rust crates 😄 |
Yes ast-grep/crates/core/src/node.rs Lines 179 to 182 in 4eae151
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a proof-of-concept Oxc-based parser integration for ast-grep by introducing a new ast-grep-oxc
crate and updating the core SgNode
trait to include default stubs for field navigation.
- Adds an
oxc
crate with:- CLI entrypoint (
main.rs
) for running AST pattern searches via Oxc binding.rs
implementingLanguage
,Doc
, and node mapping from Oxc- Cargo manifest (
Cargo.toml
) for the new crate
- CLI entrypoint (
- Updates the core AST interface (
SgNode
insource.rs
) with default implementations of field navigation methods
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
File | Description |
---|---|
crates/oxc/src/main.rs | New CLI that drives parsing and pattern matching via Oxc |
crates/oxc/src/binding.rs | Core binding logic: builds AST, converts Oxc nodes into SgNode for ast-grep |
crates/oxc/Cargo.toml | Manifest for the ast-grep-oxc crate |
crates/core/src/source.rs | Added default stub implementations for field , field_children , child_by_field_id in SgNode |
Comments suppressed due to low confidence (3)
crates/oxc/src/main.rs:9
- [nitpick] The variable name
pat
is ambiguous; consider renaming topattern_str
orpattern_arg
for clarity.
let pat = args.get(2).expect("Must provide a pattern");
crates/oxc/src/binding.rs:1
- Add unit tests for
OxcDoc::try_new
, parse error handling, and pattern-building to ensure the Oxc integration is covered.
use ast_grep_core::matcher::{Pattern, PatternBuilder, PatternError};
crates/oxc/src/binding.rs:183
- The panic formatting is incorrect and will fail to compile; use named or positional parameters, e.g.,
panic!("Unknown kind: {}", kind)
.
_ => panic!("Unknown kind: {kind}"),
self.inner.span.start as usize..self.inner.span.end as usize | ||
} | ||
fn start_pos(&self) -> ast_grep_core::Position { | ||
todo!("not implemented") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both start_pos
and end_pos
are unimplemented and will panic at runtime; implement these to return correct node positions or provide default stubs.
Copilot uses AI. Check for mistakes.
#[allow(unused_variables)] | ||
fn field(&self, name: &str) -> Option<Self> { | ||
None | ||
} | ||
#[allow(unused_variables)] | ||
fn field_children(&self, field_id: Option<u16>) -> impl Iterator<Item = Self> { | ||
std::iter::empty() | ||
} | ||
#[allow(unused_variables)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These default stub methods (field
, field_children
, child_by_field_id
) always return empty results, which may break field navigation in Oxc bindings—consider implementing or documenting their intended behaviors.
#[allow(unused_variables)] | |
fn field(&self, name: &str) -> Option<Self> { | |
None | |
} | |
#[allow(unused_variables)] | |
fn field_children(&self, field_id: Option<u16>) -> impl Iterator<Item = Self> { | |
std::iter::empty() | |
} | |
#[allow(unused_variables)] | |
#[allow(unused_variables)] | |
/// Returns the child node associated with the given field name, if it exists. | |
/// | |
/// # Parameters | |
/// - `name`: The name of the field to look up. | |
/// | |
/// # Returns | |
/// An `Option` containing the child node if the field exists, or `None` otherwise. | |
/// | |
/// # Note | |
/// This method is expected to be overridden by implementors of the `SgNode` trait. | |
fn field(&self, name: &str) -> Option<Self> { | |
None | |
} | |
#[allow(unused_variables)] | |
/// Returns an iterator over the children of the node that belong to the specified field. | |
/// | |
/// # Parameters | |
/// - `field_id`: An optional field ID to filter the children. | |
/// | |
/// # Returns | |
/// An iterator over the child nodes associated with the field. By default, this returns an empty iterator. | |
/// | |
/// # Note | |
/// This method is expected to be overridden by implementors of the `SgNode` trait. | |
fn field_children(&self, field_id: Option<u16>) -> impl Iterator<Item = Self> { | |
std::iter::empty() | |
} | |
#[allow(unused_variables)] | |
/// Returns the child node associated with the given field ID, if it exists. | |
/// | |
/// # Parameters | |
/// - `field_id`: The ID of the field to look up. | |
/// | |
/// # Returns | |
/// An `Option` containing the child node if the field exists, or `None` otherwise. | |
/// | |
/// # Note | |
/// This method is expected to be overridden by implementors of the `SgNode` trait. |
Copilot uses AI. Check for mistakes.
let pat = args.get(2).expect("Must provide a pattern"); | ||
let path = std::path::Path::new(&name); | ||
let source_text = std::fs::read_to_string(path)?; | ||
let lang = OxcLang(SourceType::from_path(path).unwrap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid using unwrap()
on SourceType::from_path
; handle the error case with a clear message to prevent unexpected panics on unsupported file extensions.
let lang = OxcLang(SourceType::from_path(path).unwrap()); | |
let lang = match SourceType::from_path(path) { | |
Ok(source_type) => OxcLang(source_type), | |
Err(_) => { | |
eprintln!("Error: Unsupported file extension for path {:?}", path); | |
std::process::exit(1); | |
} | |
}; |
Copilot uses AI. Check for mistakes.
pub struct OxcLang(pub SourceType); | ||
|
||
impl Language for OxcLang { | ||
fn kind_to_id(&self, kind: &str) -> u16 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The large match
block for mapping kinds to IDs is brittle; consider generating this mapping or using a lookup table/macro to reduce maintenance overhead.
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
♻️ Duplicate comments (2)
crates/oxc/src/binding.rs (2)
269-274
:⚠️ Potential issueCritical: Implement
start_pos
andend_pos
methods to prevent runtime panics.These methods are part of the public
SgNode
trait contract. Leaving them unimplemented will cause a panic when any code tries to get position information (e.g., for error reporting or code actions).fn start_pos(&self) -> ast_grep_core::Position { - todo!("not implemented") + // Calculate position from span start + let pos = self.inner.span.start as usize; + // TODO: Implement proper line/column calculation from byte offset + ast_grep_core::Position { + line: 0, // Placeholder - needs proper implementation + column: pos, + } } fn end_pos(&self) -> ast_grep_core::Position { - todo!("not implemented") + // Calculate position from span end + let pos = self.inner.span.end as usize; + // TODO: Implement proper line/column calculation from byte offset + ast_grep_core::Position { + line: 0, // Placeholder - needs proper implementation + column: pos, + } }Alternatively, you could store line/column information during parsing or use a source map to convert byte offsets to positions.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 272-273: crates/oxc/src/binding.rs#L272-L273
Added lines #L272 - L273 were not covered by tests
203-227
:⚠️ Potential issueFix parent-child relationship ordering issue in
convert
function.The current implementation only updates parent's children list if the parent has already been inserted into the pool. This creates incomplete parent-child relationships when nodes are processed in certain orders.
Use a two-pass approach to ensure all parent-child relationships are properly established:
fn convert(nodes: &AstNodes) -> HashMap<NodeId, InnerOxc> { - let mut pool = HashMap::default(); - for node in nodes.iter() { - let id = node.id(); - let span = node.span(); - let parent_id = nodes.parent_id(node.id()); - let kind_id = node.kind().ty() as u16; - pool.insert( - id, - InnerOxc { - id, - kind_id, - parent_id, - span, - children: vec![], - }, - ); - if let Some(parent_id) = parent_id { - if let Some(parent) = pool.get_mut(&parent_id) { - parent.children.push(id); - } - } - } + // First pass: Create all nodes + let mut pool: HashMap<NodeId, InnerOxc> = nodes + .iter() + .map(|node| { + let id = node.id(); + ( + id, + InnerOxc { + id, + kind_id: node.kind().ty() as u16, + parent_id: nodes.parent_id(id), + span: node.span(), + children: Vec::new(), + }, + ) + }) + .collect(); + + // Second pass: Populate children + let node_ids: Vec<NodeId> = pool.keys().copied().collect(); + for node_id in node_ids { + if let Some(parent_id) = pool[&node_id].parent_id { + if let Some(parent) = pool.get_mut(&parent_id) { + parent.children.push(node_id); + } + } + } pool }🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 203-224: crates/oxc/src/binding.rs#L203-L224
Added lines #L203 - L224 were not covered by tests
[warning] 226-227: crates/oxc/src/binding.rs#L226-L227
Added lines #L226 - L227 were not covered by tests
🧹 Nitpick comments (2)
crates/oxc/src/binding.rs (2)
186-188
: Implementfield_to_id
or document why it returns None.The method always returns
None
, which might limit field-based pattern matching capabilities. If this is intentional, please add a comment explaining why fields are not supported.Would you like me to help implement field mapping for Oxc AST nodes if field-based matching is needed?
357-359
: Implement thedo_edit
method or document its limitations.The method is marked as
todo!()
which will panic if called. This limits the usefulness of the Oxc integration for code modification scenarios.Would you like me to help implement the editing functionality? This would involve:
- Re-parsing the modified source
- Updating the AST node pool
- Maintaining node ID stability where possible
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lock
is excluded by!**/*.lock
📒 Files selected for processing (3)
crates/oxc/Cargo.toml
(1 hunks)crates/oxc/src/binding.rs
(1 hunks)crates/oxc/src/main.rs
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/oxc/Cargo.toml
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: HerringtonDarkholme
PR: ast-grep/ast-grep#1497
File: crates/cli/src/utils/mod.rs:174-175
Timestamp: 2024-09-29T21:59:50.675Z
Learning: ast-grep requires Rust 1.65 or newer. It's acceptable to use syntax and features introduced in Rust 1.65 or newer in ast-grep code.
🧬 Code Graph Analysis (1)
crates/oxc/src/main.rs (2)
crates/core/src/node.rs (1)
doc
(54-56)crates/oxc/src/binding.rs (1)
try_new
(333-343)
🪛 GitHub Check: codecov/patch
crates/oxc/src/binding.rs
[warning] 14-183: crates/oxc/src/binding.rs#L14-L183
Added lines #L14 - L183 were not covered by tests
[warning] 185-191: crates/oxc/src/binding.rs#L185-L191
Added lines #L185 - L191 were not covered by tests
[warning] 203-224: crates/oxc/src/binding.rs#L203-L224
Added lines #L203 - L224 were not covered by tests
[warning] 226-227: crates/oxc/src/binding.rs#L226-L227
Added lines #L226 - L227 were not covered by tests
[warning] 236-270: crates/oxc/src/binding.rs#L236-L270
Added lines #L236 - L270 were not covered by tests
[warning] 272-273: crates/oxc/src/binding.rs#L272-L273
Added lines #L272 - L273 were not covered by tests
[warning] 286-330: crates/oxc/src/binding.rs#L286-L330
Added lines #L286 - L330 were not covered by tests
[warning] 333-343: crates/oxc/src/binding.rs#L333-L343
Added lines #L333 - L343 were not covered by tests
[warning] 351-358: crates/oxc/src/binding.rs#L351-L358
Added lines #L351 - L358 were not covered by tests
[warning] 360-370: crates/oxc/src/binding.rs#L360-L370
Added lines #L360 - L370 were not covered by tests
crates/oxc/src/main.rs
[warning] 6-20: crates/oxc/src/main.rs#L6-L20
Added lines #L6 - L20 were not covered by tests
fn main() -> std::io::Result<()> { | ||
let args = std::env::args().collect::<Vec<_>>(); | ||
let name = args.get(1).expect("Must provide a file"); | ||
let pat = args.get(2).expect("Must provide a pattern"); | ||
let path = std::path::Path::new(&name); | ||
let source_text = std::fs::read_to_string(path)?; | ||
let lang = OxcLang(SourceType::from_path(path).unwrap()); | ||
let doc = OxcDoc::try_new(source_text, lang).expect("Failed to parse"); | ||
let sg = AstGrep::doc(doc); | ||
let pattern = Pattern::new(pat, lang); | ||
for m in sg.root().find_all(&pattern) { | ||
println!("Oxc ast-grep Found: {}", m.text()); | ||
} | ||
Ok(()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve error handling and user experience in the CLI tool.
The current implementation uses expect()
which provides poor user experience when errors occur. Consider implementing proper error handling with helpful messages.
fn main() -> std::io::Result<()> {
let args = std::env::args().collect::<Vec<_>>();
- let name = args.get(1).expect("Must provide a file");
- let pat = args.get(2).expect("Must provide a pattern");
+ if args.len() < 3 {
+ eprintln!("Usage: {} <file> <pattern>", args.get(0).unwrap_or(&"oxc".to_string()));
+ std::process::exit(1);
+ }
+ let name = &args[1];
+ let pat = &args[2];
+
let path = std::path::Path::new(&name);
+ if !path.exists() {
+ eprintln!("Error: File '{}' not found", name);
+ std::process::exit(1);
+ }
+
let source_text = std::fs::read_to_string(path)?;
- let lang = OxcLang(SourceType::from_path(path).unwrap());
- let doc = OxcDoc::try_new(source_text, lang).expect("Failed to parse");
+ let source_type = SourceType::from_path(path).unwrap_or_else(|_| {
+ eprintln!("Warning: Unknown file extension, defaulting to JavaScript");
+ SourceType::default()
+ });
+ let lang = OxcLang(source_type);
+
+ let doc = match OxcDoc::try_new(source_text, lang) {
+ Ok(doc) => doc,
+ Err(e) => {
+ eprintln!("Failed to parse file: {}", e);
+ std::process::exit(1);
+ }
+ };
+
let sg = AstGrep::doc(doc);
let pattern = Pattern::new(pat, lang);
for m in sg.root().find_all(&pattern) {
println!("Oxc ast-grep Found: {}", m.text());
}
Ok(())
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
fn main() -> std::io::Result<()> { | |
let args = std::env::args().collect::<Vec<_>>(); | |
let name = args.get(1).expect("Must provide a file"); | |
let pat = args.get(2).expect("Must provide a pattern"); | |
let path = std::path::Path::new(&name); | |
let source_text = std::fs::read_to_string(path)?; | |
let lang = OxcLang(SourceType::from_path(path).unwrap()); | |
let doc = OxcDoc::try_new(source_text, lang).expect("Failed to parse"); | |
let sg = AstGrep::doc(doc); | |
let pattern = Pattern::new(pat, lang); | |
for m in sg.root().find_all(&pattern) { | |
println!("Oxc ast-grep Found: {}", m.text()); | |
} | |
Ok(()) | |
} | |
fn main() -> std::io::Result<()> { | |
let args = std::env::args().collect::<Vec<_>>(); | |
if args.len() < 3 { | |
eprintln!( | |
"Usage: {} <file> <pattern>", | |
args.get(0).unwrap_or(&"oxc".to_string()) | |
); | |
std::process::exit(1); | |
} | |
let name = &args[1]; | |
let pat = &args[2]; | |
let path = std::path::Path::new(&name); | |
if !path.exists() { | |
eprintln!("Error: File '{}' not found", name); | |
std::process::exit(1); | |
} | |
let source_text = std::fs::read_to_string(path)?; | |
let source_type = SourceType::from_path(path).unwrap_or_else(|_| { | |
eprintln!("Warning: Unknown file extension, defaulting to JavaScript"); | |
SourceType::default() | |
}); | |
let lang = OxcLang(source_type); | |
let doc = match OxcDoc::try_new(source_text, lang) { | |
Ok(doc) => doc, | |
Err(e) => { | |
eprintln!("Failed to parse file: {}", e); | |
std::process::exit(1); | |
} | |
}; | |
let sg = AstGrep::doc(doc); | |
let pattern = Pattern::new(pat, lang); | |
for m in sg.root().find_all(&pattern) { | |
println!("Oxc ast-grep Found: {}", m.text()); | |
} | |
Ok(()) | |
} |
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 6-20: crates/oxc/src/main.rs#L6-L20
Added lines #L6 - L20 were not covered by tests
🤖 Prompt for AI Agents
In crates/oxc/src/main.rs from lines 6 to 20, replace the use of expect() calls
with proper error handling to improve user experience. Instead of panicking,
check if the command line arguments are present and return a user-friendly error
message if not. Similarly, handle errors from parsing and file reading
gracefully by returning or printing descriptive error messages rather than using
expect(). This will make the CLI tool more robust and informative for users.
fn end_pos(&self) -> ast_grep_core::Position { | ||
todo!("not implemented") | ||
} | ||
} | ||
|
||
#[derive(Clone)] | ||
pub struct OxcDoc { | ||
_allocator: Arc<Allocator>, | ||
source: String, | ||
lang: OxcLang, | ||
pool: HashMap<NodeId, InnerOxc>, | ||
root_id: Option<NodeId>, | ||
} | ||
|
||
fn parse( | ||
allocator: &Allocator, | ||
source_text: &String, | ||
lang: OxcLang, | ||
) -> Result<(HashMap<NodeId, InnerOxc>, Option<NodeId>), String> { | ||
// Parse the source text into an AST | ||
let parser_ret = Parser::new(allocator, source_text, lang.0).parse(); | ||
if !parser_ret.errors.is_empty() { | ||
let error_message: String = parser_ret | ||
.errors | ||
.into_iter() | ||
.map(|error| { | ||
format!( | ||
"{:?}\n", | ||
error.with_source_code(Arc::new(source_text.to_string())) | ||
) | ||
}) | ||
.collect(); | ||
return Err(error_message); | ||
} | ||
let program = parser_ret.program; | ||
|
||
let semantic = SemanticBuilder::new() | ||
// Enable additional syntax checks not performed by the parser | ||
.with_check_syntax_error(true) | ||
.build(&program); | ||
|
||
if !semantic.errors.is_empty() { | ||
let error_message: String = semantic | ||
.errors | ||
.into_iter() | ||
.map(|error| { | ||
format!( | ||
"{:?}\n", | ||
error.with_source_code(Arc::new(source_text.to_string())) | ||
) | ||
}) | ||
.collect(); | ||
return Err(error_message); | ||
} | ||
let nodes = semantic.semantic.nodes(); | ||
let pool = convert(nodes); | ||
let root_id = nodes.root(); | ||
Ok((pool, root_id)) | ||
} | ||
|
||
impl OxcDoc { | ||
pub fn try_new(source: String, lang: OxcLang) -> Result<Self, String> { | ||
let allocator = Allocator::default(); | ||
let (pool, root_id) = parse(&allocator, &source, lang)?; | ||
Ok(Self { | ||
_allocator: Arc::new(allocator), | ||
source, | ||
lang, | ||
pool, | ||
root_id, | ||
}) | ||
} | ||
} | ||
|
||
impl Doc for OxcDoc { | ||
type Source = String; | ||
type Lang = OxcLang; | ||
type Node<'r> = OxcNode<'r>; | ||
|
||
fn get_lang(&self) -> &Self::Lang { | ||
&self.lang | ||
} | ||
fn get_source(&self) -> &Self::Source { | ||
&self.source | ||
} | ||
fn do_edit(&mut self, _edit: &ast_grep_core::source::Edit<Self::Source>) -> Result<(), String> { | ||
todo!("not implemented") | ||
} | ||
fn root_node(&self) -> Self::Node<'_> { | ||
let root_id = self.root_id.expect("Root node not found"); | ||
let inner = self.pool.get(&root_id).expect("Root node not found"); | ||
OxcNode { | ||
inner: inner.clone(), | ||
pool: &self.pool, | ||
} | ||
} | ||
fn get_node_text<'a>(&'a self, node: &Self::Node<'a>) -> Cow<'a, str> { | ||
Cow::Borrowed(&self.source[node.range()]) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add comprehensive tests for the Oxc integration.
The static analysis shows that this entire module lacks test coverage. Given that this is a proof-of-concept integration, it's crucial to have tests that verify:
- AST node kind mapping
- Parent-child relationships
- Pattern matching functionality
- Error handling in parsing
- Edge cases like empty files or syntax errors
Would you like me to help create a comprehensive test suite for this module? This would include:
- Unit tests for individual components
- Integration tests for pattern matching
- Property-based tests for AST traversal correctness
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 14-183: crates/oxc/src/binding.rs#L14-L183
Added lines #L14 - L183 were not covered by tests
[warning] 185-191: crates/oxc/src/binding.rs#L185-L191
Added lines #L185 - L191 were not covered by tests
[warning] 203-224: crates/oxc/src/binding.rs#L203-L224
Added lines #L203 - L224 were not covered by tests
[warning] 226-227: crates/oxc/src/binding.rs#L226-L227
Added lines #L226 - L227 were not covered by tests
[warning] 236-270: crates/oxc/src/binding.rs#L236-L270
Added lines #L236 - L270 were not covered by tests
[warning] 272-273: crates/oxc/src/binding.rs#L272-L273
Added lines #L272 - L273 were not covered by tests
[warning] 286-330: crates/oxc/src/binding.rs#L286-L330
Added lines #L286 - L330 were not covered by tests
[warning] 333-343: crates/oxc/src/binding.rs#L333-L343
Added lines #L333 - L343 were not covered by tests
[warning] 351-358: crates/oxc/src/binding.rs#L351-L358
Added lines #L351 - L358 were not covered by tests
[warning] 360-370: crates/oxc/src/binding.rs#L360-L370
Added lines #L360 - L370 were not covered by tests
🤖 Prompt for AI Agents
In crates/oxc/src/binding.rs covering the entire module from lines 14 to 371,
there is no test coverage for the Oxc integration. To fix this, add a
comprehensive test suite that includes unit tests for the kind_to_id mapping and
other individual functions, integration tests verifying parent-child
relationships and pattern matching, tests for error handling during parsing
(including syntax errors), and edge case tests such as empty source input. This
will ensure the module's correctness and robustness.
fn root_node(&self) -> Self::Node<'_> { | ||
let root_id = self.root_id.expect("Root node not found"); | ||
let inner = self.pool.get(&root_id).expect("Root node not found"); | ||
OxcNode { | ||
inner: inner.clone(), | ||
pool: &self.pool, | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Handle missing root node gracefully instead of panicking.
Using expect()
will cause a panic if the root node is not found. Consider returning an error or a dummy node.
fn root_node(&self) -> Self::Node<'_> {
- let root_id = self.root_id.expect("Root node not found");
- let inner = self.pool.get(&root_id).expect("Root node not found");
+ let root_id = self.root_id.unwrap_or_else(|| {
+ // This should not happen with valid parsing, but handle gracefully
+ panic!("Internal error: Root node not found in parsed document");
+ });
+ let inner = self.pool.get(&root_id).unwrap_or_else(|| {
+ panic!("Internal error: Root node {} not found in pool", root_id.index());
+ });
OxcNode {
inner: inner.clone(),
pool: &self.pool,
}
}
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 360-370: crates/oxc/src/binding.rs#L360-L370
Added lines #L360 - L370 were not covered by tests
🤖 Prompt for AI Agents
In crates/oxc/src/binding.rs around lines 360 to 367, the root_node function
uses expect() which causes a panic if the root node is missing. Refactor this
function to handle the missing root node gracefully by returning a Result type
or an Option type instead of panicking. This involves changing the return type
to Result<Self::Node<'_>, ErrorType> or Option<Self::Node<'_>> and returning an
appropriate error or None when the root node is not found, avoiding the use of
expect().
"Decorator" => 164, | ||
"TSExportAssignment" => 165, | ||
"TSInstantiationExpression" => 166, | ||
_ => panic!("Unknown kind: {kind}"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace panic with proper error handling for unknown AST node kinds.
Panicking on unknown node kinds will crash the entire application. Consider returning a default value or an error to handle new/unknown node types gracefully.
- _ => panic!("Unknown kind: {kind}"),
+ _ => {
+ // Log warning and return a high ID to avoid conflicts
+ eprintln!("Warning: Unknown AST node kind: {}", kind);
+ u16::MAX
+ }
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
_ => panic!("Unknown kind: {kind}"), | |
_ => { | |
// Log warning and return a high ID to avoid conflicts | |
eprintln!("Warning: Unknown AST node kind: {}", kind); | |
u16::MAX | |
} |
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 14-183: crates/oxc/src/binding.rs#L14-L183
Added lines #L14 - L183 were not covered by tests
🤖 Prompt for AI Agents
In crates/oxc/src/binding.rs at line 183, replace the panic!("Unknown kind:
{kind}") with proper error handling by returning a Result or an Option that
indicates an error or unknown node kind instead of crashing. Modify the function
signature if needed to support error propagation, and ensure the caller can
handle the error gracefully without panicking.
fn kind(&self) -> Cow<str> { | ||
Cow::Owned(format!("{:?}", self.inner.kind_id)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return actual AST node kind names instead of debug format of kind_id.
The kind()
method returns the debug representation of the numeric ID rather than the actual node kind name, which will make pattern matching and debugging difficult.
fn kind(&self) -> Cow<str> {
- Cow::Owned(format!("{:?}", self.inner.kind_id))
+ // Convert kind_id back to string representation
+ let kind_name = match self.inner.kind_id {
+ 0 => "Program",
+ 1 => "IdentifierName",
+ 2 => "IdentifierReference",
+ // ... (use the same mapping from kind_to_id but in reverse)
+ _ => return Cow::Owned(format!("Unknown_{}", self.inner.kind_id)),
+ };
+ Cow::Borrowed(kind_name)
}
Consider creating a static array or using a macro to avoid duplicating the mapping logic.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In crates/oxc/src/binding.rs around lines 257 to 259, the kind() method
currently returns a debug string of the numeric kind_id, which is not helpful
for pattern matching or debugging. Replace this by returning the actual AST node
kind name as a string. Implement a static array or use a macro that maps kind_id
values to their corresponding node kind names, and return the appropriate name
from this mapping instead of the debug format.
Summary by CodeRabbit