Skip to content

SymbolTable::varnames, fblock #5948

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 11, 2025
Merged

SymbolTable::varnames, fblock #5948

merged 5 commits into from
Jul 11, 2025

Conversation

youknowone
Copy link
Member

@youknowone youknowone commented Jul 11, 2025

Summary by CodeRabbit

  • New Features

    • Improved handling of loop control flow (break and continue) for more accurate and robust code generation.
    • Enhanced tracking of variable names in definition order within each scope.
  • Refactor

    • Consolidated code object metadata into a dedicated structure for better organization and maintainability.
    • Introduced a more structured mechanism for managing nested control flow blocks during code generation.

Copy link
Contributor

coderabbitai bot commented Jul 11, 2025

Warning

Rate limit exceeded

@youknowone has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 19 minutes and 35 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 772e213 and ab14160.

📒 Files selected for processing (2)
  • compiler/codegen/src/compile.rs (25 hunks)
  • compiler/codegen/src/ir.rs (5 hunks)

Walkthrough

The changes introduce a structured mechanism for managing control flow blocks (such as loops and exception handlers) during code generation by adding an FBlockType enum and FBlockInfo struct, along with stack management methods. Code unit metadata is consolidated into a new CodeUnitMetadata struct, and variable name definition order tracking is added to the symbol table.

Changes

File(s) Change Summary
compiler/codegen/src/compile.rs Added FBlockType enum, FBlockInfo struct, and MAXBLOCKS constant; implemented push_fblock and pop_fblock methods; refactored control flow handling to use the new fblock stack; reorganized code object metadata into metadata field; updated loop and break/continue logic.
compiler/codegen/src/ir.rs Introduced CodeUnitMetadata struct consolidating code unit metadata; refactored CodeInfo to use metadata field; updated related methods and trait implementations to use the new structure; added fields for static attributes, inlined comprehensions, and fblock stack.
compiler/codegen/src/symboltable.rs Added varnames field to SymbolTable to track variable definition order; updated SymbolTableBuilder to collect and assign varnames per scope during symbol registration and scope transitions.

Sequence Diagram(s)

sequenceDiagram
    participant Compiler
    participant CodeInfo
    participant FBlockStack
    participant SymbolTable

    Compiler->>CodeInfo: Start compiling code unit
    CodeInfo->>SymbolTable: Retrieve varnames in definition order
    CodeInfo->>FBlockStack: push_fblock (on entering loop/try)
    loop For each statement
        Compiler->>FBlockStack: push_fblock/pop_fblock as control flow changes
        Compiler->>FBlockStack: Search for enclosing loop on break/continue
    end
    CodeInfo->>CodeUnitMetadata: Update metadata (names, consts, etc.)
    CodeInfo->>Compiler: Finalize code object
Loading

Possibly related PRs

  • PyCode::qualname #5929: The main PR reorganizes internal metadata by introducing a CodeUnitMetadata struct that includes a qualname field and updates CodeInfo to use this new structure, while the retrieved PR specifically adds and manages the qualname field in CodeInfo and related compilation logic; thus, both PRs modify the same qualname field within CodeInfo and its handling in the compiler code generation, making their changes directly related.

Poem

In the warren of code, new blocks take their place,
FBlocks now stack with elegant grace.
Metadata gathered, all neat in a row,
Varnames in order, like seeds we now sow.
Loop and break, with structure anew—
This bunny hops forward, compiling for you! 🐇

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
compiler/codegen/src/compile.rs (2)

1150-1205: Consider refactoring duplicated loop-finding logic.

The Break and Continue statement handlers have nearly identical code for finding the innermost loop in the fblock stack. Consider extracting this into a helper method to reduce duplication.

+fn find_innermost_loop(&self) -> Option<(BlockIdx, BlockIdx)> {
+    let code = self.current_code_info();
+    for i in (0..code.fblock.len()).rev() {
+        match code.fblock[i].fb_type {
+            FBlockType::WhileLoop | FBlockType::ForLoop => {
+                return Some((code.fblock[i].fb_block, code.fblock[i].fb_exit));
+            }
+            _ => continue,
+        }
+    }
+    None
+}

 Stmt::Break(_) => {
-    // Find the innermost loop in fblock stack
-    let found_loop = {
-        let code = self.current_code_info();
-        let mut result = None;
-        for i in (0..code.fblock.len() as usize).rev() {
-            match code.fblock[i].fb_type {
-                FBlockType::WhileLoop | FBlockType::ForLoop => {
-                    result = Some(code.fblock[i].fb_exit);
-                    break;
-                }
-                _ => continue,
-            }
-        }
-        result
-    };
-
-    match found_loop {
-        Some(exit_block) => {
+    match self.find_innermost_loop() {
+        Some((_, exit_block)) => {
             emit!(self, Instruction::Break { target: exit_block });
         }
         None => {
             return Err(
                 self.error_ranged(CodegenErrorType::InvalidBreak, statement.range())
             );
         }
     }
 }
 Stmt::Continue(_) => {
-    // Find the innermost loop in fblock stack
-    let found_loop = {
-        let code = self.current_code_info();
-        let mut result = None;
-        for i in (0..code.fblock.len() as usize).rev() {
-            match code.fblock[i].fb_type {
-                FBlockType::WhileLoop | FBlockType::ForLoop => {
-                    result = Some(code.fblock[i].fb_block);
-                    break;
-                }
-                _ => continue,
-            }
-        }
-        result
-    };
-
-    match found_loop {
-        Some(loop_block) => {
+    match self.find_innermost_loop() {
+        Some((loop_block, _)) => {
             emit!(self, Instruction::Continue { target: loop_block });
         }
         None => {
             return Err(
                 self.error_ranged(CodegenErrorType::InvalidContinue, statement.range())
             );
         }
     }
 }

1155-1155: Remove unnecessary cast to usize.

The expression code.fblock.len() as usize has a redundant cast since Vec::len() already returns usize.

-for i in (0..code.fblock.len() as usize).rev() {
+for i in (0..code.fblock.len()).rev() {

Also applies to: 1183-1183

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c3967bf and eb0d23c.

📒 Files selected for processing (3)
  • compiler/codegen/src/compile.rs (25 hunks)
  • compiler/codegen/src/ir.rs (5 hunks)
  • compiler/codegen/src/symboltable.rs (7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.rs`: Follow the default rustfmt code style (`cargo fmt` to format) Always ...

**/*.rs: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

📄 Source: CodeRabbit Inference Engine (.github/copilot-instructions.md)

List of files the instruction was applied to:

  • compiler/codegen/src/symboltable.rs
  • compiler/codegen/src/ir.rs
  • compiler/codegen/src/compile.rs
🧠 Learnings (1)
compiler/codegen/src/compile.rs (1)

undefined

<retrieved_learning>
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-06-30T10:08:48.858Z
Learning: Applies to **/*.rs : Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust
</retrieved_learning>

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Check Rust code with rustfmt and clippy
🔇 Additional comments (8)
compiler/codegen/src/symboltable.rs (1)

49-51: Good addition for tracking variable definition order

The varnames field properly captures the requirement to store variable names in definition order. The integration with the symbol table building process is well-designed.

compiler/codegen/src/ir.rs (4)

10-26: Well-structured metadata consolidation

The CodeUnitMetadata struct effectively consolidates all code unit-related metadata fields that were previously scattered in CodeInfo. The field names are clear and the comment references to the CPython equivalents (u_name, u_consts, etc.) help with understanding the mapping.


94-103: Clean refactoring with useful additions

The refactoring successfully consolidates metadata into a single field while adding new functionality:

  • fblock stack for tracking nested control structures (loops, exception handlers)
  • static_attributes for class scope attribute tracking
  • in_inlined_comp flag for comprehension context

All these additions align well with the PR objectives for enhanced control flow management.


211-239: Correct migration to metadata access

The cell2arg() method has been properly updated to access all required fields from the metadata struct. The logic remains unchanged while the field access is now properly scoped.


330-347: Consistent InstrDisplayContext implementation

All methods in the InstrDisplayContext implementation have been correctly updated to fetch data from the metadata field. The implementation maintains the same interface while adapting to the new structure.

compiler/codegen/src/compile.rs (3)

18-43: Well-structured control flow block types.

The FBlockType enum and FBlockInfo struct provide a clean abstraction for managing different control flow contexts. The design aligns well with CPython's approach while adapting to RustPython's needs.


2129-2130: Correct integration of fblock tracking in loops.

The while and for loop implementations properly push and pop fblocks at the appropriate points, ensuring the control flow stack remains consistent.

Also applies to: 2144-2146, 2258-2259, 2283-2285, 2299-2301


337-354: Clean metadata organization and proper initialization.

The consolidation of metadata into CodeUnitMetadata improves code organization. The initialization of varname_cache from SymbolTable::varnames ensures variable names are properly tracked in definition order.

Also applies to: 438-440, 467-474

Comment on lines +1579 to 1586
// Named expressions in comprehensions might also be locals
if symbol.scope == SymbolScope::Local {
let name_str = symbol.name.clone();
if !self.current_varnames.contains(&name_str) {
self.current_varnames.push(name_str);
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Same timing issue for named expressions in comprehensions

This code has the same issue as above - checking symbol.scope == SymbolScope::Local during registration won't work correctly since scopes are determined later.

🤖 Prompt for AI Agents
In compiler/codegen/src/symboltable.rs around lines 1579 to 1586, the code
checks if symbol.scope equals SymbolScope::Local during registration, but this
is premature because symbol scopes are assigned later. To fix this, remove or
defer the check for symbol.scope == SymbolScope::Local until after scopes have
been fully determined, ensuring that the logic depending on local scope is
executed only when the scope information is accurate.

Comment on lines +1568 to 1576
// Local variables (assigned) are added to varnames if they are local scope
// and not already in varnames
if symbol.scope == SymbolScope::Local {
let name_str = symbol.name.clone();
if !self.current_varnames.contains(&name_str) {
self.current_varnames.push(name_str);
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Potential timing issue with scope determination

The check if symbol.scope == SymbolScope::Local may not work as expected because the symbol's scope might still be SymbolScope::Unknown at registration time. The scope is determined later during the analyze_symbol_table phase.

This could result in some local variables not being added to varnames.

Consider either:

  1. Deferring the population of varnames until after the analysis phase when scopes are finalized
  2. Adding all assigned variables to varnames and filtering out non-locals later
  3. Moving this logic to the analyze_symbol method where scopes are determined
🤖 Prompt for AI Agents
In compiler/codegen/src/symboltable.rs around lines 1568 to 1576, the current
check for symbol.scope being Local may fail because scope is Unknown at this
point. To fix this, move the logic that adds assigned local variables to
current_varnames into the analyze_symbol method where symbol scopes are
finalized, ensuring only confirmed local variables are added after scope
determination.

Comment on lines 509 to 515
// = compiler_pop_fblock
fn pop_fblock(&mut self, _expected_type: FBlockType) -> FBlockInfo {
let code = self.current_code_info();
let fblock = code.fblock.pop().expect("fblock stack underflow");
// TODO: Add assertion to check expected type matches
// assert!(matches!(fblock.fb_type, expected_type));
fblock
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Complete the fblock type validation.

The pop_fblock method accepts an _expected_type parameter but doesn't validate it. This could lead to subtle bugs if blocks are popped in the wrong order.

Consider implementing the type check as indicated by the TODO:

 fn pop_fblock(&mut self, _expected_type: FBlockType) -> FBlockInfo {
     let code = self.current_code_info();
     let fblock = code.fblock.pop().expect("fblock stack underflow");
-    // TODO: Add assertion to check expected type matches
-    // assert!(matches!(fblock.fb_type, expected_type));
+    assert!(
+        matches!(fblock.fb_type, _expected_type),
+        "Expected {:?} but got {:?}",
+        _expected_type,
+        fblock.fb_type
+    );
     fblock
 }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In compiler/codegen/src/compile.rs around lines 509 to 516, the pop_fblock
method takes an _expected_type parameter but does not validate that the popped
fblock's type matches it. To fix this, implement an assertion that checks if
fblock.fb_type matches the expected_type parameter before returning fblock. This
ensures the block types are consistent and prevents subtle bugs from popping
blocks in the wrong order.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
compiler/codegen/src/compile.rs (1)

486-515: The TODO comment for type validation should be addressed.

The pop_fblock method accepts an _expected_type parameter but doesn't validate it against the actual block type being popped. This could lead to subtle bugs if blocks are popped in the wrong order.

The previous review comment about implementing type validation is still applicable and should be addressed.

🧹 Nitpick comments (1)
compiler/codegen/src/compile.rs (1)

2132-2134: Consider removing redundant loop_data context.

The code still maintains the old loop_data context alongside the new fblock system. This appears redundant since the fblock stack now handles loop context tracking. Consider removing the old loop_data mechanism to avoid confusion and potential inconsistencies.

- let was_in_loop = self.ctx.loop_data.replace((while_block, after_block));
  self.compile_statements(body)?;
- self.ctx.loop_data = was_in_loop;

Also applies to: 2291-2293

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eb0d23c and 772e213.

📒 Files selected for processing (2)
  • compiler/codegen/src/compile.rs (25 hunks)
  • compiler/codegen/src/ir.rs (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • compiler/codegen/src/ir.rs
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.rs`: Follow the default rustfmt code style (`cargo fmt` to format) Always ...

**/*.rs: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

📄 Source: CodeRabbit Inference Engine (.github/copilot-instructions.md)

List of files the instruction was applied to:

  • compiler/codegen/src/compile.rs
🧠 Learnings (1)
compiler/codegen/src/compile.rs (1)

undefined

<retrieved_learning>
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-06-30T10:08:48.858Z
Learning: Applies to **/*.rs : Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust
</retrieved_learning>

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Check Rust code with rustfmt and clippy
🔇 Additional comments (6)
compiler/codegen/src/compile.rs (6)

18-43: LGTM: Well-structured control flow block management.

The FBlockType enum and FBlockInfo struct provide a clear and comprehensive way to manage different types of control flow blocks during compilation. The enum covers all necessary block types including loops, exception handlers, and async constructs.


1149-1204: Excellent implementation of break/continue with fblock stack.

The new break and continue statement handling correctly searches the fblock stack from top to bottom to find the nearest enclosing loop. Using fb_exit for break statements and fb_block for continue statements is the correct approach.


2127-2144: Good integration of fblock management in while loops.

The while loop compilation correctly pushes and pops the fblock around the loop body, enabling proper break/continue handling.


2255-2299: Proper fblock management for for loops.

The for loop compilation (both sync and async versions) correctly integrates fblock management, matching the pattern used in while loops.


337-353: Good consolidation of metadata into CodeUnitMetadata.

The refactoring to consolidate code unit metadata fields into a single CodeUnitMetadata struct improves code organization and makes the codebase more maintainable.

Also applies to: 453-473


279-280: Proper initialization of inlined comprehension flag.

The in_inlined_comp field is correctly initialized and will be properly managed for comprehension contexts.

@youknowone youknowone force-pushed the fblock branch 2 times, most recently from beb9432 to 2a8ee18 Compare July 11, 2025 13:40
@youknowone youknowone merged commit c4234c1 into RustPython:main Jul 11, 2025
11 checks passed
@youknowone youknowone deleted the fblock branch July 11, 2025 13:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant