-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Fix match mapping pattern #6081
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThis update implements comprehensive support for Python's mapping (dictionary) pattern matching in the compiler, virtual machine, and tests. It introduces full validation, stack management, and error handling for mapping patterns, updates stack effect calculations, and expands test coverage for dictionary pattern matching scenarios. Changes
Sequence Diagram(s)sequenceDiagram
participant Compiler
participant VM
participant Bytecode
participant TestSuite
TestSuite->>Compiler: Define mapping pattern tests
Compiler->>Bytecode: Emit MatchMapping, MatchKeys, etc.
Bytecode->>VM: Execute bytecode instructions
VM->>VM: Check if subject is mapping (MatchMapping)
VM->>VM: Retrieve keys and values (MatchKeys)
VM->>VM: Push keys and values or None to stack
VM->>TestSuite: Return match result
TestSuite->>TestSuite: Assert correctness
Estimated code review effort🎯 4 (Complex) | ⏱️ ~40 minutes Poem
Note 🔌 MCP (Model Context Protocol) integration is now available in Early Access!Pro users can now connect to remote MCP servers under the Integrations page to get reviews and chat conversations that understand additional development context. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🔭 Outside diff range comments (2)
vm/src/frame.rs (2)
682-696
: Consider safer error handling instead of unwrap().While the compiler should guarantee valid indices, using
unwrap()
can cause panics if there are bugs in the compiler or unexpected edge cases. Consider using a more defensive approach:bytecode::Instruction::CopyItem { index } => { - // CopyItem { index: 1 } copies TOS - // CopyItem { index: 2 } copies second from top - // This is 1-indexed to match CPython - let idx = index.get(arg) as usize; let value = self .state .stack .len() - .checked_sub(idx) - .map(|i| &self.state.stack[i]) - .unwrap(); + .checked_sub(index.get(arg) as usize) + .and_then(|i| self.state.stack.get(i)) + .ok_or_else(|| vm.new_runtime_error("Invalid stack index for CopyItem"))?; self.push_value(value.clone()); Ok(None) }
702-711
: Add bounds checking for stack operations.The current implementation could panic if
index_val
is greater than the stack length. Consider adding bounds checking:bytecode::Instruction::Swap { index } => { let len = self.state.stack.len(); + let index_val = index.get(arg) as usize; + if index_val > len { + return Err(vm.new_runtime_error("Swap index out of bounds")); + } let i = len - 1; // TOS index - let index_val = index.get(arg) as usize; - // CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n] - // This means swap TOS with the element at index (len - n) let j = len - index_val; self.state.stack.swap(i, j); Ok(None) }
🧹 Nitpick comments (1)
compiler/codegen/src/compile.rs (1)
3499-3675
: Well-implemented mapping pattern matching with appropriate error handlingThe implementation is comprehensive and handles the core functionality correctly:
- Proper validation of keys and patterns
- Correct stack management throughout
- Appropriate error messages for unsupported features
Note that **rest patterns are not yet supported, but the error message clearly indicates this limitation.
Consider improving the duplicate key detection mechanism. Currently using
format!("{key:?}")
for comparison (line 3583) might have edge cases where different keys produce the same debug representation. A more robust approach would be to use the actual constant values for comparison when possible.
📜 Review details
Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
compiler/codegen/src/compile.rs
(5 hunks)compiler/core/src/bytecode.rs
(1 hunks)extra_tests/snippets/syntax_match.py
(3 hunks)vm/src/frame.rs
(3 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
**/*.py
: Follow PEP 8 style for custom Python code
Use ruff for linting Python code
Files:
extra_tests/snippets/syntax_match.py
**/*.rs
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
**/*.rs
: Follow the default rustfmt code style (cargo fmt
to format)
Always run clippy to lint code (cargo clippy
) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass
,pymodule
,pyfunction
, etc.) when implementing Python functionality in Rust
Files:
compiler/core/src/bytecode.rs
vm/src/frame.rs
compiler/codegen/src/compile.rs
🧠 Learnings (1)
📚 Learning: 2025-07-19T03:16:56.511Z
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
compiler/codegen/src/compile.rs
🔇 Additional comments (10)
compiler/core/src/bytecode.rs (1)
1426-1427
: LGTM! Stack effects properly updated for pattern matching instructions.The stack effect calculations are correct and well-documented:
MatchMapping
andMatchSequence
now properly reflect pushing a boolean result (+1)MatchKeys
correctly accounts for popping 2 items and pushing 3 items (net +1)The inline comments clearly explain the stack behavior, which aligns with the VM implementation changes mentioned in the AI summary.
extra_tests/snippets/syntax_match.py (4)
57-71
: LGTM! Well-structured basic mapping pattern tests.The basic dictionary pattern matching tests are comprehensive and well-designed:
- Single key extraction test validates fundamental functionality
- Multiple key extraction test ensures proper handling of complex patterns
- Proper use of fallback cases with
assert False
to catch unexpected behavior- Clear test data and assertions
72-79
: Good approach to handling unimplemented features.The commented-out rest pattern tests are appropriately handled:
- Clear TODO comment indicating the **rest pattern is not yet implemented
- Test structure is preserved for future implementation
- Doesn't interfere with current functionality
This is a good practice for incremental feature development.
80-98
: Excellent coverage of wildcard fallback scenarios.The wildcard fallback tests provide valuable coverage:
- Tests both successful pattern matching and fallback behavior
- Reproduces a real-world issue (wheelinfo.py), which is great for regression testing
- Clear variable naming and comprehensive assertions
- Proper handling of non-matching scenarios
These tests strengthen the overall test suite significantly.
100-124
: Well-structured comprehensive test function.The comprehensive test function demonstrates good testing practices:
- Clear separation of test cases within a dedicated function
- Descriptive assertions with helpful error messages for debugging
- Covers both single and multiple key capture scenarios
- Immediate execution ensures tests are actually run
The success message provides good feedback when tests pass.
vm/src/frame.rs (2)
1274-1281
: LGTM! Correct implementation of mapping pattern matching.The implementation correctly preserves the subject on the stack while checking if it's a mapping, which aligns with CPython's pattern matching semantics. The use of
PyMapping::check
is appropriate for detecting mapping protocol compliance.
1282-1289
: LGTM! Consistent implementation with MatchMapping.The sequence checking logic correctly mirrors the mapping pattern matching approach, properly preserving the subject while determining if it supports the sequence protocol.
compiler/codegen/src/compile.rs (3)
3153-3157
: Good documentation addition!The comment clearly explains the rotation algorithm using swaps, which helps future maintainers understand this non-obvious implementation.
3463-3468
: Correct fix for the comparison operatorThe change from
Is
toIsNot
properly checks if MATCH_CLASS succeeded. The logic now correctly jumps to fail when the result is None.
3920-3922
: Correct activation of mapping pattern supportEnabling the
Pattern::MatchMapping
case properly integrates the new implementation into the pattern matching system.
bytecode::Instruction::MatchKeys => { | ||
// Typically we pop a sequence of keys first | ||
let _keys = self.pop_value(); | ||
// Pop keys tuple and subject | ||
let keys_tuple = self.pop_value(); | ||
let subject = self.pop_value(); | ||
|
||
// Check if subject is a dict (or mapping) and all keys match | ||
if let Ok(_dict) = subject.downcast::<PyDict>() { | ||
// Example: gather the values corresponding to keys | ||
// If keys match, push the matched values & success | ||
self.push_value(vm.ctx.new_bool(true).into()); | ||
// Push the subject back first | ||
self.push_value(subject.clone()); | ||
|
||
// Check if subject is a mapping and extract values for keys | ||
if PyMapping::check(&subject) { | ||
let keys = keys_tuple.downcast_ref::<PyTuple>().unwrap(); | ||
let mut values = Vec::new(); | ||
let mut all_match = true; | ||
|
||
for key in keys { | ||
match subject.get_item(key.as_object(), vm) { | ||
Ok(value) => values.push(value), | ||
Err(_) => { | ||
all_match = false; | ||
break; | ||
} | ||
} | ||
} | ||
|
||
if all_match { | ||
// Push keys_or_none (the original keys) and values_or_none | ||
// Keys should remain as they were for potential **rest handling | ||
self.push_value(keys_tuple); // keys_or_none | ||
self.push_value(vm.ctx.new_tuple(values).into()); // values_or_none | ||
} else { | ||
// No match - push None twice | ||
self.push_value(vm.ctx.none()); | ||
self.push_value(vm.ctx.none()); | ||
} | ||
} else { | ||
// Push a placeholder to indicate no match | ||
self.push_value(vm.ctx.new_bool(false).into()); | ||
// Not a mapping - push None twice | ||
self.push_value(vm.ctx.none()); | ||
self.push_value(vm.ctx.none()); | ||
} | ||
Ok(None) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Complex but correct MatchKeys implementation - consider safer error handling.
The implementation correctly handles all the pattern matching scenarios for dictionary keys, but there are a few areas for improvement:
bytecode::Instruction::MatchKeys => {
- // Pop keys tuple and subject
- let keys_tuple = self.pop_value();
+ let keys_tuple = self.pop_value();
let subject = self.pop_value();
- // Push the subject back first
self.push_value(subject.clone());
- // Check if subject is a mapping and extract values for keys
if PyMapping::check(&subject) {
- let keys = keys_tuple.downcast_ref::<PyTuple>().unwrap();
+ let keys = keys_tuple.downcast_ref::<PyTuple>()
+ .ok_or_else(|| vm.new_runtime_error("MatchKeys expects tuple of keys"))?;
let mut values = Vec::new();
let mut all_match = true;
for key in keys {
match subject.get_item(key.as_object(), vm) {
Ok(value) => values.push(value),
Err(_) => {
all_match = false;
break;
}
}
}
if all_match {
- // Push keys_or_none (the original keys) and values_or_none
- // Keys should remain as they were for potential **rest handling
- self.push_value(keys_tuple); // keys_or_none
- self.push_value(vm.ctx.new_tuple(values).into()); // values_or_none
+ self.push_value(keys_tuple);
+ self.push_value(vm.ctx.new_tuple(values).into());
} else {
- // No match - push None twice
self.push_value(vm.ctx.none());
self.push_value(vm.ctx.none());
}
} else {
- // Not a mapping - push None twice
self.push_value(vm.ctx.none());
self.push_value(vm.ctx.none());
}
Ok(None)
}
The logic is sound, but replacing unwrap()
with proper error handling makes it more robust against unexpected compiler output.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
bytecode::Instruction::MatchKeys => { | |
// Typically we pop a sequence of keys first | |
let _keys = self.pop_value(); | |
// Pop keys tuple and subject | |
let keys_tuple = self.pop_value(); | |
let subject = self.pop_value(); | |
// Check if subject is a dict (or mapping) and all keys match | |
if let Ok(_dict) = subject.downcast::<PyDict>() { | |
// Example: gather the values corresponding to keys | |
// If keys match, push the matched values & success | |
self.push_value(vm.ctx.new_bool(true).into()); | |
// Push the subject back first | |
self.push_value(subject.clone()); | |
// Check if subject is a mapping and extract values for keys | |
if PyMapping::check(&subject) { | |
let keys = keys_tuple.downcast_ref::<PyTuple>().unwrap(); | |
let mut values = Vec::new(); | |
let mut all_match = true; | |
for key in keys { | |
match subject.get_item(key.as_object(), vm) { | |
Ok(value) => values.push(value), | |
Err(_) => { | |
all_match = false; | |
break; | |
} | |
} | |
} | |
if all_match { | |
// Push keys_or_none (the original keys) and values_or_none | |
// Keys should remain as they were for potential **rest handling | |
self.push_value(keys_tuple); // keys_or_none | |
self.push_value(vm.ctx.new_tuple(values).into()); // values_or_none | |
} else { | |
// No match - push None twice | |
self.push_value(vm.ctx.none()); | |
self.push_value(vm.ctx.none()); | |
} | |
} else { | |
// Push a placeholder to indicate no match | |
self.push_value(vm.ctx.new_bool(false).into()); | |
// Not a mapping - push None twice | |
self.push_value(vm.ctx.none()); | |
self.push_value(vm.ctx.none()); | |
} | |
Ok(None) | |
} | |
bytecode::Instruction::MatchKeys => { | |
- // Pop keys tuple and subject | |
- let keys_tuple = self.pop_value(); | |
+ let keys_tuple = self.pop_value(); | |
let subject = self.pop_value(); | |
- // Push the subject back first | |
self.push_value(subject.clone()); | |
- // Check if subject is a mapping and extract values for keys | |
if PyMapping::check(&subject) { | |
- let keys = keys_tuple.downcast_ref::<PyTuple>().unwrap(); | |
+ let keys = keys_tuple.downcast_ref::<PyTuple>() | |
+ .ok_or_else(|| vm.new_runtime_error("MatchKeys expects tuple of keys"))?; | |
let mut values = Vec::new(); | |
let mut all_match = true; | |
for key in keys { | |
match subject.get_item(key.as_object(), vm) { | |
Ok(value) => values.push(value), | |
Err(_) => { | |
all_match = false; | |
break; | |
} | |
} | |
} | |
if all_match { | |
- self.push_value(keys_tuple); // keys_or_none | |
- self.push_value(vm.ctx.new_tuple(values).into()); // values_or_none | |
+ self.push_value(keys_tuple); | |
+ self.push_value(vm.ctx.new_tuple(values).into()); | |
} else { | |
- // No match - push None twice | |
self.push_value(vm.ctx.none()); | |
self.push_value(vm.ctx.none()); | |
} | |
} else { | |
- // Not a mapping - push None twice | |
self.push_value(vm.ctx.none()); | |
self.push_value(vm.ctx.none()); | |
} | |
Ok(None) | |
} |
🤖 Prompt for AI Agents
In vm/src/frame.rs around lines 1290 to 1330, the use of unwrap() on
keys_tuple.downcast_ref::<PyTuple>() can cause a panic if the downcast fails.
Replace unwrap() with proper error handling by checking if the downcast returns
Some and handling the None case gracefully, such as returning an error or
pushing None values as fallback. This will make the MatchKeys implementation
more robust and prevent unexpected panics.
c77c00b
to
1241a6e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
compiler/codegen/src/compile.rs (2)
3499-3675
: Consider extracting helper functions to improve readability.The
compile_pattern_mapping
function is quite long (176 lines) and handles multiple responsibilities. Consider extracting helper functions for:
- Key validation and duplicate checking (lines 3566-3599)
- Keys tuple building (lines 3570-3609)
- Pattern processing (lines 3636-3653)
This would improve maintainability and testability.
3662-3668
: **TODO for rest pattern is properly handled.The implementation correctly returns an error for the unimplemented
**rest
pattern feature. When implementing this feature, you'll need to handle:
- Building a dictionary of remaining items
- Using
BUILD_MAP
andDICT_UPDATE
instructions- Proper stack management for the rest dictionary
Would you like me to help implement the **rest pattern support or create an issue to track this enhancement?
📜 Review details
Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
compiler/codegen/src/compile.rs
(5 hunks)compiler/core/src/bytecode.rs
(1 hunks)extra_tests/snippets/syntax_match.py
(3 hunks)vm/src/frame.rs
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- extra_tests/snippets/syntax_match.py
- compiler/core/src/bytecode.rs
- vm/src/frame.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
**/*.rs
: Follow the default rustfmt code style (cargo fmt
to format)
Always run clippy to lint code (cargo clippy
) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass
,pymodule
,pyfunction
, etc.) when implementing Python functionality in Rust
Files:
compiler/codegen/src/compile.rs
🧠 Learnings (1)
📚 Learning: 2025-07-19T03:16:56.511Z
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
compiler/codegen/src/compile.rs
🔇 Additional comments (5)
compiler/codegen/src/compile.rs (5)
29-32
: LGTM! Necessary imports added for mapping pattern support.The added imports (
PatternMatchMapping
,PatternMatchOr
,PatternMatchSequence
, etc.) are required for the new mapping pattern compilation functionality.
3153-3157
: Good documentation addition for the rotation logic.The comment clearly explains how the stack rotation is implemented through a series of swaps, making the code more maintainable.
3463-3468
: Better code consistency with explicit instruction format.Using the explicit
Instruction::TestOperation
format improves readability and consistency with other instruction emissions in the codebase.
3499-3675
: Comprehensive implementation of mapping pattern matching!The implementation correctly handles all the required aspects of mapping patterns including validation, stack management, and error handling. The code properly maintains the stack state throughout the operation.
3566-3590
: Consider the differences between HashSet and Python's PySet for duplicate detection.The current implementation uses
HashSet<String>
withformat!("{key:?}")
for duplicate key detection, which may not exactly match CPython's behavior. For example, different AST nodes that represent the same value might produce different debug representations.Consider if this difference is acceptable or if a more sophisticated comparison is needed.
Would this allow us to do something like: match x:
case str(s):
print(s) ? or this is more complicated? |
That case look like a class case. I mostly edited mapping part. So probably not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
that is handled by |
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Documentation