Renewd DictUpdate instruction #6085

Merged
merged 1 commit into from
Aug 8, 2025

Conversation

youknowone
Member

@youknowone youknowone commented Aug 8, 2025

Summary by CodeRabbit

  • New Features

    • The dictionary update operation now supports specifying which dictionary on the stack to update using an index argument, enabling more flexible updates.
  • Bug Fixes

    • Improved handling of dictionary updates to ensure correct merging of mappings based on the provided index.
  • Refactor

    • Updated the internal structure of the dictionary update instruction to include an index parameter for consistency and clarity.

Contributor

coderabbitai bot commented Aug 8, 2025

Walkthrough

The changes update the DictUpdate instruction across the compiler, bytecode, and VM execution layers to accept an explicit index argument. This involves modifying the instruction's definition, updating its emission in the compiler, and changing its handling in the VM to use the provided index when updating dictionaries on the stack.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Compiler Emission Update: compiler/codegen/src/compile.rs | The DictUpdate instruction is now emitted with an explicit index: 1 argument instead of no arguments. |
| Instruction Definition & Usage: compiler/core/src/bytecode.rs | The Instruction::DictUpdate variant is changed from a unit to a struct with an index field; all pattern matches and formatting updated accordingly. |
| VM Execution Logic: vm/src/frame.rs | The VM now handles DictUpdate with an index argument, using it to determine which stack dictionary to update and merging items from the source mapping. |
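
To make the unit-to-struct change concrete, here is a minimal sketch. It is not the RustPython definition: the enum is cut down to two variants, a plain `u32` stands in for `Arg<u32>`, and `describe` and `stack_effect` are hypothetical helpers that only mirror the disassembly format and stack effect mentioned in the review.

```rust
// Hypothetical, simplified stand-in for the enum in compiler/core/src/bytecode.rs.
// Assumption: a plain u32 replaces the real `Arg<u32>` wrapper.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Instruction {
    // Before this PR the variant was a unit: `DictUpdate,`
    DictUpdate { index: u32 },
    // Placeholder variant so the matches below have a second arm.
    Pop,
}

/// Render an instruction roughly the way a disassembler might print it.
fn describe(instr: Instruction) -> String {
    match instr {
        // Struct variants must be destructured with braces in every match.
        Instruction::DictUpdate { index } => format!("DictUpdate({index})"),
        Instruction::Pop => "Pop".to_string(),
    }
}

/// Net change in stack height caused by executing the instruction.
fn stack_effect(instr: Instruction) -> i32 {
    match instr {
        // Only the source mapping is popped; the target dict stays on the stack.
        Instruction::DictUpdate { .. } => -1,
        Instruction::Pop => -1,
    }
}
```

With these definitions, `describe(Instruction::DictUpdate { index: 1 })` yields the string `DictUpdate(1)`, and every existing `Instruction::DictUpdate =>` arm would have to become `Instruction::DictUpdate { .. } =>` to keep compiling.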

Sequence Diagram(s)

sequenceDiagram
    participant Compiler
    participant Bytecode
    participant VM
    participant Stack
    participant PyDict

    Compiler->>Bytecode: Emit DictUpdate { index }
    Bytecode->>VM: Execute DictUpdate { index }
    VM->>Stack: Pop source mapping
    VM->>Stack: Access target dict at TOS-(index-1)
    VM->>PyDict: Merge items from source mapping

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~15 minutes

Poem

In bytecode fields where rabbits hop,
A DictUpdate now makes a stop—
With index in paw, it leaps with glee,
To update the stack more cleverly.
Dictionaries merge, mappings unite,
In code’s spring meadow, all feels right!
🐇✨

@youknowone youknowone changed the title from "New DictUpdate instruction" to "Renewd DictUpdate instruction" Aug 8, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
compiler/codegen/src/compile.rs (1)

3588-3591: Stale comment references old DictUpdate API

The commented-out compile_pattern_mapping stub still shows DictUpdate { size: 2 }. When reviving this code, update it to the new index form to match the VM semantics and the change made above.

vm/src/frame.rs (2)

808-821: Simplify index-to-depth computation (avoid branch, clarify intent)

Use saturating_sub to derive the depth after popping the source. This removes branching and cleanly supports idx==0 or 1 mapping to TOS.

-                let idx = index.get(arg);
-
-                // Pop the source from TOS
-                let source = self.pop_value();
-
-                // Get the dict to update (it's now at TOS-(i-1) after popping source)
-                let dict = if idx <= 1 {
-                    // DICT_UPDATE 0 or 1: dict is at TOS (after popping source)
-                    self.top_value()
-                } else {
-                    // DICT_UPDATE n: dict is at TOS-(n-1)
-                    self.nth_value(idx - 1)
-                };
+                let idx = index.get(arg);
+                // Pop the source from TOS
+                let source = self.pop_value();
+                // After popping, dict is at depth idx.saturating_sub(1):
+                // idx in {0,1} => depth 0 (TOS); idx >= 2 => depth (idx-1)
+                let depth = idx.saturating_sub(1);
+                let dict = self.nth_value(depth);
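
The equivalence claimed by this suggestion can be checked in isolation. The sketch below assumes that `top_value()` is the same as `nth_value(0)`, which the diff implies but does not show; `depth_branching` and `depth_saturating` are hypothetical names for the two forms.

```rust
/// Depth of the target dict after the source has been popped (branching form,
/// mirroring the removed lines of the diff).
fn depth_branching(idx: u32) -> u32 {
    if idx <= 1 {
        0 // idx 0 or 1: the dict is at TOS
    } else {
        idx - 1 // idx n: the dict is at TOS-(n-1)
    }
}

/// The same computation without a branch (mirroring the added lines).
fn depth_saturating(idx: u32) -> u32 {
    // saturating_sub clamps at 0 instead of underflowing when idx == 0.
    idx.saturating_sub(1)
}
```

Both functions agree for every `u32`, including `0` and `u32::MAX`, so the rewrite changes only the shape of the code, not its behavior.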

803-821: Document/align index semantics with bytecode spec

Current code treats idx==0 and idx==1 equivalently (dict at TOS after pop). If the bytecode spec is 1-based (compiler currently emits 1), consider:

  • Explicitly documenting the valid range and meaning in the instruction definition, or
  • Asserting expected range in debug builds for early detection of emitter/VM mismatches.
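
The second option might look like the sketch below. It assumes the bytecode spec is 1-based, which the review raises as a possibility but does not confirm; `checked_depth` is a hypothetical helper, not an existing function in vm/src/frame.rs.

```rust
/// Hypothetical helper: convert a DictUpdate index into a stack depth,
/// flagging unexpected values in debug builds only.
fn checked_depth(idx: u32) -> u32 {
    // Assumption: the emitter only ever produces 1-based indices.
    // debug_assert! compiles to nothing in release builds.
    debug_assert!(
        idx >= 1,
        "DictUpdate index is expected to be 1-based, got {}",
        idx
    );
    idx.saturating_sub(1)
}
```

Release builds keep the lenient behavior (index 0 still maps to TOS), while debug builds surface an emitter/VM mismatch at the point where it happens.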
📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 33ea50c and d56fcd0.

📒 Files selected for processing (3)
  • compiler/codegen/src/compile.rs (1 hunks)
  • compiler/core/src/bytecode.rs (3 hunks)
  • vm/src/frame.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • compiler/core/src/bytecode.rs
  • compiler/codegen/src/compile.rs
  • vm/src/frame.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
🔇 Additional comments (5)
compiler/codegen/src/compile.rs (2)

4399-4402: DictUpdate index=1 matches the stack shape; LGTM

At this point TOS is the mapping (from **expr) and the target dict is one below it. Using index: 1 aligns with the updated bytecode and VM semantics.
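
The stack shape described here can be modeled outside the VM. The sketch below is not RustPython code: `Value`, `dict_update`, and the string errors are hypothetical stand-ins, and it assumes the 1-based index convention that the compiler emits.

```rust
use std::collections::HashMap;

/// Hypothetical stack value; the real VM stores PyObjectRef.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Dict(HashMap<String, i64>),
    Int(i64),
}

/// Model of DictUpdate: pop the source mapping, then merge it into the dict
/// found `index - 1` slots below the new top of stack.
fn dict_update(stack: &mut Vec<Value>, index: usize) -> Result<(), String> {
    // The source mapping is on top of the stack.
    let source = match stack.pop() {
        Some(Value::Dict(d)) => d,
        Some(_) => return Err("object is not a mapping".to_string()),
        None => return Err("stack underflow".to_string()),
    };
    // After the pop, index 1 refers to the new top of stack.
    let depth = index.saturating_sub(1);
    if depth >= stack.len() {
        return Err("stack underflow".to_string());
    }
    let pos = stack.len() - 1 - depth;
    match &mut stack[pos] {
        Value::Dict(target) => {
            // Keys from the source overwrite existing keys in the target.
            target.extend(source);
            Ok(())
        }
        _ => Err("target is not a dict".to_string()),
    }
}
```

Starting from a stack of `[target, source]`, calling `dict_update(&mut stack, 1)` leaves a single merged dict on the stack, which matches the net stack effect of -1.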


4399-4402: No remaining old DictUpdate usages found

Ran ripgrep across the repo and only discovered a single commented example of the old { size: … } form (in compiler/codegen/src/compile.rs), so there are no active references to remove or replace.

compiler/core/src/bytecode.rs (2)

1402-1402: Stack effect -1 looks correct

Popping only the source mapping while keeping the target dict on the stack matches the described VM behavior. No changes needed.


1591-1592: Disassembly formatting is consistent

Printing DictUpdate(index) matches formatting used by similar instructions (e.g., CopyItem, Swap). Looks good.

vm/src/frame.rs (1)

836-837: LGTM on merge semantics

Using dict.merge_object(source, vm) delegates to dict’s merge logic and preserves mapping semantics.

Comment on lines +670 to +672
DictUpdate {
index: Arg<u32>,
},
Contributor

💡 Verification agent

🧩 Analysis chain

Clarify DictUpdate index semantics and verify call sites

Good move making DictUpdate carry index: Arg<u32>; naming aligns with CopyItem/Swap.

Please add a short doc comment clarifying:

  • What index is relative to (TOS before pop vs. after popping the source mapping).
  • Whether it’s a 0-based offset and any bounds expectations.

Also, since this changes the bytecode argument shape, ensure marshaled bytecode compatibility is addressed (e.g., version bump/invalidation or explicit handling when index == 0 to preserve legacy behavior).

Run this to find any stale patterns still matching/constructing the unit variant form:

I can draft the doc comment and add unit tests asserting stack effect and disassembly output once you confirm the intended “before/after pop” convention.


🏁 Script executed:

#!/usr/bin/env bash
set -euo pipefail

echo "All occurrences of DictUpdate with context (sanity check):"
rg -n --hidden --glob '!target' -A3 -B3 '\bDictUpdate\b'

echo
echo "Likely pattern matches not using struct form (manual review recommended):"
rg -n --hidden --glob '!target' -e '\bDictUpdate\s*=>' -e '\bInstruction::DictUpdate\s*[),]|,$'

echo
echo "Constructors not using struct form (look for missing braces):"
rg -n --hidden --glob '!target' -e '\bInstruction::DictUpdate\b(?!\s*\{)' -U

Length of output: 319276


Add documentation for DictUpdate index semantics & ensure marshaling compatibility

Please update compiler/core/src/bytecode.rs around the DictUpdate variant (lines 667–672) as follows:

  • Insert a doc comment explaining that index is a 0-based offset to the dict on the stack before popping the source mapping. For example:
    /// Pop a source mapping from the top of the stack and update
    /// the dict located `index` slots below it. `index` is a 0-based
    /// offset relative to the top-of-stack *before* the pop.
    DictUpdate {
        index: Arg<u32>,
    },
  • Note in the doc that legacy bytecode (with no index field) will decode with index == 0; if you intend to break compatibility, bump the bytecode version or handle index == 0 explicitly in tests.

Next steps:

  • No stale unit-variant constructors were found (grep output shows only struct-form usage), so no further code changes required there.
  • Add unit tests in the bytecode test module asserting:
    • Instruction::DictUpdate { index } yields stack_effect() == -1
    • fmt_dis() prints DictUpdate { index }
  • Add a marshaling test that serializes an old-format DictUpdate (without arg) and ensures it decodes to DictUpdate { index: 0 }.
🤖 Prompt for AI Agents
In compiler/core/src/bytecode.rs around lines 667 to 672, add a doc comment to
the DictUpdate variant explaining that the index field is a 0-based offset to
the dict on the stack before popping the source mapping, clarifying the
semantics as described. Also note in the doc that legacy bytecode without the
index field decodes with index == 0 and mention the need to handle compatibility
accordingly. Then, add unit tests in the bytecode test module to assert that
Instruction::DictUpdate { index } has a stack_effect() of -1 and that fmt_dis()
outputs the expected string. Finally, add a marshaling test that serializes an
old-format DictUpdate (without the index argument) and verifies it decodes to
DictUpdate { index: 0 }.

Comment on lines 826 to 833
                 if vm
-                    .get_method(other.clone(), vm.ctx.intern_str("keys"))
+                    .get_method(source.clone(), vm.ctx.intern_str("keys"))
                     .is_none()
                 {
                     return Err(vm.new_type_error(format!(
                         "'{}' object is not a mapping",
-                        other.class().name()
+                        source.class().name()
                     )));
Contributor

⚠️ Potential issue

Don't mask attribute lookup errors when checking mapping protocol

vm.get_method(...) returns Option<PyResult<_>>. Because is_none() is false for Some(Err(e)), a failed lookup passes the mapping check and the real exception is silently dropped. Propagate lookup errors instead.

-                if vm
-                    .get_method(source.clone(), vm.ctx.intern_str("keys"))
-                    .is_none()
-                {
-                    return Err(vm.new_type_error(format!(
-                        "'{}' object is not a mapping",
-                        source.class().name()
-                    )));
-                }
+                match vm.get_method(source.clone(), vm.ctx.intern_str("keys")) {
+                    Some(Ok(_)) => {}
+                    Some(Err(e)) => return Err(e),
+                    None => {
+                        return Err(vm.new_type_error(format!(
+                            "'{}' object is not a mapping",
+                            source.class().name()
+                        )));
+                    }
+                }
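
The difference between the two forms shows up with any `Option<Result<_, _>>`. In the sketch below, `lookup` is a hypothetical stand-in for `vm.get_method`, and plain strings replace Python exceptions; only the control flow is meant to carry over.

```rust
/// Hypothetical stand-in for a method lookup:
/// None = attribute absent, Some(Err) = the lookup itself failed,
/// Some(Ok) = attribute found.
fn lookup(kind: &str) -> Option<Result<&'static str, String>> {
    match kind {
        "dict" => Some(Ok("keys")),
        "broken" => Some(Err("RuntimeError: descriptor failed".to_string())),
        _ => None,
    }
}

/// Mirrors the `is_none()` check: only the absent case is reported.
fn check_masking(kind: &str) -> Result<(), String> {
    if lookup(kind).is_none() {
        return Err(format!("TypeError: '{kind}' object is not a mapping"));
    }
    // Some(Err(_)) also reaches this point, so the lookup error is lost.
    Ok(())
}

/// Mirrors the suggested `match`: every case is handled explicitly.
fn check_propagating(kind: &str) -> Result<(), String> {
    match lookup(kind) {
        Some(Ok(_)) => Ok(()),
        Some(Err(e)) => Err(e), // surface the original failure
        None => Err(format!("TypeError: '{kind}' object is not a mapping")),
    }
}
```

For the `"broken"` input, `check_masking` returns `Ok(())` while `check_propagating` returns the original `RuntimeError` text, which is the behavior the suggestion is after.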
🤖 Prompt for AI Agents
In vm/src/frame.rs around lines 826 to 833, the code uses is_none() on the
result of vm.get_method(...), which returns Option<PyResult<_>>. Because
is_none() is false for Some(Err(e)), a failed lookup passes the check and the
error is silently dropped. To fix this, handle the Option and PyResult
separately: return the TypeError only when the Option is None, and when it is
Some, propagate any Err by returning it instead of discarding it.

@youknowone youknowone merged commit a9a9e3b into RustPython:main Aug 8, 2025
12 checks passed
@youknowone youknowone deleted the dict-update branch August 8, 2025 22:41