Skip to content

Match statements #5485

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 27 commits into from
Jan 25, 2025
Merged

Match statements #5485

merged 27 commits into from
Jan 25, 2025

Conversation

arihant2math
Copy link
Collaborator

@arihant2math arihant2math commented Jan 19, 2025

This pr turns the parsed AST of a match-case statement into bytecode for the VM.

Progress:

  • Add NOP instruction
  • Add SWAP instrution
  • compile_match
  • compile_match_inner
  • compile_pattern
  • compile_pattern_value
  • compile_pattern_singleton
  • compile_pattern_sequence (requires instruction MATCH_SEQUENCE instruction)
  • compile_pattern_mapping
  • compile_pattern_class
  • compile_pattern_star
  • compile_pattern_as
  • compile_pattern_or
  • Add optimized support for default case requires compile_pattern_as first (might be done in a separate pr)
  • Add tests

Resolves: #4770

@youknowone
Copy link
Member

awesome, you already got the key parts

@arihant2math
Copy link
Collaborator Author

It almost finishes codegen, the only problem faced right now is this error:

D:/Documents/Programming/GitHub/RustPython/target/debug/rustpython.exe match_test.py
thread 'main' panicked at compiler\codegen\src\compile.rs:3287:9:
assertion `left == right` failed: switching to completed block
  left: BlockIdx(2)
 right: BlockIdx(4294967295)
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869\library/std\src\panicking.rs:665
   1: core::panicking::panic_fmt
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869\library/core\src\panicking.rs:76
   2: core::panicking::assert_failed_inner
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869\library/core\src\panicking.rs:413
   3: core::panicking::assert_failed<rustpython_codegen::ir::BlockIdx,rustpython_codegen::ir::BlockIdx>
             at C:\Users\ariha\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\panicking.rs:373
   4: rustpython_codegen::compile::Compiler::switch_to_block
             at .\compiler\codegen\src\compile.rs:3287
   5: rustpython_codegen::compile::Compiler::compile_match_inner
             at .\compiler\codegen\src\compile.rs:1997
   6: rustpython_codegen::compile::Compiler::compile_match
             at .\compiler\codegen\src\compile.rs:2012
   7: rustpython_codegen::compile::Compiler::compile_statement
             at .\compiler\codegen\src\compile.rs:771
   8: rustpython_codegen::compile::Compiler::compile_statements
             at .\compiler\codegen\src\compile.rs:501
   9: rustpython_codegen::compile::Compiler::compile_program
             at .\compiler\codegen\src\compile.rs:416
  10: core::ops::function::FnOnce::call_once<enum2$<core::result::Result<tuple$<>,rustpython_parser_core::source_code::LocatedError<enum2$<rustpython_codegen::error::CodegenErrorType> > > > (*)(ref_mut$<rustpython_codegen::compile::Compiler>,ref$<slice2$<enum2$<
             at C:\Users\ariha\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\ops\function.rs:250
  11: rustpython_codegen::compile::compile_impl<slice2$<enum2$<rustpython_ast::generic::Stmt<rustpython_parser_core::source_code::SourceRange> > >,enum2$<core::result::Result<rustpython_codegen::symboltable::SymbolTable,rustpython_codegen::symboltable::SymbolTab
             at .\compiler\codegen\src\compile.rs:138
  12: rustpython_codegen::compile::compile_program
             at .\compiler\codegen\src\compile.rs:150
  13: rustpython_codegen::compile::compile_top
             at .\compiler\codegen\src\compile.rs:110
  14: rustpython_compiler::compile
             at .\compiler\src\lib.rs:66
  15: rustpython_vm::vm::VirtualMachine::compile_with_opts
             at .\vm\src\vm\compile.rs:26
  16: rustpython_vm::vm::VirtualMachine::compile
             at .\vm\src\vm\compile.rs:16
  17: rustpython_vm::vm::VirtualMachine::run_code_string
             at .\vm\src\vm\compile.rs:61
  18: rustpython_vm::vm::VirtualMachine::run_script
             at .\vm\src\vm\compile.rs:49
  19: rustpython::run_rustpython
             at .\src\lib.rs:206
  20: rustpython::run::closure$0<rustpython::main::closure_env$0>
             at .\src\lib.rs:99
  21: rustpython_vm::vm::interpreter::impl$0::run::closure$0<rustpython::run::closure_env$0<rustpython::main::closure_env$0> >
             at .\vm\src\vm\interpreter.rs:104
  22: rustpython_vm::vm::interpreter::impl$0::enter::closure$0<rustpython_vm::vm::interpreter::impl$0::run::closure_env$0<rustpython::run::closure_env$0<rustpython::main::closure_env$0> >,enum2$<core::result::Result<tuple$<>,rustpython_vm::object::core::PyRef<ru
             at .\vm\src\vm\interpreter.rs:72
  23: core::panic::unwind_safe::impl$25::call_once<enum2$<core::result::Result<tuple$<>,rustpython_vm::object::core::PyRef<rustpython_vm::exceptions::types::PyBaseException> > >,rustpython_vm::vm::interpreter::impl$0::enter::closure_env$0<rustpython_vm::vm::inte
             at C:\Users\ariha\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\panic\unwind_safe.rs:272
  24: std::panicking::try::do_call<core::panic::unwind_safe::AssertUnwindSafe<rustpython_vm::vm::interpreter::impl$0::enter::closure_env$0<rustpython_vm::vm::interpreter::impl$0::run::closure_env$0<rustpython::run::closure_env$0<rustpython::main::closure_env$0>
             at C:\Users\ariha\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\panicking.rs:557
  25: std::panic::catch_unwind<core::panic::unwind_safe::AssertUnwindSafe<rustpython_vm::vm::interpreter::impl$0::enter::closure_env$0<rustpython_vm::vm::interpreter::impl$0::run::closure_env$0<rustpython::run::closure_env$0<rustpython::main::closure_env$0> >,en
  26: std::panicking::try
             at C:\Users\ariha\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\panicking.rs:520
  27: std::panic::catch_unwind<core::panic::unwind_safe::AssertUnwindSafe<rustpython_vm::vm::interpreter::impl$0::enter::closure_env$0<rustpython_vm::vm::interpreter::impl$0::run::closure_env$0<rustpython::run::closure_env$0<rustpython::main::closure_env$0> >,en
             at C:\Users\ariha\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\panic.rs:358
  28: rustpython_vm::vm::thread::enter_vm::closure$0<enum2$<core::result::Result<tuple$<>,rustpython_vm::object::core::PyRef<rustpython_vm::exceptions::types::PyBaseException> > >,rustpython_vm::vm::interpreter::impl$0::enter::closure_env$0<rustpython_vm::vm::in
             at .\vm\src\vm\thread.rs:31
  29: std::thread::local::LocalKey<core::cell::RefCell<alloc::vec::Vec<core::ptr::non_null::NonNull<rustpython_vm::vm::VirtualMachine>,alloc::alloc::Global> > >::try_with<core::cell::RefCell<alloc::vec::Vec<core::ptr::non_null::NonNull<rustpython_vm::vm::Virtual
             at C:\Users\ariha\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\thread\local.rs:283
  30: std::thread::local::LocalKey<core::cell::RefCell<alloc::vec::Vec<core::ptr::non_null::NonNull<rustpython_vm::vm::VirtualMachine>,alloc::alloc::Global> > >::with<core::cell::RefCell<alloc::vec::Vec<core::ptr::non_null::NonNull<rustpython_vm::vm::VirtualMach
             at C:\Users\ariha\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\thread\local.rs:260
  31: rustpython_vm::vm::thread::enter_vm<enum2$<core::result::Result<tuple$<>,rustpython_vm::object::core::PyRef<rustpython_vm::exceptions::types::PyBaseException> > >,rustpython_vm::vm::interpreter::impl$0::enter::closure_env$0<rustpython_vm::vm::interpreter::
             at .\vm\src\vm\thread.rs:28
  32: rustpython_vm::vm::interpreter::Interpreter::enter<rustpython_vm::vm::interpreter::impl$0::run::closure_env$0<rustpython::run::closure_env$0<rustpython::main::closure_env$0> >,enum2$<core::result::Result<tuple$<>,rustpython_vm::object::core::PyRef<rustpyth
             at .\vm\src\vm\interpreter.rs:72
  33: rustpython_vm::vm::interpreter::Interpreter::run<rustpython::run::closure_env$0<rustpython::main::closure_env$0> >
             at .\vm\src\vm\interpreter.rs:104
  34: rustpython::run<rustpython::main::closure_env$0>
             at .\src\lib.rs:99
  35: rustpython::main
             at .\src\main.rs:2
  36: core::ops::function::FnOnce::call_once<std::process::ExitCode (*)(),tuple$<> >
             at C:\Users\ariha\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\ops\function.rs:250
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

What seems to be happening is the self.switch_to_block(end_block); from the let end_block = self.new_block(); isn't working because of a previous block having been completed. I tried moving instantiation around but it didn't fix it. @youknowone how should I proceed?

@youknowone
Copy link
Member

When I encountered similar issue before, it was related to wrong order of value stack/block stack operations. #5275 was one of them
Could you share match_test.py?

@arihant2math
Copy link
Collaborator Author

arihant2math commented Jan 21, 2025

Currently there is a test in compile.rs called test_match that reproduces the issue (the code is identical).

https://github.com/arihant2math/RustPython/blob/a1586d0089868e87bb8cc887ce5f1f1802341437/compiler/codegen/src/compile.rs#L3728C1-L3742C6

@youknowone
Copy link
Member

Though I don't know well how this blocks are working, switch_to_block in jump_to_fail_pop doesn't look safe.

@youknowone
Copy link
Member

@coolreader18 do you have advices?

@youknowone youknowone marked this pull request as ready for review January 21, 2025 14:52
@youknowone youknowone marked this pull request as draft January 21, 2025 14:53
@youknowone
Copy link
Member

youknowone commented Jan 21, 2025

I merged #5489 for easier debug.
Here is my rough suggestion how to create and switch to blocks:
3720152
As I understand, when any jump might happen, they must be separated as blocks. In match statement, pattern guard can jump to next pattern. Then a pattern need at least one block.

@arihant2math arihant2math force-pushed the match-statements branch 2 times, most recently from a1586d0 to be4df5d Compare January 21, 2025 19:31
@arihant2math
Copy link
Collaborator Author

The cpython implementation also has a hacky way of popping things off the stack (by repeatedly switching to blocks) and I'm not sure that's going to run without panicking anymore

@arihant2math
Copy link
Collaborator Author

arihant2math commented Jan 21, 2025

Seems like codegen almost works, it's currently getting an u32::MAX mixed in somewhere it's not supposed to be however.

Debug printing:

===BLOCK 0===
LoadConst("one"): 0 +1 => 1
StoreLocal(0, v): 1 -1 => 0
LoadNameAny(0, v): 0 +1 => 1
Duplicate: 1 +1 => 2
LoadConst("one"): 2 +1 => 3
CompareOperation(Equal): 3 -1 => 2
Pop: 2 -1 => 1
LoadConst("two"): 1 +1 => 2
StoreLocal(0, v): 2 -1 => 1
Jump(5): 1 +0 => 1
===BLOCK 5===
index out of bounds: the len is 7 but the index is 4294967295
thread 'compile::tests::test_match' panicked at compiler\codegen\src\ir.rs:311:28:
index out of bounds: the len is 7 but the index is 4294967295
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869\library/std\src\panicking.rs:665
   1: core::panicking::panic_fmt
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869\library/core\src\panicking.rs:76
   2: core::panicking::panic_bounds_check
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869\library/core\src\panicking.rs:281
   3: rustpython_codegen::ir::stackdepth_push
             at .\src\ir.rs:311
   4: rustpython_codegen::ir::CodeInfo::max_stackdepth
             at .\src\ir.rs:277
   5: rustpython_codegen::ir::CodeInfo::finalize_code
             at .\src\ir.rs:88
   6: rustpython_codegen::compile::Compiler::pop_code_object
             at .\src\compile.rs:365
   7: rustpython_codegen::compile::tests::compile_exec
             at .\src\compile.rs:3675
   8: rustpython_codegen::compile::tests::test_match
             at .\src\compile.rs:3737
   9: rustpython_codegen::compile::tests::test_match::closure$0
             at .\src\compile.rs:3736
  10: core::ops::function::FnOnce::call_once<rustpython_codegen::compile::tests::test_match::closure_env$0,tuple$<> >
             at C:\Users\ariha\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\ops\function.rs:250
  11: core::ops::function::FnOnce::call_once
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869\library/core\src\ops\function.rs:250
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

@arihant2math
Copy link
Collaborator Author

Seems like all the blocks target u32::MAX as their next block. I modified them to point at the next block as their target and string together that way.

@arihant2math
Copy link
Collaborator Author

Currently the output is as follows:

===BLOCK 0===
LoadConst(0): 0 +1 => 1
LoadConst(None): 1 +1 => 2
ImportName(0, sys): 2 -1 => 1
StoreLocal(0, sys): 1 -1 => 0
LoadNameAny(0, sys): 0 +1 => 1
LoadAttr(1, path): 1 +0 => 1
LoadMethod(2, insert): 1 +2 => 3
LoadConst(0): 3 +1 => 4
LoadConst(""): 4 +1 => 5
CallMethodPositional(2): 5 -4 => 1
Pop: 1 -1 => 0
ReturnConst(None): 0 +0 => 0
DONE: 5
created pattern_blocks: BlockIdx(1) - BlockIdx(5)(end block)
linking: 1 -> 2
linking: 2 -> 3
linking: 3 -> 4
linking: 4 -> 5
linking: 5 -> 7
0
0: 4294967295 28
1: 2 0
2: 3 0
3: 4 0
4: 5 0
5: 7 0
6: 4294967295 0
7: 4294967295 0
switch_to_block BlockIdx(0) -> BlockIdx(7)
block: 1 -> 2
block: 2 -> 3
block: 3 -> 4
block: 4 -> 5
block: 5 -> 7
block: 7 -> 4294967295
===BLOCK 0===
LoadConst("one"): 0 +1 => 1
StoreLocal(0, v): 1 -1 => 0
LoadNameAny(0, v): 0 +1 => 1
Duplicate: 1 +1 => 2
LoadConst("one"): 2 +1 => 3
CompareOperation(Equal): 3 -1 => 2
Pop: 2 -1 => 1
LoadConst("two"): 1 +1 => 2
StoreLocal(0, v): 2 -1 => 1
Jump(5): 1 +0 => 1
===BLOCK 5===
===BLOCK 7===
ReturnConst(None): 1 +0 => 1
DONE: 3
[vm\src\frame.rs:2103:9] self = ExecutingFrame {
    code: code: <code object <module> at ??? file "match_test.py", line 1>,
    state: FrameState {
        stack: [
            [PyObject PyStr { value: "one", kind: Ascii, hash: -1 }],
            [PyObject PyStr { value: "one", kind: Ascii, hash: -1 }],
            [PyObject PyStr { value: "one", kind: Ascii, hash: -1 }],
        ],
        blocks: [],
        lasti: 5,
    },
}
thread 'main' panicked at vm\src\frame.rs:522:22:
tried to push value onto stack but overflowed max_stackdepth
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
error: process didn't exit successfully: `target\debug\rustpython.exe match_test.py` (exit code: 101)

for

# match_test.py
v = "one"
match v:
    case "one":
        v = "two"
    case "two":
        v = "three"
    case "three":
        v = "one"
    case _:
        v = "one"

@youknowone
Copy link
Member

If you'd like to investigate how stackdepth work, trying this issue probably be helpful #5154

@arihant2math
Copy link
Collaborator Author

I think I will try solving this by carefully checking the generated instructions

@arihant2math
Copy link
Collaborator Author

Expression: compile_exec(
    r#"\
match v:
    case "one":
        v = "two"
    case "two":
        v = "three"
    case "three":
        v = "one"
    case _:
        v = "one"
"#,
)
───────────────────────────────────────────────────────────────────────────────
+new results
────────────┬──────────────────────────────────────────────────────────────────
          0 │+  2     >>    0 LoadNameAny          (0, v)
          1 │+              1 Duplicate
          2 │+
          3 │+  3           2 LoadConst            ("one")
          4 │+              3 CompareOperation     (Equal)
          5 │+              4 Pop
          6 │+
          7 │+  4           5 LoadConst            ("two")
          8 │+              6 StoreLocal           (0, v)
          9 │+              7 Jump                 (0)
         10 │+              8 Duplicate
         11 │+
         12 │+  5           9 LoadConst            ("two")
         13 │+             10 CompareOperation     (Equal)
         14 │+             11 Pop
         15 │+
         16 │+  6          12 LoadConst            ("three")
         17 │+             13 StoreLocal           (0, v)
         18 │+             14 Jump                 (0)
         19 │+             15 Duplicate
         20 │+
         21 │+  7          16 LoadConst            ("three")
         22 │+             17 CompareOperation     (Equal)
         23 │+             18 Pop
         24 │+
         25 │+  8          19 LoadConst            ("one")
         26 │+             20 StoreLocal           (0, v)
         27 │+             21 Jump                 (0)
         28 │+             22 Pop
         29 │+
         30 │+ 10          23 LoadConst            ("one")
         31 │+             24 StoreLocal           (0, v)
         32 │+             25 Jump                 (0)
         33 │+             26 ReturnConst          (None)
────────────┴──────────────────────────────────────────────────────────────────

the issue is that everything is jumping to line 0 for no reason (see line 21, line 9, and line 32 of the bytecode)

Signed-off-by: Ashwin Naren <arihant2math@gmail.com>
@arihant2math
Copy link
Collaborator Author

Ok I rewrote some parts of it and it works for basic match statements now.

@youknowone do I have to pop stuff from the stack when I'm done with it?
Currently I'm just doing JumpIfFalse directly after the comparison without popping anything.

Signed-off-by: Ashwin Naren <arihant2math@gmail.com>
@youknowone
Copy link
Member

If there are redundant stack elements for other purpose, yes, I guess so.
But every instruction has its own expectation for stack changes. So when everything goes well and my changes were correct about stack, they just fit easy. where do you find the redundant elements?

Signed-off-by: Ashwin Naren <arihant2math@gmail.com>
Signed-off-by: Ashwin Naren <arihant2math@gmail.com>
Signed-off-by: Ashwin Naren <arihant2math@gmail.com>
@arihant2math arihant2math marked this pull request as ready for review January 23, 2025 08:11
@arihant2math
Copy link
Collaborator Author

Due to the other functions being much more complicated than expected I think we can let this PR be merged for now and I can follow up with more PRs that fix the unimplemented sections.

Signed-off-by: Ashwin Naren <arihant2math@gmail.com>
@arihant2math
Copy link
Collaborator Author

If there are redundant stack elements for other purpose, yes, I guess so. But every instruction has its own expectation for stack changes. So when everything goes well and my changes were correct about stack, they just fit easy. where do you find the redundant elements?

See compile_pattern_inner and compile_pattern_value, the values are not popped at the end.

Signed-off-by: Ashwin Naren <arihant2math@gmail.com>
@youknowone
Copy link
Member

I agree to merge it. It compiles without panic. It is already a big progress

@youknowone youknowone force-pushed the match-statements branch 2 times, most recently from 64f439f to ed55da1 Compare January 25, 2025 13:55
@youknowone youknowone merged commit 396a0ca into RustPython:main Jan 25, 2025
11 checks passed
@youknowone
Copy link
Member

@arihant2math I removed unused nop, swap op in this merge. Please cherry-pick this commit to get it back

youknowone@64f4d7f

@arihant2math arihant2math deleted the match-statements branch February 3, 2025 18:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SyntaxError: match expression is not implemented yet at line 1 column 1
2 participants