[Stack Switching] Add basic support for resume/suspend in the interpreter #7771

kripken · 2025-07-30T20:32:55Z

I realized that we can implement suspend/resume on the current
interpreter with only a small amount of changes, basically using
the same idea as Asyncify: pause and resume structured control
flow by rewinding the stack in a "resume" mode that skips normal
execution. That is, this does not use a program counter (no goto),
nor continuation-passing style, and instead adds interpreter
support to unwind and rewind the stack. This is not the fastest
way to do things, but it is the simplest by far.

The basic idea is the same as in this 2019 blogpost:

https://kripken.github.io/blog/wasm/2019/07/16/asyncify.html
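
To make the unwind/rewind idea concrete, here is a minimal sketch on a toy expression tree. It is not the code in this PR: `Expr`, `Frame`, and `Interpreter` are hypothetical names, and the toy `Suspend` node simply evaluates to whatever value is passed in on resume. It only illustrates how a recursive interpreter can pause by returning up the C++ stack and later re-enter the same nodes in a "resume" mode that skips work already done.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

// Toy IR (hypothetical, not Binaryen IR): constants, binary add, and a
// suspend point that evaluates to the value supplied on resume.
struct Expr {
  enum Kind { Const, Add, Suspend };
  Kind kind = Const;
  int value = 0;                     // used by Const
  std::unique_ptr<Expr> left, right; // used by Add
};
using ExprPtr = std::unique_ptr<Expr>;

ExprPtr makeConst(int v) {
  auto e = std::make_unique<Expr>();
  e->value = v;
  return e;
}
ExprPtr makeSuspend() {
  auto e = std::make_unique<Expr>();
  e->kind = Expr::Suspend;
  return e;
}
ExprPtr makeAdd(ExprPtr l, ExprPtr r) {
  auto e = std::make_unique<Expr>();
  e->kind = Expr::Add;
  e->left = std::move(l);
  e->right = std::move(r);
  return e;
}

// What one Add node stashes while unwinding.
struct Frame {
  const Expr* node; // the Add that was in the middle of executing
  int child;        // which child suspended (0 = left, 1 = right)
  int savedLeft;    // the left operand, meaningful when child == 1
};

struct Interpreter {
  bool suspended = false; // true while unwinding
  bool resuming = false;  // true while rewinding
  int resumeValue = 0;
  std::vector<Frame> frames; // pushed innermost-first during unwind

  int visit(const Expr* e) {
    switch (e->kind) {
      case Expr::Const:
        return e->value;
      case Expr::Suspend:
        if (resuming) {
          // We are back at the suspend point: return to normal execution.
          resuming = false;
          return resumeValue;
        }
        suspended = true;
        return 0;
      case Expr::Add: {
        int child = 0, l = 0;
        if (resuming) {
          // Rewind: restore what this node saved when it unwound.
          Frame f = frames.back();
          frames.pop_back();
          assert(f.node == e);
          child = f.child;
          l = f.savedLeft;
        }
        if (child == 0) {
          l = visit(e->left.get());
          if (suspended) {
            frames.push_back({e, 0, 0});
            return 0;
          }
        }
        int r = visit(e->right.get());
        if (suspended) {
          frames.push_back({e, 1, l});
          return 0;
        }
        return l + r;
      }
    }
    return 0;
  }

  // Returns the result, or nullopt if execution suspended.
  std::optional<int> run(const Expr* root) {
    suspended = false;
    int result = visit(root);
    if (suspended) {
      return std::nullopt;
    }
    return result;
  }
  std::optional<int> resume(const Expr* root, int value) {
    resuming = true;
    resumeValue = value;
    return run(root);
  }
};
```

Running `10 + (suspend + 5)` with this sketch suspends on the first call to `run()`, and `resume(root, 7)` then skips the already-computed `10` and finishes the addition. As in the description above, suspending and resuming walk the whole path from the root, while code that never suspends only pays for a few flag checks.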

This is quite efficient in the things we care about: suspend/
resume is slow, but normal execution hardly regresses, which is
important to keep the Precompute pass from getting slower.
While the more intrusive experiment #7762 made that pass 2x
slower, this only adds 10-15% overhead. The main reason is that
this approach keeps us down to a single indirect call per instruction,
while e.g. separating decoding and execution - to maintain a
normal value stack - ends up doing two indirect calls. The "trick"
in this PR is that, yes, we do need a value stack (that is the only
practical way to stash the values on the stack when suspending),
but we can populate that stack only when inside a coroutine
(when we might suspend). So we still use the normal way of
getting child instruction values - just calling visit(curr->child)
from the parent's visitFoo() method - and do not lose that
speed, but still have a stack of values when we need it.
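
A rough sketch of that "trick", under simplifying assumptions: `Node`, `ToyInterpreter`, and `inCoroutine` are hypothetical names, and every toy node just sums its own value with its children. Child values are still obtained by calling `visit()` recursively, and the side value stack is only touched when we are inside a coroutine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy node (hypothetical): evaluates to its own value plus its children.
struct Node {
  int value;
  std::vector<Node> children;
};

struct ToyInterpreter {
  bool inCoroutine = false;    // assumption: set while running a coroutine
  std::vector<int> valueStack; // only populated when inCoroutine is true

  int visit(const Node& node) {
    std::size_t base = valueStack.size();
    int result = node.value;
    for (const Node& child : node.children) {
      // The normal fast path: get the child's value by a direct call.
      result += visit(child);
    }
    if (inCoroutine) {
      // Our children's values are consumed; replace them with our own, so
      // that a parent suspended later can find the values of the children
      // it has already executed.
      valueStack.resize(base);
      valueStack.push_back(result);
    }
    return result;
  }
};
```

Outside a coroutine the stack stays empty and the only extra cost is the `inCoroutine` check. Inside one, the stack holds the values of already-executed siblings at every point where a suspension could happen.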

(10-15% is still significant, but it is just on a single pass, so it
seems acceptable, and there might be ways to optimize this further.)

Notes:

- Flow now has a "suspendTag" property, which denotes the tag we are
  suspending, when we suspend.
- As part of this change, callFunction in the interpreter returns a Flow,
  so that we can propagate suspensions out of functions.
- For testing, this adds assert_suspension support in wasm-shell.
- For testing, this adds part of cont.wast from the test suite, and minor
  fixes for it in wasm.cpp.
- This only adds basic support: resume_throw and various other parts
  of the stack switching proposal are TODOs, but I think this PR does the
  hard parts.
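
As a rough illustration of the first two notes, here is a simplified stand-in for how a suspension can propagate out of a call. The `Flow` below is not Binaryen's real class: the string-based `suspendTag`, the `suspending()` helper, and the `Function` alias are assumptions made for the sketch.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Simplified stand-in for the interpreter's Flow (hypothetical layout).
struct Flow {
  std::vector<int> values;
  std::string suspendTag; // assumption: empty means "not suspending"
  bool suspending() const { return !suspendTag.empty(); }
};

using Function = std::function<Flow()>;

// Returning a Flow (rather than plain values) lets a suspension escape
// the callee.
Flow callFunction(const Function& func) { return func(); }

// A caller that adds 1 to the callee's result, unless the callee suspended.
Flow callAndIncrement(const Function& callee) {
  Flow flow = callFunction(callee);
  if (flow.suspending()) {
    // Propagate the suspension outwards without doing any more work.
    return flow;
  }
  flow.values[0] += 1;
  return flow;
}
```

A callee that returns normally yields an incremented value, while one that returns a Flow with a tag set passes through the caller untouched, which is what lets an enclosing resume see the suspension.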

kripken · 2025-07-30T21:57:55Z

Out of curiosity I measured the overhead of maintaining the value stack, and it is 25%-30% or so. So coroutines that do not suspend/resume will run at around that speed, slightly slower than normal code, but still pretty reasonable. (Actual suspend and resume operations will be where things are actually slow.)

kripken · 2025-07-30T22:08:51Z

And the 10-15% overhead of this PR seems to come from

- the larger size of Flow (now supporting a suspend tag)
- changes to the main visit() method (even though the branch there is well-predicted if we never run a coroutine, it is a branch I guess), and
- other sources (perhaps the changes to control flow structures).

If we really wanted to reduce this, we could probably templatize the core interpreter on Flow and a new SuspendingFlow, and only modify visit() in the latter, etc., but perhaps that would not be worth the code complexity.
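
A very rough sketch of what that templating could look like, with hypothetical names throughout (`ToyRunner`, `visitConst`, and the exact layout of `SuspendingFlow` are not from the codebase): the extra field and the extra work in the visitor exist only in the instantiation that can suspend.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical small flow type for code that never suspends.
struct Flow {
  int value = 0;
};

// Hypothetical larger flow type that also carries a suspend tag.
struct SuspendingFlow : Flow {
  const char* suspendTag = nullptr;
};

template <typename FlowT> struct ToyRunner {
  static constexpr bool canSuspend = std::is_same_v<FlowT, SuspendingFlow>;

  std::vector<int> valueStack;

  FlowT visitConst(int v) {
    FlowT flow;
    flow.value = v;
    if constexpr (canSuspend) {
      // Only the suspending instantiation pays for the value stack.
      valueStack.push_back(v);
    }
    return flow;
  }
};
```

With this shape, `ToyRunner<Flow>` compiles to the same code as before the feature existed, at the cost of instantiating the interpreter twice.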

src/parser/wast-parser.cpp

src/shell-interface.h

src/tools/execution-results.h

src/wasm-interpreter.h

tlively · 2025-07-31T19:31:17Z

src/wasm/wasm.cpp

+    // This will never be executed and the instruction will not be emitted.
+    // Model this with an uninhabitable cast type.
+    type = cont->type.with(NonNullable);


Resume does not return a continuation, so this is not correct. It should return the result types given by the resumed continuation.

(ditto)

edit: oh, wait, this is something else, sorry

Agreed this is wrong, but perhaps I can leave it to a later followup that gets the upstream spec test passing? All this will be tested at that time fully, I think. (Changes in this PR are just to get some amount of spec tests passing, enough to get to interesting parts.)

It's difficult to tell whether things are intentionally or unintentionally wrong :) Maybe it would be worth getting validation working earlier rather than later? We should be able to validate the spec tests before we can execute them.

Hmm, the thing is that the spec test is not ordered in such a way... which would have been nice.

I am basically just going top-down in that test, to make sure I don't miss anything.

I promise we will pass it all after all followups land! 😄

Coming back to this TODO, I'm not sure how to fix it. Yes, this should return the result of the continuation, but there is no continuation type here - it is null - so what can we do aside from emit an uninhabitable type?

We can take inspiration from CallRef, which has the same problem. When the call target is null, it makes any reference that appears in the existing type uninhabitable, but leaves any non-reference type as-is.

Ok, opened #7794 for that.

src/wasm/wasm.cpp

tlively · 2025-07-31T19:44:20Z

In general I'm not thrilled about the extra complexity here, although I understand that this direction is desirable because of the estimated performance benefits and smaller implementation effort compared to the proposed new interpreter.

I still think the new interpreter is worth working on in the medium to long term because of how much simpler it has the potential to be and because it would scale better to use cases like fuzzing threads, where we really want to be able to execute just one instruction at a time.


kripken · 2025-07-31T20:25:25Z

Interesting about fuzzing threads, yeah, for that we might want a very different interpreter.

Otherwise, for stack switching I actually think this ended up surprisingly simple? It's possible the new interpreter would be simpler in some ways (like not needing restoredValuesMap) but it will be more complex in others (need a way to continue a walk of the IR "from the middle").

What might be significantly simpler than either could be to use C++20 coroutines, which would save not just execution but values on the stack, entirely automatically. But the refactoring to get there would be significant, I think... And also

- I'm not sure how fast it would be
- there are some questions about compiler support.

tlively · 2025-07-31T20:37:21Z

The complexity about being able to pick up execution in the middle of the IR should be entirely encapsulated inside a forward iterator over the IR, though. There are several ways of implementing such an iterator, including using C++20 coroutines as explored in #5447. The new interpreter would also look much, much closer to how the semantics are specified and would have much less boilerplate, so I think it would be much easier to maintain and contribute to. It would also save us from having to update arbitrary recursion limits when stack frames change size.
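
For illustration only, one way such a forward iterator could look without coroutines, on a toy tree. `Node` and `PostOrderIterator` are hypothetical names, not from Binaryen or #5447. All traversal state lives in an explicit stack inside the iterator, so execution can stop after any instruction and pick up again later just by calling `next()`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy IR node (hypothetical): children execute before their parent.
struct Node {
  int id;
  std::vector<Node> children;
};

class PostOrderIterator {
  struct Entry {
    const Node* node;
    std::size_t nextChild; // index of the next child to descend into
  };
  std::vector<Entry> stack;

public:
  explicit PostOrderIterator(const Node& root) { stack.push_back({&root, 0}); }

  // Returns the next node in execution order, or nullptr when done.
  const Node* next() {
    while (!stack.empty()) {
      Entry& top = stack.back();
      if (top.nextChild < top.node->children.size()) {
        const Node* child = &top.node->children[top.nextChild++];
        // push_back may invalidate `top`; it is not used again below.
        stack.push_back({child, 0});
      } else {
        const Node* done = top.node;
        stack.pop_back();
        return done;
      }
    }
    return nullptr;
  }
};
```

An interpreter loop built on this would execute exactly one instruction per `next()` call, and suspending would amount to saving the iterator object alongside the value stack.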

kripken · 2025-07-31T21:06:10Z

> The complexity about being able to pick up execution in the middle of the IR should be entirely encapsulated inside a forward iterator over the IR, though. There are several ways of implementing such an iterator, including using C++20 coroutines as explored in #5447.

Agreed. But if we do not use C++20 coroutines for such an iterator, then I think it might end up being more complicated than this PR (the relevant parts of both, I mean; it would be more encapsulated, though, fair point).

> The new interpreter would also look much, much closer to how the semantics are specified

Hmm, while closer to the wasm spec's semantics, it would be less of a match for Binaryen IR semantics, which are more structured. So I'm not sure which is best in this codebase.

> and would have much less boilerplate, so I think it would be much easier to maintain and contribute to.

Agreed, avoiding the manual visiting logic would be nicer. (That is what I was hoping to achieve in #7762...)

> It would also save us from having to update arbitrary recursion limits when stack frames change size.

Agreed.

tlively · 2025-08-01T01:28:37Z

test/lit/exec/cont_simple.wast

+                          (i32.const 0)
+                          (suspend $more)


Might as well write this completely unnested:

i32.const 0 suspend $more i32.eqz suspend $more i32.eqz ...

tlively · 2025-08-01T01:39:39Z

test/lit/exec/cont_simple.wast

+          (suspend $more)
+          (i32.add
+            (i32.const 4)
+            (block (result i32)


No need for this block.

tlively · 2025-08-01T01:42:08Z

test/lit/exec/cont_simple.wast

+            (block (result i32)
+              (suspend $more)
+              (i32.add
+                (block (result i32)


tlively · 2025-08-01T01:43:58Z

test/lit/exec/cont_simple.wast

+  ;; CHECK-NEXT: [LoggingExternalInterface logging 42]
+  ;; CHECK-NEXT: [LoggingExternalInterface logging 3]
+  ;; CHECK-NEXT: [LoggingExternalInterface logging 300]
+  (func $run-multi-locals (export "run-multi-locals")


Maybe move this up to be located next to the original test for a single local?

kripken · 2025-08-01T14:45:29Z

Tests refactored.

kripken added 30 commits July 25, 2025 13:01

wohoo

4ccb81c

exc

9646f9a

faster

ddee5c4

hope

ed3b506

Merge remote-tracking branch 'origin/main' into resumey

7e6b8b8

field

2cf8b9d

start

3d009f7

work

2aa78ee

work

80a805d

add continutation literal

2ec86fd

work

9d8566d

work

06d9a65

work

62700c5

work

3aa2ee5

work

ff216af

work

2de4b00

typo

346848a

simple debug

2bc653b

format

f3485eb

work

7d68e24

work

0eb3346

work

e0367c6

work

77b1d2d

work

86fc6a3

work

9a9bf7c

work

20a59c0

test

9dd600f

work

b98eaa8

work

e1cf473

work

08ad388

kripken added 9 commits July 30, 2025 11:40

work

37b6c75

safe

9155b9c

work

d14259e

work

c9589d9

work

39164a3

work

2cff4a3

work

b823e9a

work

9460649

format

f1069e8

kripken requested a review from tlively July 30, 2025 20:32

tlively reviewed Jul 31, 2025


kripken and others added 3 commits July 31, 2025 13:10

Update src/wasm-interpreter.h

39edc5c

Co-authored-by: Thomas Lively <tlively123@gmail.com>

feedback: comment

fe9d157

Update src/wasm-interpreter.h

c143604

Co-authored-by: Thomas Lively <tlively123@gmail.com>

todo

8c7680e

kripken added 3 commits July 31, 2025 14:06

Merge remote-tracking branch 'myself/resumey' into resumey

9269042

comment to clarify

021d90b

fully support assert_suspend

6ee86b7

tlively approved these changes Aug 1, 2025


feedback on tests

96f6521

add todos

2d1dbe4

kripken merged commit adabf63 into WebAssembly:main Aug 1, 2025
16 checks passed

kripken deleted the resumey branch August 1, 2025 16:10
