-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
StepFunctions: Stateless evaluation of EvalComponents #12068
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
LocalStack Community integration with Pro 2 files 2 suites 35m 2s ⏱️ Results for commit 2ec13c3. ♻️ This comment has been updated with latest results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The simplification of the heap memory model is amazing and greatly improves maintainability 💯
The changes look good to me. Small suggestion: Adding documentation that motivates/explains the argument abstraction would improve understandability and maintainability.
from localstack.services.stepfunctions.asl.utils.json_path import extract_json | ||
|
||
|
||
class Argument(EvalComponent, abc.ABC): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc: A docstring explaining the motivation behind the argument abstraction would be very helpful here.
env.stack.append(argument) | ||
|
||
|
||
class ArgumentLiteral(Argument): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These Argument* implementations seem identical apart from inheriting the new Argument abstraction.
I think they are easier to understand them grouped here rather than spread across individual files 👍
@@ -87,18 +87,11 @@ def eval(self, env: Environment) -> None: | |||
def _eval_body(self, env: Environment) -> None: | |||
try: | |||
while env.is_running(): | |||
# Store the heap values at this depth for garbage collection. | |||
heap_values = set(env.heap.keys()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a very nice simplification ✨ 👏
Do I understand correctly that we now have just one global heap instead of heap_values
that only applied at a certain level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prior to these changes, the heap was shared globally, with scopes enforced through UUID prefixes attached to value identifiers. This approach worked because we never encountered a use case where accessing values from an outer scope was necessary. However, it led to a significant issue: UUID prefixes were assigned to each EvalComponent
on demand but needed to remain fixed to allow consistent retrieval during iterative scenarios, such as retry counters in retry blocks.
This setup was problematic since the same EvalComponent
could execute on multiple threads (e.g., in distributed Map or Parallel operations). As a result, separate threads could inadvertently read and write each other’s data.
Given that outer scope access wasn’t required, the changes instead introduce a new heap reference for each scope. In the implementation, we extend the concept of scope (related to memory access) to that of a frame, which also includes other SFN-related data such as execution events, logging, etc. Hence the updates made to as_inner_frame_of
.
@@ -467,7 +468,7 @@ def test_start_execution_sync_delegate_timeout( | |||
lambda_creation_response = create_lambda_function( | |||
func_name=function_name, | |||
handler_file=TT.LAMBDA_WAIT_60_SECONDS, | |||
runtime="python3.9", | |||
runtime=Runtime.python3_12, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for unifying these, potentially saving us some image-pulling time 🚄
Motivation
In the StepFunctions v2 interpreter, Component and EvalComponent objects must implement a stateless evaluation of their logic. This model is in place to ensure that components are not only reusable, but also usable in parallel. Such feature is useful especially in Map and Parallel states which apply the same logic multiple times and concurrently.
By closing the following issue, on one hand we did ensure nested distributed map states could be evaluated, but on the other hand we are now in a position to expose all implementation error related to the stateless fashion of evaluation components. This may lead to users experiencing inconsistencies in the evaluation of intrinsic functions or the retry logic within MapRun or parallel states.
These changes further ensure that the evaluation of all EvalComponents are stateless: heap scope strategies, caching of evaluation values in the component object, and shared member values; are moved to the scope of each eval call.
Changes
MaxAttempts
andBackoffRate
)