Skip to content

StepFunctions: Support for Escape Sequences in JSONata String Literals #12072

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jan 21, 2025

Conversation

MEPalma
Copy link
Contributor

@MEPalma MEPalma commented Dec 24, 2024

Motivation

The SFN v2 interpreter currently preserves escape sequences (like \n, \t) as raw character sequences in string literals instead of converting them to actual characters, which leads to errors such as in #12061. When these strings are passed to the Java library through Python's string representation, the escape sequences get double-escaped ("Hello\nWorld" -> "Hello\\nWorld"). This affects JSONata expression evaluation where escaped characters are expected to be properly interpreted.

Changes

  • the preprocessing of JSONata string literals now properly interprets these escape sequences into their actual character representations
  • related refectoring to string preprocessing
  • added relevant snapshot test

@MEPalma MEPalma added this to the 4.1 milestone Dec 24, 2024
@MEPalma MEPalma self-assigned this Dec 24, 2024
@MEPalma MEPalma added the semver: minor Non-breaking changes which can be included in minor releases, but not in patch releases label Dec 24, 2024
Copy link

github-actions bot commented Dec 24, 2024

LocalStack Community integration with Pro

    2 files      2 suites   33m 29s ⏱️
1 390 tests 1 319 ✅ 71 💤 0 ❌
1 392 runs  1 319 ✅ 73 💤 0 ❌

Results for commit 4698d8b.

♻️ This comment has been updated with latest results.

Copy link
Contributor

@gregfurman gregfurman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I found an edge case where setting the value of the string literal to a double-quotation mark " is not handled properly. Unsure of how many users will encounetr this so happy to get this in and fix in a follow-up.

def from_string_literal(parser_rule_context: ParserRuleContext) -> Optional[str]:
string_literal = parser_rule_context.getText()
if string_literal.startswith('"') and string_literal.endswith('"'):
string_literal = string_literal[1:-1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably an edge-case but I got a failure when trying to compare the single quote string '"' where the resulting value was not escaped correctly.

Snippet of step function (FYI I snapshot tested this case against AWS):

...
    "Pass": {
      "Type": "Pass",
      "Output": {
        "str_value": "\"",
      },
      "Next": "Choice"
    },
    "Choice": {
      "Type": "Choice",
      "Choices": [
        {
          "Next": "Success",
          "Condition": "{% $states.input.str_value = '\"' %}"
        }
      ],
      "Default": "Fail"
    },
...

Result from test:

	�[32m(+)�[0m /events/[6]/executionFailedEventDetails ( {} )
	�[33m(~)�[0m /events/[3]/stateEnteredEventDetails/input/str_value '"' → '\\"' ... (expected → actual)
	�[33m(~)�[0m /events/[5]/stateEnteredEventDetails/input/str_value '"' → '\\"' ... (expected → actual)
	�[33m(~)�[0m /events/[5]/stateEnteredEventDetails/name 'Success' → 'Fail' ... (expected → actual)
	�[33m(~)�[0m /events/[2]/stateExitedEventDetails/output/str_value '"' → '\\"' ... (expected → actual)
	�[33m(~)�[0m /events/[4]/stateExitedEventDetails/output/str_value '"' → '\\"' ... (expected → actual)
	�[31m(-)�[0m /events/[6]/stateExitedEventDetails ( {'name': 'Success', 'output': {'str_value': '"'}, 'outputDetails': {'truncated': False}} )
	�[33m(~)�[0m /events/[5]/type 'SucceedStateEntered' → 'FailStateEntered' ... (expected → actual)
	�[33m(~)�[0m /events/[6]/type 'SucceedStateExited' → 'ExecutionFailed' ... (expected → actual)
	�[31m(-)�[0m /events/[7] ( {'executionSucceededEventDetails': {'output': {'str_value': '"'}, 'outputDetails': {'truncated': False}}, 'id': 8, 'previousEventId': 7, 'timestamp': 'timestamp', 'type': 'ExecutionSucceeded'} )
	�[32m(+)�[0m /events/[7] ( not present )

Perhaps an issue with ast.literal_eval(f'"{string_literal}"')?

Copy link
Contributor Author

@MEPalma MEPalma Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @gregfurman!

Indeed, this PR limits the fix to JSONata expressions, however similar parsing issues may still occur for other string literals. In your example the issue is actually the parsing of Output.str_value.

We should definately address this, but in future changes, as updating the parsing logic to all string literals will have many repercussions on other areas of the interpreter, such as intrinsic functions (which are parsed separately).

@MEPalma MEPalma merged commit 2a9fdec into master Jan 21, 2025
31 checks passed
@MEPalma MEPalma deleted the MEP-sfn-fix_jsonata_escape_seq branch January 21, 2025 15:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver: minor Non-breaking changes which can be included in minor releases, but not in patch releases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants