Skip to content

Conversation

alexrashed
Copy link
Member

This PR slightly adjusts the special message encoding for SQS which has been introduced with #5298.

With #5298, an SQS specific serializer has been introduced which performs a special escaping logic for the node text content in the XML responses. This is only done by SQS, and since it's a very rare practice, it's not supported by any popular XML serialization libraries.

#8002 raised the issue that this behavior causes issues when these escaping sequences are used in the sent message itself.
A message """" sent would be received as """".

This was due to the nature of the initial implementation:

  • These special characters have been escaped before the ElementTree serialization was performed: """" ==> """"
  • The escape sequence itself contains characters which are escaped in XML (specifically &): " ==> """"
  • A post-serialization step removed the double-escaping: """" ==> """"
  • The client performs the unescape: """" ==> """"

With this PR, the special escaping is done after the XML node is serialized:

  • The special characters in the text are "marked" (which is just another form of internal escaping): """" ==> __marker__"__marker__""__marker__"__marker__
    • The "marking" is necessary to identify exactly those special characters within the completely serialized message, without taking any other occurrences of these special characters into account.
  • The message is serialized by the XML serializer: __marker__"__marker__""__marker__"__marker__ ==> __marker__"__marker__""__marker__"__marker__
  • A post-serialization step replaces the marked special characters with their escape sequence: __marker__"__marker__""__marker__"__marker__ ==> """"
  • The client performs the unescape: """" ==> """"

This approach now would not work if the "marked" / internally escaped representation of the character is used in the raw message. f.e. a message "__marker__"__marker" would be received as """.
However, the only approach I can think of right now to properly implement this would be to avoid the specific pre- and post-processing and this could only be done by fixing the serialization itself (i.e. implementing our own XML serialization).
I'm happy for any feedback or ideas!

Fixes #8002.

Copy link
Member

@thrau thrau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is both practical and ridiculous. love it 💯

In seriousness though: I think this is the solution we had once discussed as a potential workaround and it seems fine for now. I also don't see a way around something like this right now without putting significant effort into our own serialization approach.

so LGTM 👍

my only comment would be to perhaps add a few more tests with different variants of strings so we're guarded against regressions, and that can maybe also document limitations. but not necessary in this PR imo 👍

@github-actions
Copy link

LocalStack Community integration with Pro

1 911 tests   1 706 ✔️  1h 12m 8s ⏱️
       2 suites     205 💤
       2 files           0

Results for commit edb508e.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: " when sending replaced by quotes when receiving
2 participants