Skip to content

PhpSerializer does not comply with Amazon SQS requirements for message body #49844

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mrblur opened this issue Mar 28, 2023 · 12 comments
Closed

Comments

@mrblur
Copy link

mrblur commented Mar 28, 2023

Symfony version(s) affected

5.4

Description

It is possible to encode a message payload that contains characters forbidden by Amazon SQS. PhpSerializer used by default checks for binary streams and encodes them using base64, but it fails on detecting all characters forbidden by Amazon.

Code: InvalidMessageContents
Message: Invalid binary character '#xB' was found in the message body, the set of allowed characters is #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF].
Type: Sender

How to reproduce

Dispatch a message encoding a string containing illegal control character, like "vertical tab":

// Assuming we have default messenger transport configured for Amazon SQS

class MyMessage
{
    public function __construct(public readonly string $payload) {}
}

// Example using dependency injection
public function __invoke(MessageBusInterface $messageBus) {
  $message = new MyMessage("This is my very random string that happens to contain \x1B illegal character.");
  $messageBus->dispatch($message); // HTTP 400 here
}

Possible Solution

The test done by PhpSerializer seems insuficient:

public function encode(Envelope $envelope): array
    {
        $envelope = $envelope->withoutStampsOfType(NonSendableStampInterface::class);

        $body = addslashes(serialize($envelope));

        if (!preg_match('//u', $body)) {
            $body = base64_encode($body);
        }

        return [
            'body' => $body,
        ];
    }

I would proposed to replace that if statement with:

// (...)
if (preg_match('/[^\x20-\x{D7FF}\xA\xD\x9\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u', $body)) {

We may also make that test configurable, or just implement own serializer for amazon-sqs-messenger using base64 encoding.

Additional Context

https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_SendMessage.html

@carsonbot
Copy link

Hey, thanks for your report!
There has not been a lot of activity here for a while. Is this bug still relevant? Have you managed to find a workaround?

@carsonbot
Copy link

Could I get an answer? If I do not hear anything I will assume this issue is resolved or abandoned. Please get back to me <3

@mrblur
Copy link
Author

mrblur commented Oct 16, 2023

This is still relevant and there were no proposed path as which way should we go.

@carsonbot carsonbot removed the Stalled label Oct 16, 2023
@ghost
Copy link

ghost commented Oct 16, 2023

You can write your own serializer that will fill Amazon requirements

@mrblur
Copy link
Author

mrblur commented Oct 17, 2023

Haha, I know how to solve it for myself. But that does not solve the bug inside symfony's SQS implementation, does it?

Current implementation is broken as it produces messages that are rejected by SQS. Is it edge case? Given the interest, sur it is. But I still believe this should be fixed.

@ghost
Copy link

ghost commented Oct 17, 2023

But that does not solve the bug inside symfony's SQS implementation

Where do you see bug inside "symfony's SQS implementation"? Package symfony/amazon-sqs-messenger uses SerializerInterface TH. You can inject any implementation you want.

@croensch
Copy link

I would not replace the unicode filter in PhpSerializer as it simply produces unicode or base64 and any transport okay with that will work.
The problem is that the SQS transport / AWS SDK is using XML so it can't work with just unicode. The forbidden letters are defined by the XML standard.
I'd rather add a PhpForXmlSerializer extends PhpSerializer that wraps the encode function with the aforementioned other regex. Then this can be injected into SQS transport in case the default serializer already is PhpSerializer.
I just don't know if this kind of logic is allowed in DI?

@mrblur
Copy link
Author

mrblur commented Oct 18, 2023

@croensch I am pretty sure, there is a default serializer instantiated from inside the sqs transport constructor, so it is pretty basic to replace PhpSerializer as default on this specific implementation.

Thank you for pointing out the XML, it didn't occur to me as XML-related limitation, it makes a lot more sense now. Considering speed, maybe there is already some kind of validator that'd detect invalid characters without regexping the whole message.

@croensch
Copy link

@mrblur yes the sqs transport constructor has a nullable / auto serializer dependency. But in the factory it's not nullable so i guess it actually is injected by the bundle if you use the convenient yaml notation?

@carsonbot
Copy link

Hey, thanks for your report!
There has not been a lot of activity here for a while. Is this bug still relevant? Have you managed to find a workaround?

@jdelaune
Copy link
Contributor

We have run into this issue this week. Are we basically saying it won't be fixed in amazon-sqs-messenger and we have to inject a different serialiser?

@carsonbot carsonbot removed the Stalled label Apr 25, 2024
@VincentLanglet
Copy link
Contributor

VincentLanglet commented May 7, 2024

The implementation of the Serializer is pretty easy https://github.com/symfony/symfony/compare/5.4...VincentLanglet:symfony:amazonSerializer?expand=1

But the issue is the fact the defaultSerializer is injected by this config

->arrayNode('serializer')
->addDefaultsIfNotSet()
->children()
->scalarNode('default_serializer')
->defaultValue('messenger.transport.native_php_serializer')
->info('Service id to use as the default serializer for the transports.')
->end()

$container->setAlias('messenger.default_serializer', $config['serializer']['default_serializer']);

$serializerId = $transport['serializer'] ?? 'messenger.default_serializer';
$transportDefinition = (new Definition(TransportInterface::class))
->setFactory([new Reference('messenger.transport_factory'), 'createTransport'])
->setArguments([$transport['dsn'], $transport['options'] + ['transport_name' => $name], new Reference($serializerId)])
->addTag('messenger.receiver', [
'alias' => $name,
'is_failure_transport' => \in_array($name, $failureTransports, true),
])

And I'm not sure what would be the right way to avoid a BC break.

Any recommendation @nicolas-grekas for such dynamic dependency injection ?
Or could/should it be with decoration by the amazon bridge ?

Comme so far the only idea I have would be to add

if ($serializer::class === PHPSerializer::class) {
     $serializer = new AmazonSqsSerializer();
}

which doesn't looks idea the best code...

Edit: I think I found a solution

fabpot added a commit that referenced this issue Jun 9, 2024
…e body (VincentLanglet)

This PR was squashed before being merged into the 5.4 branch.

Discussion
----------

[Messenger] Comply with Amazon SQS requirements for message body

| Q             | A
| ------------- | ---
| Branch?       | 5.4
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| Issues        | Fix #49844
| License       | MIT

Commits
-------

0d441bf [Messenger] Comply with Amazon SQS requirements for message body
@fabpot fabpot closed this as completed Jun 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants