Skip to content

[Messenger] Show exceptions after multiple retries #32341

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 23, 2019
Merged

[Messenger] Show exceptions after multiple retries #32341

merged 1 commit into from
Oct 23, 2019

Conversation

TimoBakx
Copy link
Member

@TimoBakx TimoBakx commented Jul 3, 2019

Q A
Branch? 4.3
Bug fix? yes
New feature? no
BC breaks? no
Deprecations? no
Tests pass? yes
Fixed tickets #32311
License MIT
Doc PR n/a

After retrying a failed message, the RedeliveryStamp looses it's exception information. This PR will remedy that.

@OskarStark
Copy link
Contributor

OskarStark commented Jul 3, 2019

Could you please ensure this behavior by a test?

@nicolas-grekas nicolas-grekas added this to the 4.3 milestone Jul 3, 2019
@TimoBakx
Copy link
Member Author

TimoBakx commented Jul 3, 2019

Ofcourse. Is the direction of this fix the right one? As in, am I not doing something I shouldn't? ;-)
If so, I'll make sure to add the tests tomorrow.

@weaverryan
Copy link
Member

Thanks for fixing this @TimoBakx! I'd like to discuss an alternative. As you know, the exception info isn't fully lost, it's just that the "top" RedeliveryStamp doesn't contain it... and then the "failed" console commands only look at the top RedeliveryStamp for that exception info.

I'd like to discuss two possible alternative approaches:

  1. Improve the code in the console command - i.e. loop over all RedeliveryStamps and find the first that has some exception details. From a high level, the command would be displaying the "most recent exception info"... which may or may not be from the top-level stamp.

  2. Stop putting the exception info on the RedeliveryStamp and instead put it on some new stamp - e.g. FailedMessageErrorDetailsStamp. Then, naturally, the exception info wouldn't get "covered" up when it's redelivered later.

Let me know what you think!

@TimoBakx
Copy link
Member Author

TimoBakx commented Jul 8, 2019

Hi @weaverryan, thanks for the feedback. I have to agree that the current "solution" is a bit icky. I've been thinking about it for the past few days.
I really like the idea of the FailedMessageErrorDetailsStamp. But I think we should add one every time an exception occurs during handling and not only after we run out of retries. That way, you also have a better view of what went wrong the previous times (as the Exception might be a different one). What do you think?

@weaverryan
Copy link
Member

I really like the idea of the FailedMessageErrorDetailsStamp. But I think we should add one every time an exception occurs during handling and not only after we run out of retries. That way, you also have a better view of what went wrong the previous times (as the Exception might be a different one). What do you think?

I was also wondering about the same thing.... it seems a bit arbitrary to store the exception only on the one failure that triggered it to be sent to the failure transport. My concern would be message size. Ping @Tobion - do you have any concerns about attaching exception information each time a message fails? The exception message & class are small... but the stack trace is big.

@TimoBakx
Copy link
Member Author

TimoBakx commented Jul 9, 2019

Oh, indeed. That could become quite big quite fast. Would storing the Handler FQCN, the Exception FQCN and the Exception message be enough?
Logging would generally give you the same information, and if you want to start debugging, there are better ways than going through plain-text tracelogs from stamps.

@TimoBakx TimoBakx changed the title [Messenger] Keep exceptions on retry [WIP] [Messenger] Keep exceptions on retry Jul 9, 2019
@weaverryan
Copy link
Member

Would storing the Handler FQCN, the Exception FQCN and the Exception message be enough?

Yes and no. I like seeing the full stracktrace... at least for one of the failures. I'm not sure if it's really "natural" to store the full stacktrace just for the first... and then not for any of the others... but this is also an edge-case (failing, in general is an edge case.

To get this PR to review stage, let's do this new FailedMessageErrorDetailsStamp where we store those details on every failure. And let's include the same stuff we have now (including the full trace) EVERY time. Then we can get other eyes on this and see if size is really a problem or if we're making a fuss out of nothing.

Thanks!

@weaverryan
Copy link
Member

And, btw, this would go into 4.4. However, we could additionally do solution 1 (#32341 (comment)) in 4.3, as it would be a very simple, and sensible patch.

@TimoBakx
Copy link
Member Author

Alright. Thanks for your feedback and insight. I will implement solution 1 in this PR and then create a new PR for solution 2.

@iosifch
Copy link

iosifch commented Jul 21, 2019

I think that the size of the message becomes an issue if the number of retries is too big, but this should be a concern of the developer. Maybe a warning in the documentation will help the developers to know what to expect if the number of retries is too high.

@Tobion
Copy link
Contributor

Tobion commented Jul 26, 2019

I thought the failed:retry discards messages when they fail again as discussed in #32559 (comment). So I was wondering about this PR/issue as it didn't make sense to me. Turns out they are retried but using the retry strategy of the failed transport (which is enabled by default).

Here is the current workflow:

messenger:consume async -> "async" transport message received by worker -> exception in handler -> retry strategy from the "async" transport

  • retry: ack and send the message again to the "async" transport -> received after delay again
  • no retry: SendFailedMessageToFailureTransportListener -> add SentToFailureTransportStamp and RedeliveryStamp and redispatch to send to the "failed" transport

messenger:failed:retry -> get message -> exception in handler -> retry strategy from the "failed" transport

  • retry: ack and send the message again to the "failed" transport with delay which also creates a new id -> messenger:failed:retry command stops because it only receives one message
  • no retry: message gets rejected

@TimoBakx TimoBakx changed the title [WIP] [Messenger] Keep exceptions on retry [Messenger] Show exceptions after multiple retries Aug 3, 2019
@TimoBakx
Copy link
Member Author

TimoBakx commented Aug 3, 2019

I updated this PR to reflect part one of the solution (#32341 (comment))
Part two of the solution is available here: #32904.

@Tobion
Copy link
Contributor

Tobion commented Sep 17, 2019

Seems like a different solution was merged: #33600

@Tobion
Copy link
Contributor

Tobion commented Oct 23, 2019

@Tobion - do you have any concerns about attaching exception information each time a message fails?

@weaverryan yes that is problematic as I have found out: #34082

@TimoBakx
Copy link
Member Author

Would it be better to only store the exception that last occurred? Or perhaps only the trace of the last exception (and keep only the type & message of the previous ones)?

@Tobion
Copy link
Contributor

Tobion commented Oct 23, 2019

I would suggest we only add the new exception info when the exception message and class is different from the previous one. So when you have the same error, it does not re-add the same information over and over. And we also need to limit it to something like the last three different exceptions and drop older error details.
I guess this is something to consider for #32904

@TimoBakx
Copy link
Member Author

Thanks for the feedback. I added two todo-items in #32904.

Copy link
Member

@weaverryan weaverryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one change!

Tobion added a commit that referenced this pull request Oct 23, 2019
…e is dropped (Tobion)

This PR was merged into the 4.3 branch.

Discussion
----------

Revert "[Messenger] Fix exception message of failed message is dropped

| Q             | A
| ------------- | ---
| Branch?       | 4.3
| Bug fix?      | yes
| New feature?  | no <!-- please update src/**/CHANGELOG.md files -->
| Deprecations? | no <!-- please update UPGRADE-*.md and src/**/CHANGELOG.md files -->
| Tickets       |
| License       | MIT
| Doc PR        |

This reverts #33600 because it makes the message grow for each retry until AMQP cannot handle it anymore. On each retry, the full exception trace is added to the message. So in our case on the 5th retry, the message is too big for the AMQP library to encode it. AMQP extension then throws the exception

> Library error: table too large for buffer

(ref. alanxz/rabbitmq-c#224 and php-amqp/php-amqp#131) when trying to publish the message.

To solve this, I suggest to revert #33600 (this PR) and merge #32341 instead which does not re-add the exception on each failure.

Btw, the above problem causes other problematic side-effects of Symfony messenger. As the new retry message fails to be published with an exception, the old (currently processed message) also does not get removed (acknowledged) from the delay queue. So rabbitmq redelivers the message and the same thing happens forever. This can block the consumers and have a huge toll on your service. That's just another case for #32055 (comment). I'll try to fix this in another PR.

Commits
-------

3dbe924 Revert "[Messenger] Fix exception message of failed message is dropped on retry"
@Tobion
Copy link
Contributor

Tobion commented Oct 23, 2019

Thank you @TimoBakx.

Tobion added a commit that referenced this pull request Oct 23, 2019
This PR was merged into the 4.3 branch.

Discussion
----------

[Messenger] Show exceptions after multiple retries

| Q             | A
| ------------- | ---
| Branch?       | 4.3
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #32311
| License       | MIT
| Doc PR        | n/a

After retrying a failed message, the `RedeliveryStamp` looses it's exception information. This PR will remedy that.

Commits
-------

598bd92 [Messenger] Show exceptions on after empty fails
@Tobion Tobion merged commit 598bd92 into symfony:4.3 Oct 23, 2019
@TimoBakx
Copy link
Member Author

Thank you @Tobion and @weaverryan for your feedback.

I'll continue work on #32904 as soon as possible.

@fabpot fabpot mentioned this pull request Nov 1, 2019
@TimoBakx TimoBakx deleted the messenger-keep-exceptions-on-retry branch February 9, 2020 15:34
fabpot added a commit that referenced this pull request Oct 2, 2020
This PR was merged into the 5.2-dev branch.

Discussion
----------

[Messenger] Added ErrorDetailsStamp

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | yes
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #32311
| License       | MIT
| Doc PR        | No doc changes are needed

#SymfonyHackday

This PR is part of the work started in #32341. That PR has a workaround for showing exceptions added to a previous retry. This PR stores error messages in a separate stamp, so they're more easily accessed.

I also added the exceptionClass as a separate string (independant of FlattenException), so that information is always available, even if the trace is not (due to FlattenException not being available).

Duplicated exceptions (compared to the last one) are not stored separately.

**Questions:**
- Should we limit the total amount of exceptions (remove the oldest when adding a new one)?
  - Yes, but in a new PR
- The current implementation adds this stamp in the Worker instead of the listeners to prevent duplicate code (due to the immutability of the envelope in the event). Is this the proper way to do this?
  - No, create a new listener and a way to add stamps to the envelope inside the event.
- When adding details of a `HandlerFailedException`, should we add a stamp per wrapped `Throwable`? There can be multiple errors wrapped by a single `HandlerFailedException`.
  - Yes, but in a later PR

**Todo:**
- [x] only add new information if it differs from the previous exception
- [x] add deprecations
- [x] fall back to old stamp data if no new stamp is available
- [x] rebase and retarget to master
- [x] update CHANGELOG.md
- [x] check for docs PR

**BC Breaks:**
When this PR is merged, RedeliveryStamps will no longer be populated with exception data. Any implementations that use `RedeliveryStamp::getExceptionMessage()` or `RedeliveryStamp::getFlattenedException()` will receive an empty string or `null` respectively for stamps added after this update. They should rely on `ErrorDetailsStamp` instead.

**New stuff:**
- New stamp `ErrorDetailsStamp`.
- New event subscriber `AddErrorDetailsStampListener`.
- New method `AbstractWorkerMessageEvent::addStamps`.

Commits
-------

cd27b86 [Messenger] Added FailedMessageErrorDetailsStamp
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants