Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/03/04 15:48:42 UTC

[GitHub] [pulsar] poulhenriksen commented on issue #6451: Doing negative ack after ack timeout breaks redelivery in strange ways

poulhenriksen commented on issue #6451: Doing negative ack after ack timeout breaks redelivery in strange ways
URL: https://github.com/apache/pulsar/issues/6451#issuecomment-594616979

Yes, as a workaround, I could stop using ack timeout, but I would then have to implement something similar myself to ensure that processing can time out.
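
To make the workaround concrete, here is a minimal sketch of a do-it-yourself processing timeout using only the JDK. `TimedProcessor` and its method names are hypothetical helpers made up for illustration; nothing here is part of the Pulsar client, and it is only one possible way to bound processing time.

```java
import java.util.concurrent.*;

/** Hypothetical helper (not part of the Pulsar client): runs a task with a deadline. */
final class TimedProcessor {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Returns true if the task finished in time, false if it timed out, failed, or was interrupted. */
    boolean process(Runnable task, long timeout, TimeUnit unit) {
        Future<?> future = worker.submit(task);
        try {
            future.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so the next task is not blocked
            return false;
        } catch (ExecutionException e) {
            return false; // the task itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return false;
        }
    }

    void shutdown() {
        worker.shutdownNow();
    }
}
```

With ack timeout disabled, the receive loop would then call `consumer.acknowledge(msg)` when `process` returns true and `consumer.negativeAcknowledge(msg)` otherwise, so that only one redelivery mechanism is in play.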

Is this hard to fix? Without having looked at the implementation, it seems like the fix would be for the negative ack not to remove the message from the unacked message tracker if processing of that message has already timed out.
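
To illustrate the idea, here is a rough sketch of the guard I have in mind. This is a simplified, hypothetical model written without looking at the client code: `SketchUnackedTracker`, its methods, and the use of plain strings as message IDs are all assumptions for illustration, not Pulsar's actual tracker.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical, simplified model of an unacked-message tracker; NOT Pulsar's implementation. */
final class SketchUnackedTracker {
    private final Set<String> pending = new HashSet<>();  // delivered, not yet acked
    private final Set<String> timedOut = new HashSet<>(); // ack timeout already fired

    /** Called on every (re)delivery of a message. */
    void add(String messageId) {
        pending.add(messageId);
        timedOut.remove(messageId); // a fresh delivery starts a fresh timeout
    }

    /** Called when the ack timeout fires; redelivery is assumed to be requested here. */
    void onAckTimeout(String messageId) {
        if (pending.contains(messageId)) {
            timedOut.add(messageId);
        }
    }

    /** Returns true if the negative ack was accepted and should trigger a redelivery. */
    boolean negativeAck(String messageId) {
        if (timedOut.contains(messageId)) {
            return false; // proposed guard: timeout already handled it, leave the tracker untouched
        }
        return pending.remove(messageId);
    }

    /** Called on a normal acknowledgement. */
    void ack(String messageId) {
        pending.remove(messageId);
        timedOut.remove(messageId);
    }

    boolean isTracked(String messageId) {
        return pending.contains(messageId);
    }
}
```

The point is only that a late negative ack becomes a no-op once the timeout path has taken over the message, so the two mechanisms never both act on the same delivery.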

I also did a bit of additional testing to confirm the actual consequences, and the behavior differs depending on whether partitioned or non-partitioned topics are used.

If using partitioned topics (with just 1 partition), the consumer gets completely stuck: it processes no further messages, and the message is not put on the DLQ. That seems like a serious bug. Furthermore, I can consistently reproduce this in a test case, but in a real setup it is likely to happen only rarely, which makes it way harder to track down the source of the issue.

If using non-partitioned topics, the redelivery count is just wrong, but the message still correctly ends up on the DLQ, so that is less serious.
