Posted to dev@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/11/14 08:34:00 UTC

[GitHub] [pulsar] shubham-Shole4ever created a discussion: Support for long running message consumer

GitHub user shubham-Shole4ever created a discussion: Support for long running message consumer

**Is your feature request related to a problem? Please describe.**
The ackTimeout is set at the consumer level and applies to every message that consumer handles. We have a case where processing a message takes an unpredictable amount of time, ranging from 10 minutes to a couple of hours. We also don't want to set the ackTimeout to the maximum possible processing time (which could be half a day or more).
Can we have a feature where the consumer sends a signal to the broker indicating that it has not failed and is still working on the received message, so that the broker extends the ackTimeout for that message?

**Describe the solution you'd like**
A feature that allows the consumer to notify the broker that it is still working on a received message. On receiving this signal, the broker can extend the ackTimeout for that particular message (probably by refreshing it).

**Describe alternatives you've considered**
Currently, there is no way to modify the ackTimeout for a particular message. The ackTimeout is set at the consumer level and cannot be modified for any message.


GitHub link: https://github.com/apache/pulsar/discussions/18456

----
This is an automatically sent email for dev@pulsar.apache.org.
To unsubscribe, please send an email to: dev-unsubscribe@pulsar.apache.org


[GitHub] [pulsar] merlimat added a comment to the discussion: Support for long running message consumer

Posted by GitBox <gi...@apache.org>.
GitHub user merlimat added a comment to the discussion: Support for long running message consumer

@shubham-Shole4ever When a consumer crashes or its TCP connection is broken, the messages that were delivered to that consumer and not acked will be replayed to another available consumer (in the case of shared subscriptions) or to the same consumer the next time it reconnects.

You don't need ack timeout for that.

GitHub link: https://github.com/apache/pulsar/discussions/18456#discussioncomment-4133179



[GitHub] [pulsar] shubham-Shole4ever added a comment to the discussion: Support for long running message consumer

Posted by GitBox <gi...@apache.org>.
GitHub user shubham-Shole4ever added a comment to the discussion: Support for long running message consumer

@codelipenghui 
But if my application crashes while it is processing the message, it will never be able to ack or negative ack that message. As a result, the message will never be retried. This is exactly why I cannot drop the ackTimeout either.

GitHub link: https://github.com/apache/pulsar/discussions/18456#discussioncomment-4133178



[GitHub] [pulsar] harissecic added a comment to the discussion: Support for long running message consumer

Posted by GitBox <gi...@apache.org>.
GitHub user harissecic added a comment to the discussion: Support for long running message consumer

I stumbled upon this while looking for another answer, but I'll leave a comment in case it helps anyone. For such cases I guess it's possible to combine a DLQ with ackTimeout. The value I use by default is 3 retries. Although I don't use auto-ack, I guess it would still work the same way: if a message times out 3 times (in this case), it automatically goes to the Dead Letter Queue. This prevents it from looping endlessly between services.

My example is a module I'm building for a framework. There I don't set an ackTimeout but let users set it if they want to. However, I do set 3 retries before the DLQ by default. The reason is personal experience: a test message looped endlessly between the shared consumers, I got error logs all the time, and I couldn't figure out why. Then I realised the message was simply getting a negativeAck from each consumer and being redelivered every time. The funny thing is that it was a malformed JSON message, so the consumers were doomed to crash (validation failed for a previously nullable field in Kotlin that I had changed to non-null). After I set the DLQ limit to 3, I had some messages fail and then get re-read due to the ack timeout. By combining a DLQ, ackTimeout, and shared consumers, I think you can set the timeout pretty low if processing takes less time and you ack manually as soon as it's done.
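To make the "3 failures, then DLQ" idea concrete, here is a minimal sketch in plain Java. It is only a toy model of the behaviour described above, not Pulsar's implementation: the class `RedeliveryRouter` and its methods are made up for illustration, and the exact way Pulsar counts redeliveries (including any off-by-one) is not modelled.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of "N failed attempts, then dead letter". Hypothetical, not a Pulsar class. */
class RedeliveryRouter {
    enum Route { REDELIVER, DEAD_LETTER }

    private final int maxFailures;                          // 3 in the comment above
    private final Map<String, Integer> failureCounts = new HashMap<>();

    RedeliveryRouter(int maxFailures) {
        this.maxFailures = maxFailures;
    }

    /** Called whenever a delivery attempt ends in a negative ack or an ack timeout. */
    Route onFailedAttempt(String messageId) {
        int failures = failureCounts.merge(messageId, 1, Integer::sum);
        if (failures >= maxFailures) {
            failureCounts.remove(messageId);                // the message leaves the normal flow
            return Route.DEAD_LETTER;
        }
        return Route.REDELIVER;
    }

    /** Called on a successful ack; the message is forgotten. */
    void onAck(String messageId) {
        failureCounts.remove(messageId);
    }
}
```

With `new RedeliveryRouter(3)`, the first two failed attempts for a message are routed back for redelivery and the third one is routed to the dead letter side, so a malformed message like the one described cannot loop forever.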

GitHub link: https://github.com/apache/pulsar/discussions/18456#discussioncomment-4133182



[GitHub] [pulsar] shubham-Shole4ever added a comment to the discussion: Support for long running message consumer

Posted by GitBox <gi...@apache.org>.
GitHub user shubham-Shole4ever added a comment to the discussion: Support for long running message consumer

@codelipenghui I had a look at negative acknowledgement. It still won't work if my ackTimeout is set to 10 minutes and the message I am consuming takes 30 minutes (for example). The broker will resurface the message after 10 minutes, even though one of the consumers is still working on it.
I want to avoid this scenario. My proposal is to have something like a "working(messageId)" function on the consumer that tells the broker not to time out (and resurface) the message, but rather to extend/refresh the ackTimeout for that messageId.
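A rough sketch of the semantics being proposed, in plain Java. Nothing here is an existing Pulsar API: `AckLeaseTracker`, `delivered`, `working`, `acked` and `expire` are hypothetical names, and the sketch deliberately leaves open which component would own the timer. Time is passed in explicitly so the behaviour is easy to follow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of a per-message ack timeout that the consumer can refresh. Hypothetical. */
class AckLeaseTracker {
    private final long ackTimeoutMillis;
    private final Map<String, Long> deadlines = new ConcurrentHashMap<>();

    AckLeaseTracker(long ackTimeoutMillis) {
        this.ackTimeoutMillis = ackTimeoutMillis;
    }

    /** A message was handed to a consumer: start its timeout window. */
    void delivered(String messageId, long nowMillis) {
        deadlines.put(messageId, nowMillis + ackTimeoutMillis);
    }

    /** The proposed "working(messageId)" signal: refresh the window if the message is still tracked. */
    boolean working(String messageId, long nowMillis) {
        return deadlines.computeIfPresent(messageId, (id, old) -> nowMillis + ackTimeoutMillis) != null;
    }

    /** The message was acked: stop tracking it. */
    void acked(String messageId) {
        deadlines.remove(messageId);
    }

    /** Returns (and stops tracking) the messages whose window has elapsed; these would be resurfaced. */
    List<String> expire(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : deadlines.entrySet()) {
            if (e.getValue() <= nowMillis && deadlines.remove(e.getKey(), e.getValue())) {
                expired.add(e.getKey());
            }
        }
        return expired;
    }
}
```

With a 10 minute timeout, a consumer that calls `working("m1", ...)` at minute 9 keeps `m1` out of the expired set at minute 10, while a message that received no such signal is returned by `expire` and would be resurfaced. A crashed consumer simply stops sending the signal, so its message still times out.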

GitHub link: https://github.com/apache/pulsar/discussions/18456#discussioncomment-4133175



[GitHub] [pulsar] tisonkun added a comment to the discussion: Support for long running message consumer

Posted by GitBox <gi...@apache.org>.
GitHub user tisonkun added a comment to the discussion: Support for long running message consumer

Closed as answered.

GitHub link: https://github.com/apache/pulsar/discussions/18456#discussioncomment-4133183



[GitHub] [pulsar] benbro added a comment to the discussion: Support for long running message consumer

Posted by GitBox <gi...@apache.org>.
GitHub user benbro added a comment to the discussion: Support for long running message consumer

There is still no good solution for retrying long running jobs.

GitHub link: https://github.com/apache/pulsar/discussions/18456#discussioncomment-4133184



[GitHub] [pulsar] codelipenghui added a comment to the discussion: Support for long running message consumer

Posted by GitBox <gi...@apache.org>.
GitHub user codelipenghui added a comment to the discussion: Support for long running message consumer

Please take a look at this document, which may help you: http://pulsar.apache.org/docs/en/concepts-messaging/#negative-acknowledgement

GitHub link: https://github.com/apache/pulsar/discussions/18456#discussioncomment-4133174



[GitHub] [pulsar] harissecic added a comment to the discussion: Support for long running message consumer

Posted by GitBox <gi...@apache.org>.
GitHub user harissecic added a comment to the discussion: Support for long running message consumer

> There is still no good solution for retrying long running jobs.

I think there are plenty of _good enough_ workarounds, but I agree there should be an optimal out-of-the-box one for long running consumers. Just to list a few:
1. Using message properties on consumers with `reconsumeLater`. I'm not sure which version started to support this feature, but adding properties to the message such as `processing=true` and later `isDone=true` would require just a little extra code to check these properties before even trying to consume the message. If `isDone` is set to true, simply ack the message and move on to the next.
2. Using readers with a similar approach, where message metadata/properties are read. In some cases consumers are not needed and using a reader is a bit simpler, but in others we really do want the consumer, so this is not really a workaround in the context of this case.
3. Combining a DLQ with `negAck` and later processing the DLQ with extra custom code to check whether something was already done. Setting max redelivery to 1 would send the message directly to the DLQ on the next retry after a timeout. This would of course require a local concurrent cache where you keep the IDs being processed in runtime memory and check them on message arrival, so you can simply negAck a message if it's still being processed. After processing the actual message, the consumer can then trigger "removing" the message from the DLQ. This would support both ackTimeout and manual handling of timeouts.
4. Caching everything in a DB or similar and tracking messageIds, processing start times, allowed timeouts, and so on. Upon receiving a message, check this list and determine whether the message is still being processed or whether processing failed and this is a consumer restart.

I assume some variant of 3 would be good to have out-of-the-box. Best of course would be something like an LRQ (long running queue, for lack of creativity on my side): when the ackTimeout triggers a retry, the consumer has the option to tell the broker that it is 'still processing', and the broker moves the message to this queue. Pulsar would then track the connection: if the TCP connection dies, it pushes the messages back to the normal queue and retries; if the TCP connection is alive, it lets the consumer say when the message should be removed. Using the DLQ for this is also possible, but it mixes messages that were retried too many times with the ones the consumer knows take too long.
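The "local concurrent cache" from workarounds 3 and 4 could look roughly like the sketch below. It is plain Java with made-up names (`InFlightRegistry`, `Decision`, `onArrival`), it keeps state only in memory, and it leaves the actual ack, negAck or `reconsumeLater` call to the caller, so treat it as an illustration of the decision logic rather than a complete solution.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory record of which message IDs are being processed or are finished. Hypothetical. */
class InFlightRegistry {
    enum State { PROCESSING, DONE }
    enum Decision { START, STILL_PROCESSING, ALREADY_DONE }

    private final Map<String, State> states = new ConcurrentHashMap<>();

    /** Decide what to do with an arriving (possibly redelivered) message. */
    Decision onArrival(String messageId) {
        // putIfAbsent is atomic, so only one thread gets START for a given ID.
        State previous = states.putIfAbsent(messageId, State.PROCESSING);
        if (previous == null) {
            return Decision.START;               // first sighting: process it
        }
        return previous == State.DONE
                ? Decision.ALREADY_DONE          // caller can simply ack
                : Decision.STILL_PROCESSING;     // caller can negAck or reconsume later
    }

    /** Processing finished successfully. */
    void markDone(String messageId) {
        states.put(messageId, State.DONE);
    }

    /** Processing failed: forget the ID so that a redelivery starts over. */
    void markFailed(String messageId) {
        states.remove(messageId);
    }
}
```

Because the map lives in runtime memory, it is lost when the consumer process restarts, which is the gap workaround 4 closes by keeping the same information in a DB. A real version would also need to evict `DONE` entries at some point so the map does not grow without bound.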

GitHub link: https://github.com/apache/pulsar/discussions/18456#discussioncomment-4133185



[GitHub] [pulsar] shubham-Shole4ever added a comment to the discussion: Support for long running message consumer

Posted by GitBox <gi...@apache.org>.
GitHub user shubham-Shole4ever added a comment to the discussion: Support for long running message consumer

@sijie I can make do with the workaround suggested by @codelipenghui and @merlimat for the time being. However, as mentioned, that solution will not work if I also need ackTimeout.
I would request the community to propose a feature that handles such cases in the future.

Thanks @codelipenghui and @merlimat for all the help. :)

GitHub link: https://github.com/apache/pulsar/discussions/18456#discussioncomment-4133181



[GitHub] [pulsar] codelipenghui added a comment to the discussion: Support for long running message consumer

Posted by GitBox <gi...@apache.org>.
GitHub user codelipenghui added a comment to the discussion: Support for long running message consumer

@shubham-Shole4ever 
You can disable the ack timeout and just use ack/negative ack. A negative ack explicitly tells the broker that processing failed, and the broker then redelivers the message. While a message is still in progress, there is no need to ack or negative ack it.
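As a sketch of that pattern: the handler below acks on success, negative acks on failure, and does nothing while the job is still running. To keep it runnable without a broker, `Acker` is a hypothetical stand-in for the consumer (in the Pulsar Java client the corresponding calls are `acknowledge` and `negativeAcknowledge` on `Consumer`), and `Job`, `ExplicitAckHandler` and `RecordingAcker` are made-up names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the two consumer calls this pattern needs. */
interface Acker {
    void acknowledge(String messageId);
    void negativeAcknowledge(String messageId);
}

/** A job that may run for minutes or hours and throws on failure. */
interface Job {
    void run(String payload) throws Exception;
}

class ExplicitAckHandler {
    private final Acker acker;
    private final Job job;

    ExplicitAckHandler(Acker acker, Job job) {
        this.acker = acker;
        this.job = job;
    }

    /** No timer involved: the message simply stays unacked for as long as run() takes. */
    void handle(String messageId, String payload) {
        try {
            job.run(payload);
        } catch (Exception e) {
            acker.negativeAcknowledge(messageId);   // explicit failure: ask for redelivery
            return;
        }
        acker.acknowledge(messageId);               // explicit success
    }
}

/** Records the calls so the behaviour can be inspected without a broker. */
class RecordingAcker implements Acker {
    final List<String> calls = new ArrayList<>();

    @Override
    public void acknowledge(String messageId) {
        calls.add("ack:" + messageId);
    }

    @Override
    public void negativeAcknowledge(String messageId) {
        calls.add("nack:" + messageId);
    }
}
```

If the process dies in the middle of `run()`, neither call is made. That case is the one covered by the earlier comment about unacked messages being replayed once the connection is broken.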

GitHub link: https://github.com/apache/pulsar/discussions/18456#discussioncomment-4133177



[GitHub] [pulsar] tisonkun added a comment to the discussion: Support for long running message consumer

Posted by GitBox <gi...@apache.org>.
GitHub user tisonkun added a comment to the discussion: Support for long running message consumer

I'm moving this discussion to the Discussions forum since it's an open-ended discussion instead of an actionable task :)

GitHub link: https://github.com/apache/pulsar/discussions/18456#discussioncomment-4133186



[GitHub] [pulsar] sijie added a comment to the discussion: Support for long running message consumer

Posted by GitBox <gi...@apache.org>.
GitHub user sijie added a comment to the discussion: Support for long running message consumer

@shubham-Shole4ever does Matteo's comment make sense to you?

GitHub link: https://github.com/apache/pulsar/discussions/18456#discussioncomment-4133180
