Posted to dev@nifi.apache.org by Dominik Benz <do...@gmail.com> on 2017/02/24 14:08:14 UTC

session recover behaviour in nifi-jms-processor

Hi,

we're currently using NiFi to consume a relatively high-traffic JMS topic
(40-60 messages per second).

This worked well in principle. However, we then noticed that the outbound rate
of the topic (i.e. the number of messages we fetched) was consistently slightly
higher than its inbound rate (i.e. the actual number of messages sent to the
topic). This puzzled me because, being the only subscriber to the topic, I
would expect inbound and outbound traffic to be identical (given that we can
consume fast enough, which we can).

Digging deeper, I found that in 

  org.apache.nifi.jms.processors.JMSConsumer

the method "consume" performs a session.recover:



session.recover() (as explained in the inline comment) basically stops message
delivery and restarts it from the oldest non-acknowledged message. However, I
think this leads to the following issue in high-traffic contexts:

1) several threads perform the JMS session callback in parallel
2) each callback performs a session.recover()
3) during high traffic, the ACKs from another thread may not (yet) have
arrived at the JMS server
4) this implies that session.recover() resets the delivery pointer and
re-consumes messages that were delivered to, but not yet acknowledged by,
another thread (see the sketch after this list)
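
For reference, the JMS semantics in play can be shown with a small standalone
snippet (plain javax.jms, my own illustration rather than NiFi code; the
connection factory, topic name and timeouts are placeholders):

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;

// Standalone illustration of CLIENT_ACKNOWLEDGE + recover() semantics (not NiFi code).
public class RecoverSemantics {

    static void illustrate(ConnectionFactory factory, String topicName) throws Exception {
        Connection connection = factory.createConnection();
        try {
            connection.start();
            Session session = connection.createSession(false, Session.CLIENT_ACKNOWLEDGE);
            MessageConsumer consumer = session.createConsumer(session.createTopic(topicName));

            Message first = consumer.receive(1000L);
            Message second = consumer.receive(1000L);

            // Acknowledging any message acknowledges every message consumed so far on
            // this session; until that happens, all of them count as unacknowledged.
            // (Here neither 'first' nor 'second' has been acknowledged yet.)

            // Per the JMS spec, recover() stops delivery and restarts it with the oldest
            // unacknowledged message, so 'first' and 'second' (if received) will be
            // redelivered with the JMSRedelivered flag set.
            session.recover();

            Message redelivered = consumer.receive(1000L);
            if (redelivered != null) {
                redelivered.acknowledge(); // acknowledges everything consumed so far
            }
        } finally {
            connection.close();
        }
    }
}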

For verification, I have so far performed the following steps:

(a) manual implementation of a simplistic synchronous JMS topic consumer (see
the sketch below) -> inbound/outbound identical, as expected
(b) patched nifi-jms-processors with session.recover() commented out ->
inbound/outbound identical, as expected
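
The simplistic synchronous consumer from (a) is essentially just the following
(a sketch with placeholder connection details, not the exact verification code):

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;

// Minimal synchronous topic subscriber that only counts consumed messages,
// so the count can be compared against the broker's inbound rate.
public class SimpleTopicCounter {

    static long countFor(ConnectionFactory factory, String topicName, long receiveTimeoutMs) throws Exception {
        long received = 0;
        Connection connection = factory.createConnection();
        try {
            connection.start();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageConsumer consumer = session.createConsumer(session.createTopic(topicName));
            Message message;
            while ((message = consumer.receive(receiveTimeoutMs)) != null) {
                received++; // no recover(), no manual ack: AUTO_ACKNOWLEDGE on a single thread
            }
        } finally {
            connection.close();
        }
        return received;
    }
}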

Any thoughts on this? My current impression is that session.recover() in its
current usage doesn't play well with multi-threaded JMS consumers. Or do I
have a misconception?

Thanks & best regards,
  Dominik




Re: session recover behaviour in nifi-jms-processor

Posted by Dominik Benz <do...@gmail.com>.
Just created this ticket:

https://issues.apache.org/jira/browse/NIFI-3531

I'm happy to help in any way!




Re: session recover behaviour in nifi-jms-processor

Posted by Dominik Benz <do...@gmail.com>.
Just one quick addition: I understand that NiFi prefers message duplication
over message loss (which we are fine with in principle, because we can
de-duplicate later). However, in our current situation we have to stop the
connection after a certain time interval (typically after roughly 90 minutes)
due to the following behaviour:

1) session.recover within nifi-jms-processors leads to re-delivery of
not-yet-acknowledged messages (as described above)
2) over time, the number of re-delivered messages grows and keeps piling up
3) at some point, the "pile" of re-delivered messages has grown from 10 or 20
to hundreds
4) in this situation, the consumers fall behind too much and cannot cope with
the newly arriving messages (in addition to the many not-yet-acknowledged old
ones)
5) for the aforementioned reason, the JMS topic accumulates a growing number
of pending messages
6) the JMS server side runs into storage problems due to buffering the
pending messages
7) we have to disconnect to avoid JMS server crashes

This is just for clarification that the current implementation doesn't only
lead to "some" duplicates (which would be fine), but rather to more severe
problems on our side.




Re: session recover behaviour in nifi-jms-processor

Posted by Dominik Benz <do...@gmail.com>.
Hi Oleg,

thanks for the very fast & very helpful reply! I'll write a JIRA, perfect!

Best,
  Dominik




Re: session recover behaviour in nifi-jms-processor

Posted by Oleg Zhurakousky <oz...@hortonworks.com>.
Sorry, just noticed the typo. Instead of “limit the possibility of a message loads. . .” should be "limit the possibility of a message loss…"

Cheers
Oleg


Re: session recover behaviour in nifi-jms-processor

Posted by Oleg Zhurakousky <oz...@hortonworks.com>.
Dominik

Great to hear from you!
As you can see from the inline comment in the code, the recover is there for a reason primarily to ensure or should I say limit the possibility of a message loads in the event of a processor and/or NiFi crash. 
As you may be aware, in NiFi we do prefer message duplication over message loss. That said, I do see several possibilities for improvement, especially for the high-traffic scenarios you are describing. One such improvement would be to create a listening container version of ConsumeJMS, which has far more control over threading and session caching/sharing.
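
For illustration, a listener-container-based consumer could look roughly like
this (a sketch using Spring's DefaultMessageListenerContainer; the
handOffToFlowFile() hook and the configuration values are placeholders, not
actual NiFi code):

import javax.jms.ConnectionFactory;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.Session;

import org.springframework.jms.connection.CachingConnectionFactory;
import org.springframework.jms.listener.DefaultMessageListenerContainer;

// Sketch of the "listening container" direction. With CLIENT_ACKNOWLEDGE the
// container acknowledges each message only after the listener returns
// successfully, so no explicit session.recover() is needed; an exception or a
// crash leaves the message unacknowledged and it gets redelivered.
public class ListeningContainerSketch {

    public static DefaultMessageListenerContainer start(ConnectionFactory cf, String topicName) {
        DefaultMessageListenerContainer container = new DefaultMessageListenerContainer();
        container.setConnectionFactory(new CachingConnectionFactory(cf));
        container.setDestinationName(topicName);
        container.setPubSubDomain(true);                           // topic, not queue
        container.setSessionAcknowledgeMode(Session.CLIENT_ACKNOWLEDGE);
        container.setConcurrentConsumers(4);                       // threading controlled here
        container.setMessageListener((MessageListener) message -> handOffToFlowFile(message));
        container.afterPropertiesSet();
        container.start();
        return container;
    }

    private static void handOffToFlowFile(Message message) {
        // placeholder: a ConsumeJMS-style processor would create the FlowFile here
    }
}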

Would you mind raising a JIRA issue - https://issues.apache.org/jira/browse/NIFI/ - describing everything you just did, and we’ll handle it.

Cheers
Oleg
