Posted to user@ignite.apache.org by begineer <re...@gmail.com> on 2017/05/02 08:48:25 UTC

Continuous Query remote listener misses some events or respond really late

Hi,
I am currently facing an intermittent issue with continuous queries. I can't
really reproduce it, but if anyone has faced this issue, please let me know.
My application is deployed on 12 nodes, with 5-6 services that each use a
continuous query (CQ) to detect their respective events.
Let's say I have a cache of type
Cache<Long, Trade>, where Trade looks like this:
class Trade {
    int pkey;
    String type;
    ....
    TradeState state; // enum
}
Each CQ detects a new entry in the cache (with an updated state) and checks
whether the trade's state matches its remote filter criteria.
A Trade moves from state1 to state5. Each CQ listens for one state, does some
processing, and moves the trade to the next state, where the next CQ detects
it and acts accordingly.
The problem is that a trade sometimes gets stuck in some state and does not
move. I have put logging in the remote listener predicate method (which checks
the filter criteria), but these logs don't get printed to the console.
Sometimes the CQ detects events only after 4-5 hours.
I am using Ignite 1.8.2.
Has anyone seen this behavior? I would be grateful for any help.
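
To make the pipeline described above concrete, here is a minimal plain-Java sketch with no Ignite dependency. The state names STATE1 to STATE5, the Stage class, and the in-memory map standing in for the cache are all hypothetical; the real services react to continuous query events rather than being called in a loop.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java model of the state pipeline; all names here are hypothetical. */
final class TradePipelineSketch {

    /** Stand-in for the TradeState enum mentioned in the post. */
    enum TradeState { STATE1, STATE2, STATE3, STATE4, STATE5 }

    /** One stage: reacts only to trades in 'listensFor' and moves them one state ahead. */
    static final class Stage {
        private final TradeState listensFor;

        Stage(TradeState listensFor) { this.listensFor = listensFor; }

        /** Plays the role of the remote filter: does this update concern this stage? */
        boolean accepts(TradeState state) { return state == listensFor; }

        /** Plays the role of the listener: process, then hand over to the next stage. */
        TradeState process(TradeState state) {
            return TradeState.values()[state.ordinal() + 1];
        }
    }

    /** Pushes every trade through the stages until no stage accepts its state. */
    static void run(Map<Long, TradeState> cache) {
        TradeState[] states = TradeState.values();
        for (Map.Entry<Long, TradeState> e : cache.entrySet()) {
            // One stage per non-final state: STATE1 .. STATE4.
            for (int i = 0; i < states.length - 1; i++) {
                Stage stage = new Stage(states[i]);
                if (stage.accepts(e.getValue()))
                    e.setValue(stage.process(e.getValue()));
            }
        }
    }

    public static void main(String[] args) {
        Map<Long, TradeState> cache = new LinkedHashMap<>();
        cache.put(1L, TradeState.STATE1);
        cache.put(2L, TradeState.STATE3);
        run(cache);
        System.out.println(cache);
    }
}
```

In this model a trade can only advance if the stage for its current state actually sees it, which is why a single missed notification leaves the trade stuck.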



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Continuous-Query-remote-listener-misses-some-events-or-respond-really-late-tp12338.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Continuous Query remote listener misses some events or respond really late

Posted by begineer <re...@gmail.com>.
Hi,
I know this reply is quite late, but I am seeing this issue intermittently,
almost every day, and I can't reproduce it locally on my dev machine. As
suggested, I moved the logging before the null check to see whether a null
event is logged. However, I didn't see it printed in the logs. It was also
suggested that I check whether the events in question reach the remote
listener (a log line should be printed). No log is printed in that scenario,
so I assume the event does not reach the remote listener immediately.

The same event is processed several hours later, sometimes after 4 hours and
sometimes even after a day.

I tried adding the same event to the cache manually, and it was processed
immediately (but only if the original event was stuck).

Also, the host logs are clean; I couldn't find anything suspicious.
Please let me know if you need any more information and I will try to fetch it.




Re: Continuous Query remote listener misses some events or respond really late

Posted by begineer <re...@gmail.com>.
Hi,
Thanks, I will move the logging as suggested. And that is correct: we don't
store null values in caches.




Re: Continuous Query remote listener misses some events or respond really late

Posted by Sasha Belyak <rt...@gmail.com>.
Thanks for your reply. From the code I see that you log only entries with
non-null values. If you are absolutely sure that you never put null in the
cache, I will create a load test to reproduce it and file an issue for you.
But it would be great if you moved the logging before the
event.getValue() != null check.
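
A minimal sketch of that suggestion, written in plain Java so it runs without Ignite. The Event class, the LOG_LINES list, and the shortened TradeStatus enum are hypothetical stand-ins; in the real service the same ordering would apply inside the remote filter, with the application's logger in place of the list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Plain-Java model of a filter that logs before its null check; names are hypothetical. */
final class FilterLoggingSketch {

    enum TradeStatus { NEW, CHANGED, SUCCESS }

    /** Stand-in for a cache entry event: a key plus a possibly-null status value. */
    static final class Event {
        final long key;
        final TradeStatus value; // null models an event that carries no value

        Event(long key, TradeStatus value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Collected log lines; a real filter would write to its logger instead. */
    static final List<String> LOG_LINES = new ArrayList<>();

    /** Builds a filter that records every event it sees, then applies the checks. */
    static Predicate<Event> checkStatus(TradeStatus wanted) {
        return event -> {
            // Log first, so events with a null value still leave a trace.
            LOG_LINES.add("filter saw key=" + event.key + " value=" + event.value);
            return event.value != null && event.value == wanted;
        };
    }

    public static void main(String[] args) {
        Predicate<Event> filter = checkStatus(TradeStatus.CHANGED);
        System.out.println(filter.test(new Event(1L, TradeStatus.CHANGED)));
        System.out.println(filter.test(new Event(2L, null)));
        LOG_LINES.forEach(System.out::println);
    }
}
```

With the log statement ahead of the null check, a missing log line means the filter was never invoked for that update, rather than that the update was silently rejected.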


Re: Continuous Query remote listener misses some events or respond really late

Posted by begineer <re...@gmail.com>.
Hi, sorry for the late reply. The CQ is set up in the execute() method of the
service, not in init(), but we do have an initialQuery in the CQ to scan for
existing entries matching the filter. Below is a snapshot of one of the many
Ignite services, each set up to process a trade when it moves to a particular
status.

As you can see, I have added logging to the remote filter predicate. But these
logs don't get printed when a trade gets stuck at a particular status. So I
assume the remote filter does not pick up the events it is supposed to track.

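import static javax.cache.configuration.FactoryBuilder.factoryOf;

import javax.cache.Cache;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.query.ContinuousQuery;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.cache.query.ScanQuery;
import org.apache.ignite.lang.IgniteBiPredicate;
import org.apache.ignite.resources.IgniteInstanceResource;
import org.apache.ignite.services.Service;
import org.apache.ignite.services.ServiceContext;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public enum TradeStatus {
    NEW, CHANGED, EXPIRED, FAILED, UNCHANGED, SUCCESS
}


/**
 * Ignite Service which picks up CHANGED trade delivery items.
 * Trade, CacheEntryUpdatedListenerAsync and CacheEntryEventFilterAsync are
 * application types that are not shown in this snippet.
 */
public class ChangedTradeService implements Service {

    // Assumption: an SLF4J logger, since the log calls use {} placeholders.
    private static final Logger LOG = LoggerFactory.getLogger(ChangedTradeService.class);

    // Assumption: the status this service listens for, taken from the class comment.
    private static final TradeStatus STATUS = TradeStatus.CHANGED;

    @IgniteInstanceResource
    private transient Ignite ignite;
    private transient IgniteCache<Long, Trade> tradeCache;
    private transient QueryCursor<Cache.Entry<Long, Trade>> cursor;

    @Override
    public void init(ServiceContext serviceContext) throws Exception {
        tradeCache = ignite.cache("tradeCache");
    }

    @Override
    public void execute(ServiceContext serviceContext) throws Exception {
        ContinuousQuery<Long, Trade> query = new ContinuousQuery<>();
        query.setLocalListener((CacheEntryUpdatedListenerAsync<Long, Trade>) events ->
                events.forEach(event -> process(event.getValue())));
        query.setRemoteFilterFactory(factoryOf(checkStatus(STATUS)));
        query.setInitialQuery(new ScanQuery<>(checkStatusPredicate(STATUS)));
        // Keep the cursor in the field so that cancel() can close it.
        cursor = tradeCache.query(query);
        // Process the entries returned by the initial query.
        cursor.forEach(entry -> process(entry.getValue()));
    }

    private void process(Trade item) {
        LOG.info("transition started for trade id: {}", item.getPkey());
        // Move the trade to the next state (e.g. SUCCESS). The next service, whose
        // CQ is looking for SUCCESS status, will pick it up for further processing,
        // and so on.
        LOG.info("transition finished for trade id: {}", item.getPkey());
    }

    @Override
    public void cancel(ServiceContext serviceContext) {
        if (cursor != null)
            cursor.close();
    }

    // Remote filter: accepts only non-null values whose status matches.
    static CacheEntryEventFilterAsync<Long, Trade> checkStatus(TradeStatus status) {
        return event -> event.getValue() != null &&
                checkStatusPredicate(status).apply(event.getKey(), event.getValue());
    }

    // Predicate shared by the remote filter and the initial scan query.
    static IgniteBiPredicate<Long, Trade> checkStatusPredicate(TradeStatus status) {
        return (k, v) -> {
            LOG.debug("Status checking for: {} Event value: {} isStatus: {}",
                    status, v, v.getStatus() == status);
            return v.getStatus() == status;
        };
    }
}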





Re: Continuous Query remote listener misses some events or respond really late

Posted by Sasha Belyak <rt...@gmail.com>.
As far as I understand, you create the CQ in Service.init(), so the node
running the service is the CQ node. All other nodes in the grid will send CQ
events to this node to be processed in your service, and if you don't
configure a nodeFilter for the service, any node can run it, so any node can
be the CQ node.
But that shouldn't be a problem if you create the CQ in Service.init() and
don't have too heavy a load on your cluster (anyway, if a data owner node
fails to deliver messages to the node running the service (the CQ node), you
should see it in the logs). If you share some code examples showing how you
use the CQ, I can say more.


Re: Continuous Query remote listener misses some events or respond really late

Posted by begineer <re...@gmail.com>.
Thanks. In my application, all nodes are server nodes.
And how can we be sure that the node that was removed from or reconnected to
the grid is the CQ node? It can be any node.
Also, is this issue possible in all of the scenarios below?
1. The node happens to be the CQ node, or any node?
2. The node is removed from the grid forcefully (manual shutdown)
3. The node went down for some reason and the grid dropped it

The 3rd one looks like the safe option: since the node is dropped by the grid,
the grid should be aware of where to shift the CQ? Please correct me if I am
wrong.




Re: Continuous Query remote listener misses some events or respond really late

Posted by Sasha Belyak <rt...@gmail.com>.
If the node with the CQ leaves the grid (or just reconnects to the grid, if it
is a client node), you should recreate the CQ, because some cache updates can
happen while the node with the CQ listener can't receive them. What happens in
this case:
1) The node with the changed cache entry processes the CQ, the entry passes
the remote filter, and the node tries to send a continuous query event message
to the CQ node.
2) If the sender node can't push the message for any reason (the sender will
retry a few times), it can't wait for the receiver too long and drops it.
3) After the CQ node returns to the cluster, it must recreate the CQ so that
the initialQuery runs and picks up such events.
If you are sure that no CQ owner node left the grid, we need to keep
investigating, because it may be a bug.
And yes, I think it is not obvious that you must recreate the CQ after a
client reconnect, but that is how Ignite works now.
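
The three steps above can be modeled in plain Java, with no Ignite dependency. Everything in this sketch is hypothetical: the connected flag stands for the CQ node being reachable, put() stands for a cache update on a data node, and recreateQuery() stands for closing the old cursor and querying again with the initial query set. It is an illustration of the reasoning, not of Ignite's internal implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Plain-Java model of recreating a query after a disconnect; names are hypothetical. */
final class RecreateQuerySketch {

    enum Status { CHANGED, SUCCESS }

    final Map<Long, Status> cache = new ConcurrentHashMap<>();
    final List<Long> processed = new ArrayList<>();
    boolean connected = true;

    /** A cache update; the notification is lost if the listener node is unreachable. */
    void put(long key, Status status) {
        cache.put(key, status);
        if (connected && status == Status.CHANGED)
            process(key);
        // Otherwise the event is dropped, as in step 2.
    }

    /** Listener logic: handle the trade and move it to the next status. */
    void process(long key) {
        processed.add(key);
        cache.put(key, Status.SUCCESS);
    }

    /** Models recreating the query: the initial scan picks up entries missed while away. */
    void recreateQuery() {
        connected = true;
        for (Map.Entry<Long, Status> e : cache.entrySet())
            if (e.getValue() == Status.CHANGED)
                process(e.getKey());
    }

    public static void main(String[] args) {
        RecreateQuerySketch grid = new RecreateQuerySketch();
        grid.put(1L, Status.CHANGED);   // delivered and processed
        grid.connected = false;         // listener node leaves the grid
        grid.put(2L, Status.CHANGED);   // notification lost, trade is stuck
        grid.recreateQuery();           // initial scan finds trade 2
        System.out.println(grid.processed + " " + grid.cache);
    }
}
```

The point of the model is that a stuck trade is still sitting in the cache with the status the listener cares about, so a fresh initial scan finds it even though its update event was never delivered.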


Re: Continuous Query remote listener misses some events or respond really late

Posted by begineer <re...@gmail.com>.
Umm, actually nothing gets logged in that scenario. However, as you indicated
earlier, I could see trades getting stuck when a node leaves the grid (not
always). Do you know why that happens? Is that a bug?




Re: Continuous Query remote listener misses some events or respond really late

Posted by Sasha Belyak <rt...@gmail.com>.
Can you share your log files?


Re: Continuous Query remote listener misses some events or respond really late

Posted by begineer <re...@gmail.com>.
1) How do you use ContinuousQuery: with an initialQuery or without?
---- With an initial query that has the same predicate.
2) Did any nodes disconnect when you lost updates?
---- No.
3) Did you log entries in CQ.localListener? Just to be sure that the error is
in the CQ logic, not in your service logic.
---- No log entries in the remote filter, nor in the local listener.
4) Can someone update old entries? Maybe they just get into the CQ again after
4-5 hours through an external update?
---- I tried adding the same events again just to trigger them. Sometimes the
trade moves ahead (event discovered), and sometimes it stays stuck in the same
state. Also, the CQ detects them on its own after the long delay mentioned; we
don't add any event in that case.
Regards,
Surinder




Re: Continuous Query remote listener misses some events or respond really late

Posted by Sasha Belyak <rt...@gmail.com>.
1) How do you use ContinuousQuery: with an initialQuery or without?
2) Did any nodes disconnect when you lost updates?
3) Did you log entries in CQ.localListener? Just to be sure that the error is
in the CQ logic, not in your service logic.
4) Can someone update old entries? Maybe they just get into the CQ again after
4-5 hours through an external update?


Re: Continuous Query remote listener misses some events or respond really late

Posted by begineer <re...@gmail.com>.
Hi, thanks for looking into this. It's not easily reproducible; I only see it
sometimes. Here are my cache and service configurations.

Cache configuration:

readThrough="true"
writeThrough="true"
writeBehindEnabled="true"
writeBehindFlushThreadCount="5"
backups="1"
readFromBackup="true"

Service configuration:

maxPerNodeCount="1" 
totalCount="1"

Cache is distributed over 12 nodes.






Re: Continuous Query remote listener misses some events or respond really late

Posted by Sasha Belyak <rt...@gmail.com>.
Hi,
I'm trying to reproduce it on one host (with 6 Ignite server nodes), but
everything works fine for me. Can you share your Ignite configuration, cache
configuration, logs, or a reproducer?
