You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ignite.apache.org by "Alexey Kuznetsov (JIRA)" <ji...@apache.org> on 2018/05/12 13:09:00 UTC
[jira] [Comment Edited] (IGNITE-5960) Ignite Continuous Query
(Queries 3):
CacheContinuousQueryConcurrentPartitionUpdateTest::testConcurrentUpdatesAndQueryStartAtomic
is flaky
[ https://issues.apache.org/jira/browse/IGNITE-5960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472110#comment-16472110 ]
Alexey Kuznetsov edited comment on IGNITE-5960 at 5/12/18 1:08 PM:
-------------------------------------------------------------------
[~agoncharuk]
> Note that it would be cleaner to acquire the listeners map after the partition update counter is incremented, however, the listeners map is used in the needVal flag evaluation.
Yeah, I tried to acquire the listeners map *after* the partition update counter is incremented:
To do so, we firstly need to move _needVal_ and _readFromStore_ flags evaluation after update counter is incremented.
But _readFromStore_ flag must be evaluated strongly *before* partition counter is incremented, see _GridCacheMapEntry.AtomicCacheUpdateClosure#call_ where we load data from store.
So, we cannot acquire the listeners map *after* the partition update counter is incremented.
> Currently it is possible that {{evtOldVal}} will be {{null}} if we read {{null}} from {{updateListeners}} in the first place.
In my fix, query entry must be filtered out if we read {{null}} from {{updateListeners}} in the first place(this fixes the essential bug).
To filter out entry, _newVal_ and _oldVal_ must be passed as nulls to _cctx.continuousQueries().onEntryUpdated_ ,
see _CacheContinuousQueryManager#onEntryUpdated_ , _CacheContinuousQueryManager#skipUpdateEvent_ .
Let's consider again the scenario:
1) T1 updates E1 (updateCounter gets incremented to 1);
2) T2 updates E2 (updateCounter gets incremented to 2);
3) T2 finishes update and exits GridCacheMapEntry::innerUpdate method; In this point we have CacheContinuousQueryEventBuffer#currentPartitionCounter equals 2
4) user adds continuous query listener;
5) T1 proceeds, picks up listener (not null thanks to the fix) and notifies listener; Batch#startCntr equals 2 (currentPartitionCounter() returns 2) so
entry E1 would be filtered out of Batch (E1 update counter would be smaller than Batch#startCntr)
PS E1 also can be sent to remote node(if we have remote listener installed) and processed in CacheContinuousQueryPartitionRecovery#collectEntries,
but it would be filtered out there.
6) T3 updates E3 (updateCounter gets incremented to 3) and notifies listener;
After 6) user's listener would be notified only once by key 3.
After listener is set by user, entry E1 must be filtered out.
Are you agree with such changes ?
was (Author: alexey kuznetsov):
[~agoncharuk]
> Note that it would be cleaner to acquire the listeners map after the partition update counter is incremented, however, the listeners map is used in the needVal flag evaluation.
Yeah, I tried to acquire the listeners map *after* the partition update counter is incremented:
To do so, we firstly need to move _needVal_ and _readFromStore_ flags evaluation after update counter is incremented.
But _readFromStore_ flag must be evaluated strongly *before* partition counter is incremented, see _GridCacheMapEntry.AtomicCacheUpdateClosure#call_ where we load data from store.
So, we cannot acquire the listeners map *after* the partition update counter is incremented.
> Currently it is possible that {{evtOldVal}} will be {{null}} if we read {{null}} from {{updateListeners}} in the first place.
In my fix, query entry must be filtered out if we read {{null}} from {{updateListeners}} in the first place(this fixes the essential bug).
To filter out entry, _newVal_ and _oldVal_ must be passed as nulls to _cctx.continuousQueries().onEntryUpdated_ ,
see _CacheContinuousQueryManager#onEntryUpdated_ , _CacheContinuousQueryManager#skipUpdateEvent_ .
Let's consider again the scenario:
1) T1 updates E1 (updateCounter gets incremented to 1);
2) T2 updates E2 (updateCounter gets incremented to 2);
3) T2 finishes update and exits GridCacheMapEntry::innerUpdate method; In this point we have CacheContinuousQueryEventBuffer#currentPartitionCounter equals 2
4) user adds continuous query listener;
5) T1 proceeds, picks up listener (not null thanks to the fix) and notifies listener; Batch#startCntr equals 2 (currentPartitionCounter() returns 2) so
entry E1 would be filtered out of Batch (E1 update counter would be smaller than Batch#startCntr)
PS E1 also can be sent to remote node(if we have remote listener installed) and processed in CacheContinuousQueryPartitionRecovery#collectEntries,
but it would be filtered out there.
6) T3 updates E3 (updateCounter gets incremented to 3) and notifies listener;
After 6) user's listener would be notified only once by key 3.
After listener is set by user, entry E1 must be filtered out.
Are you agree with such changes ?
> Ignite Continuous Query (Queries 3): CacheContinuousQueryConcurrentPartitionUpdateTest::testConcurrentUpdatesAndQueryStartAtomic is flaky
> -----------------------------------------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-5960
> URL: https://issues.apache.org/jira/browse/IGNITE-5960
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.1
> Reporter: Sergey Chugunov
> Assignee: Alexey Kuznetsov
> Priority: Major
> Labels: MakeTeamcityGreenAgain, test-failure
> Fix For: 2.6
>
>
> According to [TC history|http://ci.ignite.apache.org/project.html?projectId=Ignite20Tests&testNameId=6546112007182082024&tab=testDetails&branch_Ignite20Tests=%3Cdefault%3E] test is flaky.
> It is possible to reproduce it locally, sample run shows 9 failed tests out of 30 overall executed.
> Test fails with jUnit assertion check:
> {noformat}
> junit.framework.AssertionFailedError:
> Expected :10000
> Actual :0
> <Click to see difference>
> at junit.framework.Assert.fail(Assert.java:57)
> at junit.framework.Assert.failNotEquals(Assert.java:329)
> at junit.framework.Assert.assertEquals(Assert.java:78)
> at junit.framework.Assert.assertEquals(Assert.java:234)
> at junit.framework.Assert.assertEquals(Assert.java:241)
> at junit.framework.TestCase.assertEquals(TestCase.java:409)
> at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryConcurrentPartitionUpdateTest.concurrentUpdatesAndQueryStart(CacheContinuousQueryConcurrentPartitionUpdateTest.java:385)
> at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryConcurrentPartitionUpdateTest.testConcurrentUpdatesAndQueryStartTx(CacheContinuousQueryConcurrentPartitionUpdateTest.java:245)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at junit.framework.TestCase.runTest(TestCase.java:176)
> at org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:2000)
> at org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:132)
> at org.apache.ignite.testframework.junits.GridAbstractTest$5.run(GridAbstractTest.java:1915)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)