Posted to issues@ignite.apache.org by "Ignite TC Bot (JIRA)" <ji...@apache.org> on 2019/03/01 10:23:00 UTC

[jira] [Commented] (IGNITE-10078) Node failure during concurrent partition updates may cause partition desync between primary and backup.

    [ https://issues.apache.org/jira/browse/IGNITE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781545#comment-16781545 ] 

Ignite TC Bot commented on IGNITE-10078:
----------------------------------------

{panel:title=--> Run :: All: Possible Blockers|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Platform .NET{color} [[tests 8|https://ci.ignite.apache.org/viewLog.html?buildId=3204681]]
* exe: CacheLocalTest.TestTransactionScopeMultiCache(True) - 0,0% fails in last 605 master runs.
* exe: CacheLocalTest.TestTransactionScopeMultiCache(False) - 0,0% fails in last 605 master runs.
* exe: CacheLocalTest.TestTxAllModes(True) - 0,0% fails in last 605 master runs.
* exe: CacheLocalTest.TestTxAllModes(False) - 0,0% fails in last 605 master runs.
* exe: CacheLocalTest.TestTxCommit(True) - 0,0% fails in last 605 master runs.
* exe: CacheLocalTest.TestTxCommit(False) - 0,0% fails in last 605 master runs.

{color:#d04437}MVCC PDS 2{color} [[tests 2|https://ci.ignite.apache.org/viewLog.html?buildId=3204707]]
* IgnitePdsMvccTestSuite2: IgnitePdsCorruptedStoreTest.testCheckpointFailure - 0,0% fails in last 440 master runs.

{color:#d04437}PDS (Indexing){color} [[tests 2|https://ci.ignite.apache.org/viewLog.html?buildId=3208823]]
* IgnitePdsWithIndexingCoreTestSuite: IgniteLogicalRecoveryTest.testRecoveryOnJoinToInactiveCluster - 0,0% fails in last 442 master runs.
* IgnitePdsWithIndexingCoreTestSuite: IgniteLogicalRecoveryTest.testRecoveryOnJoinToActiveCluster - 0,0% fails in last 442 master runs.

{color:#d04437}Service Grid{color} [[tests 2|https://ci.ignite.apache.org/viewLog.html?buildId=3204695]]
* IgniteServiceGridTestSuite: ServiceDeploymentOnClientDisconnectTest.testThrowingExceptionOnUndeployUsingInternalApiWhileClientDisconnectedTest - 0,0% fails in last 652 master runs.

{color:#d04437}Continuous Query 2{color} [[tests 10|https://ci.ignite.apache.org/viewLog.html?buildId=3204606]]
* IgniteCacheQuerySelfTestSuite4: CacheContinuousQueryAsyncFailoverMvccTxSelfTest.testStartStopQuery - 0,0% fails in last 432 master runs.
* IgniteCacheQuerySelfTestSuite4: CacheContinuousQueryFailoverMvccTxReplicatedSelfTest.testNoEventLossOnTopologyChange - 0,0% fails in last 432 master runs.
* IgniteCacheQuerySelfTestSuite4: CacheContinuousQueryAsyncFailoverMvccTxSelfTest.testNoEventLossOnTopologyChange - 0,0% fails in last 432 master runs.
* IgniteCacheQuerySelfTestSuite4: CacheContinuousQueryFailoverMvccTxSelfTest.testUpdatePartitionCounter - 0,0% fails in last 432 master runs.
* IgniteCacheQuerySelfTestSuite4: CacheContinuousQueryFailoverMvccTxSelfTest.testStartStopQuery - 0,0% fails in last 432 master runs.
* IgniteCacheQuerySelfTestSuite4: CacheContinuousQueryFailoverMvccTxReplicatedSelfTest.testUpdatePartitionCounter - 0,0% fails in last 432 master runs.
* IgniteCacheQuerySelfTestSuite4: CacheContinuousQueryFailoverMvccTxReplicatedSelfTest.testStartStopQuery - 0,0% fails in last 432 master runs.
* IgniteCacheQuerySelfTestSuite4: CacheContinuousQueryAsyncFailoverMvccTxSelfTest.testUpdatePartitionCounter - 0,0% fails in last 432 master runs.
* IgniteCacheQuerySelfTestSuite4: CacheContinuousQueryFailoverMvccTxSelfTest.testNoEventLossOnTopologyChange - 0,0% fails in last 432 master runs.

{color:#d04437}MVCC Cache 7{color} [[tests 2|https://ci.ignite.apache.org/viewLog.html?buildId=3204703]]
* IgniteCacheMvccTestSuite7: IgniteCacheStartWithLoadTest.testNoRebalanceDuringCacheStart - 0,0% fails in last 441 master runs.

{color:#d04437}Continuous Query 4{color} [[tests 13|https://ci.ignite.apache.org/viewLog.html?buildId=3204670]]
* IgniteCacheQuerySelfTestSuite6: CacheContinuousQueryAsyncFilterListenerTest.testNonDeadLockInFilterMvcc - 0,2% fails in last 436 master runs.
* IgniteCacheQuerySelfTestSuite6: CacheContinuousQueryAsyncFilterListenerTest.testNonDeadLockInFilterMvccTxSyncFilter - 0,0% fails in last 436 master runs.
* IgniteCacheQuerySelfTestSuite6: CacheContinuousQueryAsyncFilterListenerTest.testNonDeadLockInFilterMvccTx - 0,0% fails in last 436 master runs.
* IgniteCacheQuerySelfTestSuite6: CacheContinuousQueryAsyncFilterListenerTest.testNonDeadLockInFilterReplicatedSyncFilterMvcc - 0,0% fails in last 436 master runs.
* IgniteCacheQuerySelfTestSuite6: CacheContinuousQueryAsyncFilterListenerTest.testNonDeadLockInListenerReplicatedJCacheApiMvcc - 0,0% fails in last 436 master runs.
* IgniteCacheQuerySelfTestSuite6: CacheContinuousQueryAsyncFilterListenerTest.testNonDeadLockInFilterReplicatedJCacheApiMvcc - 0,0% fails in last 436 master runs.
* IgniteCacheQuerySelfTestSuite6: CacheContinuousQueryAsyncFilterListenerTest.testNonDeadLockInListenerMvcc - 0,0% fails in last 436 master runs.
* IgniteCacheQuerySelfTestSuite6: CacheContinuousQueryAsyncFilterListenerTest.testNonDeadLockInFilterMvccTxJCacheApi - 0,0% fails in last 436 master runs.
* IgniteCacheQuerySelfTestSuite6: CacheContinuousQueryAsyncFilterListenerTest.testNonDeadLockInListenerMvccTx - 0,0% fails in last 436 master runs.
* IgniteCacheQuerySelfTestSuite6: CacheContinuousQueryAsyncFilterListenerTest.testNonDeadLockInFilterSyncFilterMvcc - 0,0% fails in last 436 master runs.
* IgniteCacheQuerySelfTestSuite6: CacheContinuousQueryAsyncFilterListenerTest.testNonDeadLockInListenerMvccTxJCacheApi - 0,0% fails in last 436 master runs.
* IgniteCacheQuerySelfTestSuite6: CacheContinuousQueryAsyncFilterListenerTest.testNonDeadLockInFilterReplicatedMvcc - 0,0% fails in last 436 master runs.
* IgniteCacheQuerySelfTestSuite6: CacheContinuousQueryAsyncFilterListenerTest.testNonDeadLockInListenerReplicatedMvcc - 0,0% fails in last 436 master runs.

{color:#d04437}MVCC Queries{color} [[tests 0 TIMEOUT , Exit Code |https://ci.ignite.apache.org/viewLog.html?buildId=3204644]]
* CacheMvccPartitionedSqlCoordinatorFailoverTest.testUpdate_N_Objects_ClientServer_Backups1_Sql_Persistence (last started)

{color:#d04437}MVCC Cache 5{color} [[tests 0 TIMEOUT , Exit Code |https://ci.ignite.apache.org/viewLog.html?buildId=3204701]]

{color:#d04437}_Javadoc_{color} [[tests 0 BuildFailureOnMessage |https://ci.ignite.apache.org/viewLog.html?buildId=3204646]]

{panel}
[TeamCity *--> Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=3204711&buildTypeId=IgniteTests24Java8_RunAll]

> Node failure during concurrent partition updates may cause partition desync between primary and backup.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-10078
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10078
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Alexei Scherbakov
>            Assignee: Alexei Scherbakov
>            Priority: Major
>             Fix For: 2.8
>
>
> This can happen if some updates are not written to the WAL before a node fails. In certain scenarios rebalancing will not apply them, because the partition counters are the same:
> 1. Start a grid with 3 nodes and 2 backups.
> 2. Preload some data into partition P.
> 3. Start two concurrent transactions, each writing a single key to the same partition P; the keys are different:
> {noformat}
> try (Transaction tx = client.transactions().txStart(PESSIMISTIC, REPEATABLE_READ, 0, 1)) {
>     // Single-key write; k maps to partition P.
>     client.cache(DEFAULT_CACHE_NAME).put(k, v);
>     tx.commit();
> }
> {noformat}
> 4. Order the updates on a backup so that the update with the higher partition counter is written to the WAL, while the update with the lower partition counter fails because the failure handler (FH) is triggered before the update is added to the WAL.
> 5. Return the failed node to the grid and observe that no rebalancing happens, because the partition counters are the same.
> Possible solution: detect gaps in update counters on recovery and, if any are found, force a rebalance from a node without gaps.
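
The scenario and the proposed gap detection can be illustrated with a minimal sketch. This is not Ignite code: the class UpdateCounterGaps and the methods findGaps and highWaterMark are hypothetical names introduced only for illustration, and the counter values are made up. It assumes each update to a partition gets the next consecutive counter value, so a node that applied counter 12 but never applied counter 11 reports the same highest counter as the primary while still missing an update.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration only; not part of the Ignite API. */
class UpdateCounterGaps {
    /**
     * Returns every counter in (base, max(applied)] that is absent from applied.
     * base is the counter value known to be fully applied (e.g. after preload).
     */
    public static List<Long> findGaps(long base, long[] applied) {
        long[] sorted = applied.clone();
        Arrays.sort(sorted);

        List<Long> gaps = new ArrayList<>();
        long expected = base + 1;

        for (long c : sorted) {
            if (c < expected)
                continue; // duplicate or already covered counter

            for (long missing = expected; missing < c; missing++)
                gaps.add(missing);

            expected = c + 1;
        }

        return gaps;
    }

    /** Highest counter seen; this is all a max-only comparison looks at. */
    public static long highWaterMark(long base, long[] applied) {
        long max = base;

        for (long c : applied)
            max = Math.max(max, c);

        return max;
    }

    public static void main(String[] args) {
        long base = 10;            // assumed counter after preload
        long[] primary = {11, 12}; // both transactions applied
        long[] backup = {12};      // update 11 was lost before reaching the WAL

        // Both nodes report the same highest counter, so comparing only
        // that value gives no reason to rebalance.
        System.out.println(highWaterMark(base, primary) == highWaterMark(base, backup));

        // Scanning for gaps reveals the lost update on the backup.
        System.out.println(findGaps(base, backup));
    }
}
```

In this sketch a non-empty result from findGaps on a recovering node would be the signal to rebalance the partition from a node whose result is empty.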



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)