Posted to issues@ignite.apache.org by "Anton Vinogradov (Jira)" <ji...@apache.org> on 2022/10/07 08:14:00 UTC
[jira] [Updated] (IGNITE-17496) LWM may be after HWM (reserved) on primary after the node restart
[ https://issues.apache.org/jira/browse/IGNITE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anton Vinogradov updated IGNITE-17496:
--------------------------------------
Parent Issue: IGNITE-17845 (was: IGNITE-15167)
> LWM may be after HWM (reserved) on primary after the node restart
> -----------------------------------------------------------------
>
> Key: IGNITE-17496
> URL: https://issues.apache.org/jira/browse/IGNITE-17496
> Project: Ignite
> Issue Type: Sub-task
> Reporter: Anton Vinogradov
> Assignee: Anton Vinogradov
> Priority: Major
> Labels: iep-31
> Fix For: 2.14
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> {code:java}
> java.lang.AssertionError: LWM after HWM: lwm=10010, hwm=10003, cntr=Counter [lwm=10010, missed=[10011 - 10012, 10021, 10031 - 10032, 10043 - 10044], maxApplied=10047, hwm=10004]
> at org.apache.ignite.internal.processors.cache.PartitionUpdateCounterTrackingImpl.reserve(PartitionUpdateCounterTrackingImpl.java:265)
> at org.apache.ignite.internal.processors.cache.PartitionUpdateCounterErrorWrapper.reserve(PartitionUpdateCounterErrorWrapper.java:58)
> at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.getAndIncrementUpdateCounter(IgniteCacheOffheapManagerImpl.java:1620)
> at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.getAndIncrementUpdateCounter(GridCacheOffheapManager.java:2538)
> at org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtLocalPartition.getAndIncrementUpdateCounter(GridDhtLocalPartition.java:942)
> at org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.calculatePartitionUpdateCounters(IgniteTxLocalAdapter.java:510)
> at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.prepare0(GridDhtTxPrepareFuture.java:1360)
> at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.mapIfLocked(GridDhtTxPrepareFuture.java:730)
> at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.prepare(GridDhtTxPrepareFuture.java:1136)
> at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.prepareAsync(GridDhtTxLocal.java:400)
> at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.prepareNearTx(IgniteTxHandler.java:581)
> at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.prepareNearTx(IgniteTxHandler.java:378)
> at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxPrepareRequest0(IgniteTxHandler.java:201)
> at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxPrepareRequest(IgniteTxHandler.java:175)
> at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$000(IgniteTxHandler.java:135)
> at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$1.apply(IgniteTxHandler.java:223)
> at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$1.apply(IgniteTxHandler.java:221)
> at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1151)
> at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:592)
> at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:393)
> at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:319)
> at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:110)
> at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:309)
> at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1907)
> at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1528)
> at org.apache.ignite.internal.managers.communication.GridIoManager.access$5300(GridIoManager.java:243)
> at org.apache.ignite.internal.managers.communication.GridIoManager$9.execute(GridIoManager.java:1421)
> at org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:55)
> at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:637)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
> at java.lang.Thread.run(Thread.java:750) {code}
> It looks like we have an incorrect initialization problem here.
> For example, at startup the primary has the following counter:
> {code:java}
> [lwm=10006, missed=[10007 - 10008, 10017 - 10020, 10031 - 10033, 10039 - 10042, 10055], hwm=10059, reserved=10006]{code}
> but once updates start, we get an exception:
> {code:java}
> LWM after reserved: lwm=10016, reserved=10008, cntr=Counter [lwm=10016, missed=[10017 - 10020, 10031 - 10033, 10039 - 10042, 10055], hwm=10059, reserved=10009]{code}
> This happens because the first gap was closed: once {{10007 - 10008}} was filled, {{lwm}} moved from {{10006}} to {{10016}}.
> The main problem here is that we're trying to reuse already used counters, so {{reserved}} should be set to {{hwm}} at initialization.
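The scenario described above can be reproduced with a small standalone model. This is only a sketch under stated assumptions: CounterSketch, startupState and the reserveFromHwm flag are hypothetical names made up for illustration and are not the actual PartitionUpdateCounterTrackingImpl code. It assumes that lwm advances over the contiguous applied prefix and that reserve() hands out reserved + 1.

```java
import java.util.TreeSet;

// Hypothetical, simplified model of a partition update counter (not Ignite code).
class CounterSketch {
    long lwm;      // every counter <= lwm has been applied
    long hwm;      // highest applied counter
    long reserved; // last counter handed out to a transaction
    final TreeSet<Long> missed = new TreeSet<>(); // not yet applied counters in (lwm, hwm)

    CounterSketch(long lwm, long hwm, long[][] gaps, boolean reserveFromHwm) {
        this.lwm = lwm;
        this.hwm = hwm;
        for (long[] g : gaps)
            for (long c = g[0]; c <= g[1]; c++)
                missed.add(c);
        // Assumption: the reported behavior corresponds to starting from lwm,
        // the proposed behavior to starting from hwm.
        reserved = reserveFromHwm ? hwm : lwm;
    }

    // Startup state taken from the issue description.
    static CounterSketch startupState(boolean reserveFromHwm) {
        long[][] gaps = {{10007, 10008}, {10017, 10020}, {10031, 10033}, {10039, 10042}, {10055, 10055}};
        return new CounterSketch(10006, 10059, gaps, reserveFromHwm);
    }

    long reserve() {
        if (lwm > reserved)
            throw new IllegalStateException("LWM after reserved: lwm=" + lwm + ", reserved=" + reserved);
        return ++reserved;
    }

    void apply(long cntr) {
        // Counters skipped between the old hwm and cntr become new gaps.
        for (long c = hwm + 1; c < cntr; c++)
            missed.add(c);
        missed.remove(cntr);
        hwm = Math.max(hwm, cntr);
        // Advance lwm while the next counter is already applied.
        while (lwm < hwm && !missed.contains(lwm + 1))
            lwm++;
    }

    public static void main(String[] args) {
        CounterSketch buggy = startupState(false);
        buggy.apply(buggy.reserve()); // reuses 10007
        buggy.apply(buggy.reserve()); // reuses 10008, closes the first gap
        try {
            buggy.reserve();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // LWM after reserved: lwm=10016, reserved=10008
        }

        CounterSketch fixed = startupState(true);
        System.out.println(fixed.reserve()); // 10060
    }
}
```

With reserved starting at lwm, the first two reservations reuse 10007 and 10008, lwm jumps to 10016, and the third reservation fails the check. With reserved starting at hwm, new reservations begin at 10060 and never land inside the already used range.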
--
This message was sent by Atlassian Jira
(v8.20.10#820010)