Posted to issues@ignite.apache.org by "Aleksey Plekhanov (Jira)" <ji...@apache.org> on 2023/04/24 14:54:00 UTC

[jira] [Updated] (IGNITE-18935) Late stopping of TTL workers during deactivation leads to corrupted PDS

     [ https://issues.apache.org/jira/browse/IGNITE-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Plekhanov updated IGNITE-18935:
---------------------------------------
    Release Note: Fixed PDS corruption caused by cluster deactivation while entries are expiring

> Late stopping of TTL workers during deactivation leads to corrupted PDS
> -----------------------------------------------------------------------
>
>                 Key: IGNITE-18935
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18935
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.14
>            Reporter: Ivan Daschinsky
>            Assignee: Ivan Daschinsky
>            Priority: Major
>              Labels: ise
>             Fix For: 2.15
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Steps to reproduce
> 1. Reduce the WAL history size and the WAL segment size to 16 MB and 8 MB respectively, and set the checkpoint frequency to 10000.
> 2. Apply a heavy load that writes many entries with a TTL of 5000 and with eager TTL enabled.
> 3. Deactivate the cluster, stop the grid, and restart it, making sure that entry expiration is still in progress during the restart.
> {code}
> [15:11:58,022][SEVERE][ttl-cleanup-worker-#52%None%][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is corrupted [groupId=-459905951, pageIds=[], msg=Runtime failure on bounds: [lower=null, upper=PendingRow []]]]]
> class org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is corrupted [groupId=-459905951, pageIds=[], msg=Runtime failure on bounds: [lower=null, upper=PendingRow []]]
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.corruptedTreeException(BPlusTree.java:6434)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1294)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1249)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1237)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1232)
> 	at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpiredInternal(GridCacheOffheapManager.java:3061)
> 	at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpired(GridCacheOffheapManager.java:3010)
> 	at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.expire(GridCacheOffheapManager.java:1213)
> 	at org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:246)
> 	at org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.lambda$body$0(GridCacheSharedTtlCleanupManager.java:199)
> 	at java.util.concurrent.ConcurrentHashMap.computeIfPresent(ConcurrentHashMap.java:1769)
> 	at org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:198)
> 	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
> 	at java.lang.Thread.run(Thread.java:750)
> Caused by: org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException: org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException: java.lang.IllegalStateException: Item not found: 24
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findLowerUnbounded(BPlusTree.java:1216)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1276)
> 	... 12 more
> Caused by: org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException: java.lang.IllegalStateException: Item not found: 24
> 	at org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.doInitFromLink(CacheDataRowAdapter.java:345)
> 	at org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:165)
> 	at org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:136)
> 	at org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:123)
> 	at org.apache.ignite.internal.processors.cache.tree.PendingRow.initKey(PendingRow.java:73)
> 	at org.apache.ignite.internal.processors.cache.tree.PendingEntriesTree.getRow(PendingEntriesTree.java:128)
> 	at org.apache.ignite.internal.processors.cache.tree.PendingEntriesTree.getRow(PendingEntriesTree.java:32)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.fillFromBuffer0(BPlusTree.java:6115)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$AbstractForwardCursor.fillFromBuffer(BPlusTree.java:5864)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$AbstractForwardCursor.init(BPlusTree.java:5790)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findLowerUnbounded(BPlusTree.java:1205)
> 	... 13 more
> Caused by: java.lang.IllegalStateException: Item not found: 24
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.findIndirectItemIndex(AbstractDataPageIO.java:488)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.getDataOffset(AbstractDataPageIO.java:596)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.readPayload(AbstractDataPageIO.java:638)
> 	at org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.readIncomplete(CacheDataRowAdapter.java:380)
> 	at org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.doInitFromLink(CacheDataRowAdapter.java:316)
> 	... 23 more
> {code}
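
The ordering problem named in the issue title can be illustrated with a small, self-contained sketch. It is not Ignite code: `Store`, `CleanupWorker`, and `deactivate` are hypothetical names invented for this example, and the "corruption" is modeled as a plain `IllegalStateException` thrown when a background worker touches storage that has already been closed. The sketch also widens the race window on purpose so that the outcome is deterministic; in a real cluster the failure would depend on timing.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

/** Illustrative only: none of these types are Ignite classes. */
class TtlWorkerShutdownSketch {
    /** Hypothetical stand-in for persistent storage with a pending-expiration index. */
    static final class Store {
        private final ConcurrentSkipListMap<Long, String> pending = new ConcurrentSkipListMap<>();
        private volatile boolean open = true;

        void put(long expireTime, String key) {
            pending.put(expireTime, key);
        }

        /** Removes the oldest entry expired at {@code now}; fails if the store is already closed. */
        boolean expireOne(long now) {
            if (!open)
                throw new IllegalStateException("store accessed after close");
            Map.Entry<Long, String> first = pending.firstEntry();
            if (first == null || first.getKey() > now)
                return false;
            pending.remove(first.getKey());
            return true;
        }

        void close() {
            open = false;
        }
    }

    /** Hypothetical background worker that purges expired entries until cancelled. */
    static final class CleanupWorker extends Thread {
        private final Store store;
        private volatile boolean cancelled;
        private volatile Throwable failure;

        CleanupWorker(Store store) {
            this.store = store;
        }

        @Override public void run() {
            try {
                while (!cancelled) {
                    if (!store.expireOne(Long.MAX_VALUE))
                        Thread.sleep(1); // Nothing to purge right now; back off briefly.
                }
            }
            catch (InterruptedException ignored) {
                // Cancelled while idle: exit quietly.
            }
            catch (RuntimeException e) {
                failure = e; // The worker touched storage that was no longer valid.
            }
        }

        void cancel() {
            cancelled = true;
            interrupt();
        }

        Throwable failure() {
            return failure;
        }
    }

    /** Runs one simulated deactivation and returns the worker's failure, or null if it stopped cleanly. */
    static Throwable deactivate(boolean stopWorkerFirst) {
        Store store = new Store();
        for (long i = 0; i < 10_000; i++)
            store.put(i, "key-" + i);

        CleanupWorker worker = new CleanupWorker(store);
        worker.start();

        try {
            if (stopWorkerFirst) {
                // Safe order: the worker is fully stopped before storage goes away.
                worker.cancel();
                worker.join();
                store.close();
            }
            else {
                // Late stop: storage is closed while the worker is still running.
                // The worker fails on its next pass and terminates on its own,
                // so the stop request that follows arrives too late.
                store.close();
                worker.join();
                worker.cancel();
            }
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        return worker.failure();
    }

    public static void main(String[] args) {
        System.out.println("stop worker first: failure = " + deactivate(true));
        System.out.println("close store first: failure = " + deactivate(false));
    }
}
```

With the worker stopped first, `deactivate(true)` should return `null`; with the store closed first, `deactivate(false)` should return the `IllegalStateException` raised inside the worker. The general point of the sketch is that a background cleanup thread has to be cancelled and joined before the structures it reads are released.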


