Posted to issues@ignite.apache.org by "Andrey Gura (JIRA)" <ji...@apache.org> on 2018/05/24 13:19:00 UTC

[jira] [Assigned] (IGNITE-8563) WAL file archiver does not propagate file archiving error to error handler

     [ https://issues.apache.org/jira/browse/IGNITE-8563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Gura reassigned IGNITE-8563:
-----------------------------------

    Assignee: Andrey Gura

> WAL file archiver does not propagate file archiving error to error handler
> --------------------------------------------------------------------------
>
>                 Key: IGNITE-8563
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8563
>             Project: Ignite
>          Issue Type: Improvement
>    Affects Versions: 2.5
>            Reporter: Alexey Goncharuk
>            Assignee: Andrey Gura
>            Priority: Major
>             Fix For: 2.6
>
>
> I observed this error when the disk holding the WAL archive ran out of space:
> {code}
> ...
>         at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:464)
>         at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.commitDhtLocalAsync(GridDhtTxLocal.java:517)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:940)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:819)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:775)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$200(IgniteTxHandler.java:97)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:189)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:187)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1054)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
>         at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
>         at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
>         at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125)
>         at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091)
>         at org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:511)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.ignite.IgniteException: Runtime failure on row: Row@1ec13b23[ key: 4458000681143704309, val: <CUT>
>         at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doPut(BPlusTree.java:2119)
>         at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putx(BPlusTree.java:2066)
>         at org.apache.ignite.internal.processors.query.h2.database.H2TreeIndex.putx(H2TreeIndex.java:247)
>         at org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.addToIndex(GridH2Table.java:548)
>         at org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.update(GridH2Table.java:480)
>         at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.store(IgniteH2Indexing.java:659)
>         at org.apache.ignite.internal.processors.query.GridQueryProcessor.store(GridQueryProcessor.java:1866)
>         at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.store(GridCacheQueryManager.java:403)
>         at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.finishUpdate(IgniteCacheOffheapManagerImpl.java:1393)
>         at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1257)
>         at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:1511)
>         at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:352)
>         at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:3602)
>         at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:3578)
>         at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerSet(GridCacheMapEntry.java:1040)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:652)
>         ... 22 common frames omitted
> Caused by: org.apache.ignite.IgniteException: Failed to archive WAL segment [srcFile=/gridgain/ssd/storage/wal/consistentID_47500/0000000000000000.wal, dstFile=/gridgain/ssd/storage/wal_archive/consistentID_47500/0000000000004290.wal.tmp]
>         at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.beforeReleaseWrite(PageMemoryImpl.java:1615)
>         at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlockPage(PageMemoryImpl.java:1478)
>         at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:449)
>         at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:443)
>         at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writeUnlock(PageHandler.java:377)
>         at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writePage(PageHandler.java:352)
>         at org.apache.ignite.internal.processors.cache.persistence.DataStructure.write(DataStructure.java:274)
>         at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.access$11100(BPlusTree.java:83)
>         at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Put.tryInsert(BPlusTree.java:2954)
>         at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Put.access$7600(BPlusTree.java:2642)
>         at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2367)
>         at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2348)
>         at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2348)
>         at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2348)
>         at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doPut(BPlusTree.java:2086)
>         ... 37 common frames omitted
> Caused by: org.apache.ignite.IgniteCheckedException: Failed to archive WAL segment [srcFile=/gridgain/ssd/storage/wal/consistentID_47500/0000000000000000.wal, dstFile=/gridgain/ssd/storage/wal_archive/consistentID_47500/0000000000004290.wal.tmp]
>         at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver.archiveSegment(FileWriteAheadLogManager.java:1777)
>         at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver.run(FileWriteAheadLogManager.java:1582)
> Caused by: java.nio.file.FileSystemException: /gridgain/ssd/storage/wal/consistentID_47500/0000000000000000.wal -> /gridgain/ssd/storage/wal_archive/consistentID_47500/0000000000004290.wal.tmp: No space left on device
>         at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
>         at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>         at sun.nio.fs.UnixCopyFile.copyFile(UnixCopyFile.java:253)
>         at sun.nio.fs.UnixCopyFile.copy(UnixCopyFile.java:581)
>         at sun.nio.fs.UnixFileSystemProvider.copy(UnixFileSystemProvider.java:253)
>         at java.nio.file.Files.copy(Files.java:1274)
>         at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver.archiveSegment(FileWriteAheadLogManager.java:1764)
>         ... 1 common frames omitted
> {code}
> This resulted in a TX heuristic error, which in turn left the transaction hanging. The reason is that the WAL archiver wraps the attempt to archive the file in a try-catch block and later re-throws the caught exception in the {{log(...)}} method (see usages of the cleanErr field).
> I think instead we should get rid of the try-catch block altogether and route the exception directly to the error handler.
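
For illustration only, the difference between the two approaches described above might be sketched as follows. This is not Ignite code: the class ArchiverSketch and the types FailureHandler, SegmentCopier, DeferredArchiver and DirectArchiver are hypothetical names made up for this example, and the real FileArchiver is considerably more involved. The sketch only assumes that archiving a segment can fail with an I/O error such as "No space left on device".

```java
import java.io.IOException;

/** Hypothetical sketch; none of these types are Ignite APIs. */
class ArchiverSketch {
    /** Stand-in for a node-level error handler (assumed name). */
    interface FailureHandler {
        void onFailure(Throwable err);
    }

    /** Stand-in for the step that copies a WAL segment into the archive. */
    interface SegmentCopier {
        void copy(long segmentIdx) throws IOException;
    }

    /** Behavior described in the issue: the error is stored and re-thrown later from log(). */
    static class DeferredArchiver {
        private final SegmentCopier copier;

        /** Remembered archiving error, named after the field mentioned in the issue. */
        private volatile Throwable cleanErr;

        DeferredArchiver(SegmentCopier copier) {
            this.copier = copier;
        }

        void archive(long segmentIdx) {
            try {
                copier.copy(segmentIdx);
            }
            catch (IOException e) {
                // The failure is swallowed here and only remembered.
                cleanErr = e;
            }
        }

        /** An unrelated caller (for example a transaction thread) hits the stored error. */
        void log() {
            Throwable err = cleanErr;

            if (err != null)
                throw new IllegalStateException("Failed to archive WAL segment", err);
        }
    }

    /** Proposed direction: hand the failure to the error handler as soon as it happens. */
    static class DirectArchiver {
        private final SegmentCopier copier;
        private final FailureHandler hnd;

        DirectArchiver(SegmentCopier copier, FailureHandler hnd) {
            this.copier = copier;
            this.hnd = hnd;
        }

        void archive(long segmentIdx) {
            try {
                copier.copy(segmentIdx);
            }
            catch (IOException e) {
                // Single boundary catch: nothing is stored for later re-throwing.
                hnd.onFailure(e);
            }
        }
    }
}
```

In the deferred variant the exception surfaces in whichever thread calls log() next, which matches the stack trace above, where a transaction commit fails on an archiver error. In the direct variant the handler sees the failure at the point where it occurs. Java still needs one catch at the worker boundary for the checked IOException, so "removing the try-catch" here means removing the store-and-rethrow logic, not every catch clause.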



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)