Posted to user@ignite.apache.org by Zhenya Stanilovsky <ar...@mail.ru> on 2022/06/07 07:19:04 UTC

Re[2]: gridgain ultimate edition snapshot error

Hi, you need to raise the open-file limits (ulimits) [1].
 
[1]  https://www.gridgain.com/docs/latest/perf-troubleshooting-guide/general-perf-tips#ulimits
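
To see why the limit matters here, a rough sketch follows. It compares the shell's open-file limits with an estimate of how many partition files one node could hold. The figure of 15 caches is from the message below. The 1024 partitions per cache is Ignite's default and is an assumption, since the cache configuration was not posted. `estimate_partition_files` is a hypothetical helper, not part of any GridGain tool.

```shell
#!/usr/bin/env bash
# Upper bound on partition files for one node that owns every partition.
# Ignores index files, WAL segments, and sockets, so the real need is higher.
estimate_partition_files() {
    local caches=$1 partitions=$2
    echo $(( caches * partitions ))
}

# 15 caches is from the thread; 1024 partitions per cache is assumed
# (Ignite's default), because the cache configuration was not posted.
needed=$(estimate_partition_files 15 1024)

soft=$(ulimit -Sn)   # current soft limit on open files
hard=$(ulimit -Hn)   # ceiling the soft limit can be raised to

echo "estimated partition files: $needed"
echo "open-file limit: soft=$soft hard=$hard"

# "unlimited" is not a number, so check for it before comparing.
if [ "$soft" != "unlimited" ] && [ "$soft" -lt "$needed" ]; then
    echo "soft limit is below the estimate; raise it before starting the node"
fi
```

Run it in the same shell, and as the same user, that starts the node, since limits are per process. If the soft limit is too low, `ulimit -n <value>` raises it for that session, up to the hard limit.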
  
>Tuesday, 7 June 2022, 8:35 +03:00 from Surinder Mehra <re...@gmail.com>:
> 
>Hi,
>I was going through this Stack Overflow post, which is about the same issue. The fact that the snapshot works in Apache Ignite but not in the Ultimate Edition suggests there is a bug in the latter. Could you please confirm? We have around 15 caches with 2 backups. I changed the backups to zero but still see this issue. Could you please advise further?
>
>https://stackoverflow.com/questions/72041292/is-there-a-fix-for-too-many-open-files-error-in-gridgain-cluster  
>On Mon, Jun 6, 2022 at 9:13 PM Surinder Mehra < rednirus@gmail.com > wrote:
>>Hi,
>>I was experimenting with the GG Ultimate Edition to take snapshots and encountered the error below, after which the cluster stops. Please note that this works in the free Ignite version, where we don't see the "too many open files" error. Is this a bug, or are we missing some configuration?
>> 
>>version:  gridgain-8.8.19
>>
>>./bin/snapshot-utility.sh snapshot -type=full
>>
>>[21:03:51,693][SEVERE][db-snapshot-executor-stripe-0-#35][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin]
>>class org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin
>>at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:519)
>>at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:405)
>>at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageReadWriteManagerImpl.read(PageReadWriteManagerImpl.java:68)
>>at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:577)
>>at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:911)
>>at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:730)
>>at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:711)
>>at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.completeSavingAllocatedIndex(SnapshotCreateFuture.java:1304)
>>at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.completeSnapshotCreation(SnapshotCreateFuture.java:1486)
>>at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.doFinalStage(SnapshotCreateFuture.java:1171)
>>at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotOperationFuture.completeStagesLocally(SnapshotOperationFuture.java:2352)
>>at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotOperationFuture$10.run(SnapshotOperationFuture.java:2286)
>>at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:567)
>>at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>>at java.base/java.lang.Thread.run(Thread.java:829)
>>Caused by: java.nio.file.FileSystemException: /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin: Too many open files
>>at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
>>at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
>>at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
>>at java.base/sun.nio.fs.UnixFileSystemProvider.newAsynchronousFileChannel(UnixFileSystemProvider.java:201)
>>at java.base/java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:253)
>>at java.base/java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:311)
>>at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIO.<init>(AsyncFileIO.java:65)
>>at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory.create(AsyncFileIOFactory.java:43)
>>at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:491)
>>... 14 more
>>[21:03:51,695][SEVERE][db-snapshot-executor-stripe-0-#35][FailureProcessor] No deadlocked threads detected.
>>[21:03:51,767][SEVERE][db-snapshot-executor-stripe-0-#35][FailureProcessor] Thread dump at 2022/06/06 21:03:51 IST
>>Thread [name="main", id=1, state=WAITING, blockCnt=4, waitCnt=4169]
>>    Lock [object=java.util.concurrent.CountDownLatch$Sync@5b60e356, ownerName=null, ownerId=-1]
>>        at  java.base@11.0.14.1/jdk.internal.misc.Unsafe.park(Native Method)
>>        at  java.base@11.0.14.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
>>        at  java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
>>        at  java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1039)
>>        at  java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1345)
>>        at  java.base@11.0.14.1/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:232)
>>        at app//o.a.i.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:391)
>>
>>
>>Config:
>>
>>  <bean class="org.apache.ignite.configuration.IgniteConfiguration">
>>
>>    <property name="peerClassLoadingEnabled" value="true"/>
>>    <property name="deploymentMode" value="CONTINUOUS"/>
>>    <property name="dataStorageConfiguration">
>>      <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
>>        <property name="defaultDataRegionConfiguration">
>>          <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
>>            <property name="persistenceEnabled" value="true"/>
>>          </bean>
>>        </property>
>>      </bean>
>>    </property>
>>    <property name="pluginConfigurations">
>>      <bean class="org.gridgain.grid.configuration.GridGainConfiguration">
>>        <property name="snapshotConfiguration">
>>          <bean class="org.gridgain.grid.configuration.SnapshotConfiguration">
>>            <property name="snapshotsPath" value="/home/ignitesnapshots/"/>
>>          </bean>
>>        </property>
>>      </bean>
>>    </property>
>>  </bean>
>></beans>