Posted to user@cassandra.apache.org by K F <kf...@yahoo.com> on 2015/08/05 23:28:07 UTC

Seeing weird exceptions

Hi,
In two of our clusters running version 2.0.14 we are seeing the following exception:
15-08-05 16:11:32,723 [FlushWriter:268358] ERROR CassandraDaemon Exception in thread Thread[FlushWriter:268358,5,main]
FSReadError in /opt/cassandra/data/ks1/cf1/ks1-cf1-jb-231142-Index.db
        at org.apache.cassandra.io.util.MmappedSegmentedFile$Builder.createSegments(MmappedSegmentedFile.java:200)
        at org.apache.cassandra.io.util.MmappedSegmentedFile$Builder.complete(MmappedSegmentedFile.java:168)
        at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:345)
        at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:335)
        at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:401)
        at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:349)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Map failed
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:919)
        at org.apache.cassandra.io.util.MmappedSegmentedFile$Builder.createSegments(MmappedSegmentedFile.java:192)
        ... 9 more
Caused by: java.lang.OutOfMemoryError: Map failed
        at sun.nio.ch.FileChannelImpl.map0(Native Method)
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:916)
        ... 10 more


I understand that there are not enough system resources to map the file into memory, but we did not see this issue when we were running 2.0.9.
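For what it's worth, here is a minimal sketch (not our actual code; the file path and sizes are made up for illustration) of how FileChannel.map can produce this exact "IOException: Map failed" / "OutOfMemoryError: Map failed" chain once the process exhausts its memory-map allowance, e.g. vm.max_map_count on Linux, which is what MmappedSegmentedFile is doing segment by segment when it opens the index file:

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;

public class MapFailedDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical scratch file, only for the demonstration.
        try (RandomAccessFile raf = new RandomAccessFile("/tmp/map-failed-demo.bin", "rw")) {
            raf.setLength(1 << 20); // 1 MiB backing file
            FileChannel channel = raf.getChannel();
            List<MappedByteBuffer> segments = new ArrayList<>();
            try {
                // Every map() call adds one mapping against the per-process
                // limit (vm.max_map_count on Linux). When that limit is hit,
                // FileChannelImpl.map wraps the native OutOfMemoryError in
                // java.io.IOException: Map failed -- the same chain as in the
                // FlushWriter stack trace above.
                while (true) {
                    segments.add(channel.map(FileChannel.MapMode.READ_ONLY, 0, 4096));
                }
            } catch (Exception | OutOfMemoryError e) {
                System.out.println("map() failed after " + segments.size()
                        + " segments: " + e);
            }
        }
    }
}

If the Linux map count is what we are running into, the number of lines in /proc/<cassandra pid>/maps should be sitting near vm.max_map_count when the error fires; that is just my reading of the stack trace, though.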
Apart from this, I noticed that if any node goes down while repairs are running, the other nodes in the cluster that are running repairs become unstable: their SSTable counts keep climbing and we eventually end up in the situation above. This makes the cluster really unstable.
Is anyone else facing such issues in 2.0.14, where SSTable counts keep climbing and eventually result in the OOM above?
Regards,
Ken