Posted to commits@cassandra.apache.org by "Michael Shuler (JIRA)" <ji...@apache.org> on 2014/07/29 23:45:40 UTC

[jira] [Commented] (CASSANDRA-6829) nodes sporadically shutting down

    [ https://issues.apache.org/jira/browse/CASSANDRA-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078431#comment-14078431 ] 

Michael Shuler commented on CASSANDRA-6829:
-------------------------------------------

[~dmeyer] could you shed a little light on this from your experience?

> nodes sporadically shutting down
> --------------------------------
>
>                 Key: CASSANDRA-6829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Windows Azure VMs.
> The VMs run SUSE Linux Enterprise. I striped two logical volumes for each VM, one for data and one for commitlog, and formatted them as XFS.
> Oracle Java 1.7_45
> Datastax Enterprise 4.0 (Cassandra version 2.0.5.22)
>            Reporter: Oded Peer
>
> I deployed a Datastax 4.0 Cassandra cluster in Windows Azure and started load tests. After a while some of the nodes announce shutdown and stop responding to client requests.
> The error preceding the shutdown is "FSWriteError in /mnt/dsedata/lib/cassandra/poc/event_log/poc-event_log-tmp-jb-581-Data.db"  "Caused by: java.io.IOException: Input/output error".
> The storage I'm using in my VMs is Azure Blob storage. The VMs run SUSE Linux Enterprise. I striped two logical volumes for each VM, one for data and one for commitlog, and formatted them as XFS.
> I am using Oracle Java 1.7_45
> Restarting the Cassandra process resolves the problem for a short while (minutes); afterwards the problem occurs again.
> I noticed that it happens only with tmp files of a specific table. See the errors from 3 random nodes:
> (1) ERROR [CompactionExecutor:48] 2014-03-09 11:38:45,188 CassandraDaemon.java (line 192) Exception in thread Thread[CompactionExecutor:48,1,main]
> FSWriteError in /mnt/dsedata/lib/cassandra/poc/event_log/poc-event_log-tmp-jb-409-Data.db
> (2) ERROR [CompactionExecutor:37] 2014-03-10 10:04:30,828 CassandraDaemon.java (line 196) Exception in thread Thread[CompactionExecutor:37,1,main]
> FSWriteError in /mnt/dsedata/lib/cassandra/poc/event_log/poc-event_log-tmp-jb-946-Data.db
> (3) ERROR [CompactionExecutor:48] 2014-03-10 10:23:39,248 CassandraDaemon.java (line 196) Exception in thread Thread[CompactionExecutor:48,1,main]
> FSWriteError in /mnt/dsedata/lib/cassandra/poc/event_log/poc-event_log-tmp-jb-874-Data.db
> The table is a wide-row table created as:
> CREATE TABLE event_log (
>   time_slice bigint,
>   distribution_key int,
>   event_id text,
>   ... 300 columns ...
>   PRIMARY KEY ((time_slice, distribution_key), event_id)
> ) WITH compaction = {'class': 'SizeTieredCompactionStrategy'} AND
>   compression = {'sstable_compression': 'LZ4Compressor'};
> CREATE INDEX EVENT_LOG_2IX ON event_log (event_id);
> 'time_slice' represents a 5-minute time period formatted as yyyyMMddHHmm, where 'mm' is between 00 and 55 in increments of 5.
> The data files under the 'data' directory grew very large very quickly after the test started.
> For example:
> 1.5G Mar 10 10:50 /mnt/dsedata/lib/cassandra/poc/event_log/poc-event_log-jb-968-Data.db
> 3.0G Mar 10 11:41 /mnt/dsedata/lib/cassandra/poc/event_log/poc-event_log-tmp-jb-970-Data.db
> Full stack trace:
> ERROR [CompactionExecutor:37] 2014-03-10 10:04:30,828 CassandraDaemon.java (line 196) Exception in thread Thread[CompactionExecutor:37,1,main]
> FSWriteError in /mnt/dsedata/lib/cassandra/poc/event_log/poc-event_log-tmp-jb-946-Data.db
>         at org.apache.cassandra.io.compress.CompressedSequentialWriter.close(CompressedSequentialWriter.java:270)
>         at org.apache.cassandra.io.sstable.SSTableWriter.close(SSTableWriter.java:356)
>         at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:324)
>         at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:204)
>         at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>         at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>         at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>         at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:197)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Input/output error
>         at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>         at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>         at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>         at org.apache.cassandra.io.compress.CompressionMetadata$Writer.close(CompressionMetadata.java:366)
>         at org.apache.cassandra.io.compress.CompressedSequentialWriter.close(CompressedSequentialWriter.java:266)
>         ... 13 more
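Editor's note, not part of the original report: for readers reconstructing the schema above, here is a minimal sketch of how the reporter's yyyyMMddHHmm 5-minute 'time_slice' bucket could be computed. The function name and the use of Python are assumptions for illustration; the report does not show the client code.

```python
from datetime import datetime

def time_slice_bucket(ts: datetime) -> int:
    """Round a timestamp down to its 5-minute bucket and encode it as a
    yyyyMMddHHmm integer, matching the reporter's description of the
    'time_slice' partition key component (function name is hypothetical)."""
    minute = (ts.minute // 5) * 5          # floor to nearest 5-minute mark
    return int(ts.strftime("%Y%m%d%H") + f"{minute:02d}")

# Example: 2014-03-10 10:04:30 falls in the 10:00-10:05 bucket.
print(time_slice_bucket(datetime(2014, 3, 10, 10, 4, 30)))  # 201403101000
```

As a side note on the shutdown behavior itself: the node stopping after an FSWriteError is consistent with Cassandra's disk_failure_policy setting in cassandra.yaml, which in the 2.0 line defaults to stopping gossip and client transports on file-system errors, so the root cause here appears to be the underlying I/O error rather than Cassandra itself.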



--
This message was sent by Atlassian JIRA
(v6.2#6252)