You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by Ran Tavory <ra...@gmail.com> on 2010/04/14 13:04:28 UTC

server crash - how to invertigate

I'm running a 0.6.0 cluster with four nodes and one of them just crashed.

The logs all seem normal and I haven't seen anything special in the jmx
counters before the crash.

I have one client writing and reading using 10 threads and using 3 different
column families: KvAds, KvImpressions and KvUsers

the client had got a few UnavailableException, TimedOutException and
TTransportException but was able to complete the read/write operation by
failing over to another available host. I can't tell if the exceptions were
from the crashed host or from other hosts in the ring.

Any hints how to investigate this are greatly appreciated. So far I'm
lost...

Here's a snippet from the log just before it went down. It doesn't seem to
have anything special in it, everything is INFO level.

The only thing that seems a bit strange is that last message: Compacting [].
This message usually comes with things inside the [], such as Compacting
[org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/system/LocationInfo-1-Data.db'),...]
but this time it was just empty.
However, this is not the only place in the log were I see an empty
Compacting []. There are other places and they didn't end up in a crash, so
I don't know if it's related.

here's the log:
 INFO [ROW-MUTATION-STAGE:6] 2010-04-14 05:55:07,014 ColumnFamilyStore.java
(line 357) KvImpressions has reached its threshold; switching in a fresh
Memtable at
CommitLogContext(file='/outbrain/cassdata/commitlog/CommitLog-1271238432773.log',
position=68606651)
 INFO [ROW-MUTATION-STAGE:6] 2010-04-14 05:55:07,015 ColumnFamilyStore.java
(line 609) Enqueuing flush of Memtable(KvImpressions)@258729366
 INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:55:07,015 Memtable.java (line 148)
Writing Memtable(KvImpressions)@258729366
 INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:55:10,130 Memtable.java (line 162)
Completed flushing
/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-24-Data.db
 INFO [COMMIT-LOG-WRITER] 2010-04-14 05:55:10,154 CommitLog.java (line 407)
Discarding obsolete commit
log:CommitLogSegment(/outbrain/cassdata/commitlog/CommitLog-1271238049425.log)
 INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,415
SSTableDeletingReference.java (line 104) Deleted
/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-16-Data.db
 INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,440
SSTableDeletingReference.java (line 104) Deleted
/outbrain/cassdata/data/outbrain_kvdb/KvAds-8-Data.db
 INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,454
SSTableDeletingReference.java (line 104) Deleted
/outbrain/cassdata/data/outbrain_kvdb/KvAds-10-Data.db
 INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,526
SSTableDeletingReference.java (line 104) Deleted
/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-5-Data.db
 INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,585
SSTableDeletingReference.java (line 104) Deleted
/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-11-Data.db
 INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,602
SSTableDeletingReference.java (line 104) Deleted
/outbrain/cassdata/data/outbrain_kvdb/KvAds-11-Data.db
 INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,614
SSTableDeletingReference.java (line 104) Deleted
/outbrain/cassdata/data/outbrain_kvdb/KvAds-9-Data.db
 INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,682
SSTableDeletingReference.java (line 104) Deleted
/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-21-Data.db
 INFO [COMMIT-LOG-WRITER] 2010-04-14 05:55:52,254 CommitLogSegment.java
(line 50) Creating new commitlog segment
/outbrain/cassdata/commitlog/CommitLog-1271238952254.log
 INFO [ROW-MUTATION-STAGE:16] 2010-04-14 05:56:25,347 ColumnFamilyStore.java
(line 357) KvImpressions has reached its threshold; switching in a fresh
Memtable at
CommitLogContext(file='/outbrain/cassdata/commitlog/CommitLog-1271238952254.log',
position=47568158)
 INFO [ROW-MUTATION-STAGE:16] 2010-04-14 05:56:25,348 ColumnFamilyStore.java
(line 609) Enqueuing flush of Memtable(KvImpressions)@1955587316
 INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:56:25,348 Memtable.java (line 148)
Writing Memtable(KvImpressions)@1955587316
 INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:56:30,572 Memtable.java (line 162)
Completed flushing
/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-25-Data.db
 INFO [COMMIT-LOG-WRITER] 2010-04-14 05:57:26,790 CommitLogSegment.java
(line 50) Creating new commitlog segment
/outbrain/cassdata/commitlog/CommitLog-1271239046790.log
 INFO [ROW-MUTATION-STAGE:7] 2010-04-14 05:57:59,513 ColumnFamilyStore.java
(line 357) KvImpressions has reached its threshold; switching in a fresh
Memtable at
CommitLogContext(file='/outbrain/cassdata/commitlog/CommitLog-1271239046790.log',
position=24265615)
 INFO [ROW-MUTATION-STAGE:7] 2010-04-14 05:57:59,513 ColumnFamilyStore.java
(line 609) Enqueuing flush of Memtable(KvImpressions)@1617250066
 INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:57:59,513 Memtable.java (line 148)
Writing Memtable(KvImpressions)@1617250066
 INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:58:04,604 Memtable.java (line 162)
Completed flushing
/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-26-Data.db
 INFO [COMPACTION-POOL:1] 2010-04-14 05:58:04,605 CompactionManager.java
(line 246) Compacting
[org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-23-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-24-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-25-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-26-Data.db')]
 INFO [COMMIT-LOG-WRITER] 2010-04-14 05:58:04,652 CommitLog.java (line 407)
Discarding obsolete commit
log:CommitLogSegment(/outbrain/cassdata/commitlog/CommitLog-1271238952254.log)
 INFO [COMPACTION-POOL:1] 2010-04-14 05:58:25,975 CompactionManager.java
(line 326) Compacted to
/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-27-Data.db.
 458669533/442361674 bytes for 848169 keys.  Time: 21369ms.
 INFO [COMPACTION-POOL:1] 2010-04-14 05:59:14,913 CompactionManager.java
(line 246) Compacting []


Thanks

Re: server crash - how to invertigate

Posted by Jonathan Ellis <jb...@gmail.com>.
There's a few things it could be:

Out of memory: usually it can log the exception before dying but not
always.  there will be a java_$pid.hprof file with the heap dumped.

JVM crash: there will be hs_err$pid.log file

OS bug or hardware problem: sometimes your OS will log something

-Jonathan

On Wed, Apr 14, 2010 at 6:04 AM, Ran Tavory <ra...@gmail.com> wrote:
> I'm running a 0.6.0 cluster with four nodes and one of them just crashed.
> The logs all seem normal and I haven't seen anything special in the jmx
> counters before the crash.
> I have one client writing and reading using 10 threads and using 3 different
> column families: KvAds, KvImpressions and KvUsers
> the client had got a few UnavailableException, TimedOutException and
> TTransportException but was able to complete the read/write operation by
> failing over to another available host. I can't tell if the exceptions were
> from the crashed host or from other hosts in the ring.
> Any hints how to investigate this are greatly appreciated. So far I'm
> lost...
> Here's a snippet from the log just before it went down. It doesn't seem to
> have anything special in it, everything is INFO level.
> The only thing that seems a bit strange is that last message: Compacting [].
> This message usually comes with things inside the [], such as Compacting
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/system/LocationInfo-1-Data.db'),...]
> but this time it was just empty.
> However, this is not the only place in the log were I see an empty
> Compacting []. There are other places and they didn't end up in a crash, so
> I don't know if it's related.
> here's the log:
>  INFO [ROW-MUTATION-STAGE:6] 2010-04-14 05:55:07,014 ColumnFamilyStore.java
> (line 357) KvImpressions has reached its threshold; switching in a fresh
> Memtable at
> CommitLogContext(file='/outbrain/cassdata/commitlog/CommitLog-1271238432773.log',
> position=68606651)
>  INFO [ROW-MUTATION-STAGE:6] 2010-04-14 05:55:07,015 ColumnFamilyStore.java
> (line 609) Enqueuing flush of Memtable(KvImpressions)@258729366
>  INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:55:07,015 Memtable.java (line 148)
> Writing Memtable(KvImpressions)@258729366
>  INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:55:10,130 Memtable.java (line 162)
> Completed flushing
> /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-24-Data.db
>  INFO [COMMIT-LOG-WRITER] 2010-04-14 05:55:10,154 CommitLog.java (line 407)
> Discarding obsolete commit
> log:CommitLogSegment(/outbrain/cassdata/commitlog/CommitLog-1271238049425.log)
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,415
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-16-Data.db
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,440
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvAds-8-Data.db
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,454
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvAds-10-Data.db
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,526
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-5-Data.db
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,585
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-11-Data.db
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,602
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvAds-11-Data.db
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,614
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvAds-9-Data.db
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,682
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-21-Data.db
>  INFO [COMMIT-LOG-WRITER] 2010-04-14 05:55:52,254 CommitLogSegment.java
> (line 50) Creating new commitlog segment
> /outbrain/cassdata/commitlog/CommitLog-1271238952254.log
>  INFO [ROW-MUTATION-STAGE:16] 2010-04-14 05:56:25,347 ColumnFamilyStore.java
> (line 357) KvImpressions has reached its threshold; switching in a fresh
> Memtable at
> CommitLogContext(file='/outbrain/cassdata/commitlog/CommitLog-1271238952254.log',
> position=47568158)
>  INFO [ROW-MUTATION-STAGE:16] 2010-04-14 05:56:25,348 ColumnFamilyStore.java
> (line 609) Enqueuing flush of Memtable(KvImpressions)@1955587316
>  INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:56:25,348 Memtable.java (line 148)
> Writing Memtable(KvImpressions)@1955587316
>  INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:56:30,572 Memtable.java (line 162)
> Completed flushing
> /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-25-Data.db
>  INFO [COMMIT-LOG-WRITER] 2010-04-14 05:57:26,790 CommitLogSegment.java
> (line 50) Creating new commitlog segment
> /outbrain/cassdata/commitlog/CommitLog-1271239046790.log
>  INFO [ROW-MUTATION-STAGE:7] 2010-04-14 05:57:59,513 ColumnFamilyStore.java
> (line 357) KvImpressions has reached its threshold; switching in a fresh
> Memtable at
> CommitLogContext(file='/outbrain/cassdata/commitlog/CommitLog-1271239046790.log',
> position=24265615)
>  INFO [ROW-MUTATION-STAGE:7] 2010-04-14 05:57:59,513 ColumnFamilyStore.java
> (line 609) Enqueuing flush of Memtable(KvImpressions)@1617250066
>  INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:57:59,513 Memtable.java (line 148)
> Writing Memtable(KvImpressions)@1617250066
>  INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:58:04,604 Memtable.java (line 162)
> Completed flushing
> /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-26-Data.db
>  INFO [COMPACTION-POOL:1] 2010-04-14 05:58:04,605 CompactionManager.java
> (line 246) Compacting
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-23-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-24-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-25-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/outbrain_kvdb/KvImpressions-26-Data.db')]
>  INFO [COMMIT-LOG-WRITER] 2010-04-14 05:58:04,652 CommitLog.java (line 407)
> Discarding obsolete commit
> log:CommitLogSegment(/outbrain/cassdata/commitlog/CommitLog-1271238952254.log)
>  INFO [COMPACTION-POOL:1] 2010-04-14 05:58:25,975 CompactionManager.java
> (line 326) Compacted to
> /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-27-Data.db.
>  458669533/442361674 bytes for 848169 keys.  Time: 21369ms.
>  INFO [COMPACTION-POOL:1] 2010-04-14 05:59:14,913 CompactionManager.java
> (line 246) Compacting []
>
> Thanks