Posted to commits@cassandra.apache.org by "Dominic Williams (JIRA)" <ji...@apache.org> on 2011/06/20 11:59:47 UTC

[jira] [Reopened] (CASSANDRA-2793) SSTable "Corrupt (negative) value length encountered" exception blocks compaction.

     [ https://issues.apache.org/jira/browse/CASSANDRA-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Williams reopened CASSANDRA-2793:
-----------------------------------------


Hi, the issue reported was that the sstable corruption blocks compaction, with the consequence that the bucket of sstables Cassandra wants to compact just keeps growing and you get huge CPU load (from repeated compaction attempts and increasingly inefficient reads).

I mentioned the failure of my attempt to use scrub to fix the corruption only as an addendum, and it would probably have been better to put it into its own issue(?). Although the trace does report that scrub succeeded, it also shows that scrub simply skipped the corrupted row, so in fact it hasn't solved the problem at all.

The corruption itself is also an issue. Should I break these out into separate issues?
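
To make the "scrub succeeded but skipped the row" point concrete, here is a minimal sketch of a script that scans a log for scrub runs that skipped rows. It is not part of Cassandra; the function name summarize_scrub is hypothetical, the patterns are based only on the 0.7.6 log lines quoted in the issue below, and it assumes a single CompactionExecutor thread so that lines for one sstable are not interleaved with another's.

```python
import re

# Patterns taken from the log lines quoted in this issue (0.7.6);
# other Cassandra versions may word these messages differently.
SCRUB_START = re.compile(r"Scrubbing SSTableReader\(path='([^']+)'\)")
SKIPPED_ROW = re.compile(r"Row at (\d+) is unreadable; skipping to next")
UNRECOVERED = re.compile(r"Unable to recover (\d+) rows that were skipped")


def summarize_scrub(log_lines):
    """Map each sstable path to the rows scrub skipped while processing it.

    Only sstables with at least one skipped or unrecovered row appear in
    the result. Assumes scrub log lines are sequential (one executor thread).
    """
    result = {}
    current = None  # path of the sstable currently being scrubbed
    for line in log_lines:
        match = SCRUB_START.search(line)
        if match:
            current = match.group(1)
            continue
        if current is None:
            continue
        match = SKIPPED_ROW.search(line)
        if match:
            entry = result.setdefault(
                current, {"skipped_offsets": [], "unrecovered": 0})
            entry["skipped_offsets"].append(int(match.group(1)))
            continue
        match = UNRECOVERED.search(line)
        if match:
            entry = result.setdefault(
                current, {"skipped_offsets": [], "unrecovered": 0})
            entry["unrecovered"] = int(match.group(1))
    return result


# Sample lines copied from the scrub log in the issue description.
sample = [
    " INFO [CompactionExecutor:1] 2011-06-17 00:43:42,023 CompactionManager.java (line 511) Scrubbing SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7494-Data.db')",
    " INFO [CompactionExecutor:1] 2011-06-17 00:43:43,317 CompactionManager.java (line 511) Scrubbing SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6994-Data.db')",
    " WARN [CompactionExecutor:1] 2011-06-17 00:43:44,517 CompactionManager.java (line 640) Row at 9517800 is unreadable; skipping to next",
    " WARN [CompactionExecutor:1] 2011-06-17 00:43:45,073 CompactionManager.java (line 654) Unable to recover 1 rows that were skipped.",
]
report = summarize_scrub(sample)
for path, info in report.items():
    print(path, info)
```

Any sstable listed in the report has lost at least one row during scrub, so the data would still need to come from the pre-scrub snapshot or from a healthy replica, as the final WARN line in the log suggests.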

> SSTable "Corrupt (negative) value length encountered" exception blocks compaction.
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2793
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2793
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.6
>         Environment: Ubuntu
>            Reporter: Dominic Williams
>
> A node was consistently experiencing high CPU load. Examination of the logs showed that compaction of an sstable was failing with an error:
>  INFO [CompactionExecutor:1] 2011-06-17 00:18:51,676 CompactionManager.java (line 395) Compacting [SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6993-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6994-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6995-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6996-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6998-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7000-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7002-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7004-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7006-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7008-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7010-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7012-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7014-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7016-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7018-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7020-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7022-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7024-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7026-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7028-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7030-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7032-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7034-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7036-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7038-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7040-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7042-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7044-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7046-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7048-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7050-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7052-Data.db')]
> ERROR [CompactionExecutor:1] 2011-06-17 00:19:21,446 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[CompactionExecutor:1,1,main]
> java.io.IOError: java.io.IOException: Corrupt (negative) value length encountered
>         at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:252)
>         at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:268)
>         at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:227)
>         at java.util.concurrent.ConcurrentSkipListMap.buildFromSorted(ConcurrentSkipListMap.java:1493)
>         at java.util.concurrent.ConcurrentSkipListMap.<init>(ConcurrentSkipListMap.java:1443)
>         at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:379)
>         at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:362)
>         at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:322)
>         at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
>         at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:201)
>         at org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:78)
>         at org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:154)
>         at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:110)
>         at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:45)
>         at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:74)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>         at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
>         at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
>         at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:448)
>         at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:124)
>         at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:94)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.IOException: Corrupt (negative) value length encountered
>         at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:315)
>         at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:99)
>         at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:248)
>         ... 26 more
> Scrub was run on the keyspace (as a last ditch measure) but this did not work:
>  INFO [CompactionExecutor:1] 2011-06-17 00:43:42,023 CompactionManager.java (line 511) Scrubbing SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7494-Data.db')
>  INFO [CompactionExecutor:1] 2011-06-17 00:43:43,317 CompactionManager.java (line 652) Scrub of SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7494-Data.db') complete: 379 rows in new sstable and 0 empty (tombstoned) rows dropped
>  INFO [CompactionExecutor:1] 2011-06-17 00:43:43,317 CompactionManager.java (line 511) Scrubbing SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6994-Data.db')
>  WARN [CompactionExecutor:1] 2011-06-17 00:43:44,516 CompactionManager.java (line 606) Non-fatal error reading row (stacktrace follows)
> java.io.IOError: java.io.IOException: Corrupt (negative) value length encountered
>         at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:252)
>         at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:268)
>         at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:227)
>         at java.util.concurrent.ConcurrentSkipListMap.buildFromSorted(ConcurrentSkipListMap.java:1493)
>         at java.util.concurrent.ConcurrentSkipListMap.<init>(ConcurrentSkipListMap.java:1443)
>         at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:379)
>         at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:362)
>         at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:322)
>         at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
>         at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:201)
>         at org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:78)
>         at org.apache.cassandra.db.CompactionManager.getCompactedRow(CompactionManager.java:783)
>         at org.apache.cassandra.db.CompactionManager.doScrub(CompactionManager.java:590)
>         at org.apache.cassandra.db.CompactionManager.access$600(CompactionManager.java:56)
>         at org.apache.cassandra.db.CompactionManager$3.call(CompactionManager.java:195)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.IOException: Corrupt (negative) value length encountered
>         at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:315)
>         at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:99)
>         at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:248)
>         ... 19 more
>  WARN [CompactionExecutor:1] 2011-06-17 00:43:44,517 CompactionManager.java (line 640) Row at 9517800 is unreadable; skipping to next
>  INFO [CompactionExecutor:1] 2011-06-17 00:43:45,073 CompactionManager.java (line 652) Scrub of SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6994-Data.db') complete: 1029 rows in new sstable and 0 empty (tombstoned) rows dropped
>  WARN [CompactionExecutor:1] 2011-06-17 00:43:45,073 CompactionManager.java (line 654) Unable to recover 1 rows that were skipped.  You can attempt manual recovery from the pre-scrub snapshot.  You can also run nodetool repair to transfer the data from a healthy replica, if any
