Posted to commits@cassandra.apache.org by "Yuki Morishita (JIRA)" <ji...@apache.org> on 2016/05/11 13:11:13 UTC

[jira] [Commented] (CASSANDRA-11750) Offline scrub should not abort when it hits corruption

    [ https://issues.apache.org/jira/browse/CASSANDRA-11750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280084#comment-15280084 ] 

Yuki Morishita commented on CASSANDRA-11750:
--------------------------------------------

I think this is no longer the case on trunk after CASSANDRA-11578, which decoupled disk failure policy handling and applies it only when online.
But there can still be a case where CorruptSSTableException is handled in offline scrub in a way that prevents it from continuing.
Let me check.

> Offline scrub should not abort when it hits corruption
> ------------------------------------------------------
>
>                 Key: CASSANDRA-11750
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11750
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Adam Hattrell
>            Priority: Minor
>              Labels: Tools
>
> Hit a failure on startup due to corruption of some sstables in system keyspace.  Deleted the listed file and restarted - came down again with another file.
> Figured that I may as well run scrub to clean up all the files.  Got the following error:
> {noformat}
> sstablescrub system compaction_history 
> ERROR 17:21:34 Exiting forcefully due to file system exception on startup, disk failure policy "stop" 
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /cassandra/data/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system-compaction_history-ka-1936-CompressionInfo.db 
> at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:131) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
> at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:85) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
> at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:79) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
> at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:72) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
> at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:169) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
> at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:741) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
> at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:692) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
> at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:480) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:376) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:523) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_79] 
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_79] 
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_79] 
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_79] 
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] 
> Caused by: java.io.EOFException: null 
> at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) ~[na:1.7.0_79] 
> at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.7.0_79] 
> at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.7.0_79] 
> at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:106) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
> ... 14 common frames omitted 
> {noformat}
> I guess it might be by design - but I'd argue that I should at least have the option to continue and let it do its thing.  I'd prefer that sstablescrub ignored the disk failure policy.


