Posted to user@cassandra.apache.org by Daniel Doubleday <da...@gmx.net> on 2011/04/28 18:58:26 UTC

Strange corrupt sstable

Hi all

on one of our dev machines we ran into this:

INFO [CompactionExecutor:1] 2011-04-28 15:07:35,174 SSTableWriter.java (line 108) Last written key : DecoratedKey(12707736894140473154801792860916528374, 74657374)
 INFO [CompactionExecutor:1] 2011-04-28 15:07:35,174 SSTableWriter.java (line 109) Current key : DecoratedKey(9639328335678452015784116654185025244, 61c080c080c080c080c080c0803a60)
 INFO [CompactionExecutor:1] 2011-04-28 15:07:35,175 SSTableWriter.java (line 110) Writing into file /var/lib/cassandra/data/Smeet/Tracker-tmp-f-752-Data.db
ERROR [CompactionExecutor:1] 2011-04-28 15:07:35,179 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[CompactionExecutor:1,1,main]
java.io.IOException: Keys must be written in ascending order.
        at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:111)
        at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:128)
        at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:451)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:124)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:94)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)


After looking into this we found that one SSTable was corrupt:

INFO 18:10:23,767 checking sstable /Users/dd/Temp/cassandra/data/Smeet/Tracker-f-21-Data.db
 INFO 18:10:23,773 DecoratedKey(12707736894140473154801792860916528374, 74657374)
 INFO 18:10:23,777 DecoratedKey(9639328335678452015784116654185025244, 61c080c080c080c080c080c0803a60)
 INFO 18:10:23,781 DecoratedKey(14745077953608664589374317792467471221, 7265706f2d333631343039)
 INFO 18:10:23,783 DecoratedKey(20959981347795136420050187639971803838, 6576656e746c6f672d333631383838)
 INFO 18:10:23,785 DecoratedKey(21549594967496343960898734735584772470, 6576656e746c6f672d333631383837)
 INFO 18:10:23,787 DecoratedKey(21604488955263763122920767208831708343, 6576656e746c6f672d333631383835)
 INFO 18:10:23,790 DecoratedKey(50100329769829417461462475380721306089, 61c080c080c080c080c080c0803a5f)
 INFO 18:10:23,792 DecoratedKey(46089976107344533463199626515706634206, 7265706f2d333631303436)
 INFO 18:10:23,794 DecoratedKey(50825113246224326722027014732583502298, 6c6f674576656e742d333631383431)
 INFO 18:10:23,797 DecoratedKey(146509745693074257726394585872920772980, 61c080c080c080c080c080c0803a59)
 INFO 18:10:23,801 DecoratedKey(63662277073629697060427801149510996510, 6c6f674576656e742d333631383432)
 INFO 18:10:23,803 DecoratedKey(79706531464725069255477768601476202854, 6576656e746c6f672d333631383834)
 INFO 18:10:23,808 DecoratedKey(99662740527281958972877086277408797556, 6c6f674576656e742d333631383430)
 INFO 18:10:23,810 DecoratedKey(24216004522287050570821311008920284627, 61c080c080c080c080c080c0803a5a)
 INFO 18:10:23,816 DecoratedKey(130819691046705791898531759480356609485, 61c080c080c080c080c080c0803a5e)
 INFO 18:10:24,121 DecoratedKey(107312182597869061361893705497084751672, 61c080c080c080c080c080c0803a58)
 INFO 18:10:24,123 DecoratedKey(142873550623537979564886292314205248041, 7265706f2d333631363731)
 INFO 18:10:24,124 DecoratedKey(147476468396567670371575440897348610391, 6576656e746c6f672d333631383836)

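The listing makes the problem visible: the token values do not increase monotonically, even though an SSTable must store rows in ascending token order. A quick sanity check over the first few tokens from the listing (a Python sketch of my own, not a Cassandra tool):

```python
# First few tokens from the sstable listing above, in file order.
tokens = [
    12707736894140473154801792860916528374,
    9639328335678452015784116654185025244,
    14745077953608664589374317792467471221,
    20959981347795136420050187639971803838,
]

# Indices where a token is smaller than its predecessor, i.e. where
# the required ascending order is violated.
violations = [i for i in range(1, len(tokens)) if tokens[i] < tokens[i - 1]]
print(violations)  # → [1]
```

The second row already breaks the ordering invariant, which is exactly what SSTableWriter's "Keys must be written in ascending order" check trips over during compaction.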
I couldn't figure out how SSTableWriter could possibly write such a file, so I just wanted to report it. Maybe it rings a bell somewhere.

It seems really strange that some data corruption would leave the file readable for the SSTableReader.

The history of this server is 0.6.5 -> 0.7.4, but this file was obviously written by 0.7.4.



Cheers,
Daniel



Re: Strange corrupt sstable

Posted by mcasandra <mo...@gmail.com>.
Can someone please help me understand the reason for corrupt SSTables? I am just
worried about the worst case. Do we lose data in these cases? How do we protect
against data loss if that is the case?

--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Strange-corrupt-sstable-tp6314052p6314067.html
Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.

Re: Strange corrupt sstable

Posted by Jonathan Ellis <jb...@gmail.com>.
Thanks for tracking that down!

0.7's OPP adds additional checks that keys are valid UTF-8 (and if you're
starting from scratch you should use BOP instead), so it shouldn't be an
issue there.

On Mon, May 2, 2011 at 7:39 AM, Daniel Doubleday
<da...@gmx.net> wrote:
> Just for the record:
>
> The problem had nothing to do with bad memory. After some more digging it turned out that due to a bug we wrote invalid UTF-8 sequences as row keys. In 0.6, the key tokens are constructed from string-decoded bytes. This no longer happens in 0.7 files. So what apparently happened during compaction was
>
> 1. read the SSTable and generate rows in string-based order
> 2. write the new file based on that order
> 3. read the compacted file based on raw-bytes order -> crash
>
> That bug never made it to production, so we are fine.
>
> On Apr 29, 2011, at 10:32 AM, Daniel Doubleday wrote:
>
>> Bad == Broken
>>
>> That means you cannot rely on 1 == 1. In such a scenario anything can happen, including data loss.
>> That's why you want ECC memory on production servers. Our cheapo dev boxes don't have it.
>>
>> On Apr 28, 2011, at 7:46 PM, mcasandra wrote:
>>
>>> What do you mean by bad memory? Is it less heap size, OOM issues, or
>>> something else? What happens in such a scenario? Is there data loss?
>>>
>>> Sorry for the many questions; just trying to understand, since the data
>>> is critical after all :)
>>>
>>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: Strange corrupt sstable

Posted by Daniel Doubleday <da...@gmx.net>.
Just for the record:

The problem had nothing to do with bad memory. After some more digging it turned out that due to a bug we wrote invalid UTF-8 sequences as row keys. In 0.6, the key tokens are constructed from string-decoded bytes. This no longer happens in 0.7 files. So what apparently happened during compaction was

1. read the SSTable and generate rows in string-based order
2. write the new file based on that order
3. read the compacted file based on raw-bytes order -> crash

That bug never made it to production, so we are fine.
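The effect of decoding invalid UTF-8 before tokenizing can be sketched as follows. This is a Python illustration of my own (Java's decoder handles invalid sequences somewhat differently, and Cassandra's token is derived from the MD5 as a BigInteger), but the point is the same: the 0.6-style string round-trip changes the key bytes, and therefore the MD5 token:

```python
import hashlib

# Key bytes from the log; 0xc0 0x80 is not valid UTF-8.
raw = bytes.fromhex("61c080c080c080c080c080c0803a60")

# 0.7 style: token computed directly from the raw key bytes.
token_raw = int.from_bytes(hashlib.md5(raw).digest(), "big")

# 0.6 style (sketch): key bytes go through a string decode first.
# The invalid sequences are mangled by the decoder, so re-encoding
# does not round-trip back to the original bytes.
roundtripped = raw.decode("utf-8", errors="replace").encode("utf-8")
token_str = int.from_bytes(hashlib.md5(roundtripped).digest(), "big")

print(roundtripped != raw)     # True: the bytes changed
print(token_raw != token_str)  # True: so did the token
```

Two different tokens for what is supposed to be the same key is exactly the kind of mismatch that lets a compaction write rows in one order and then read them back expecting another.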
 
On Apr 29, 2011, at 10:32 AM, Daniel Doubleday wrote:

> Bad == Broken
> 
> That means you cannot rely on 1 == 1. In such a scenario anything can happen, including data loss.
> That's why you want ECC memory on production servers. Our cheapo dev boxes don't have it.
> 
> On Apr 28, 2011, at 7:46 PM, mcasandra wrote:
> 
>> What do you mean by bad memory? Is it less heap size, OOM issues, or
>> something else? What happens in such a scenario? Is there data loss?
>> 
>> Sorry for the many questions; just trying to understand, since the data
>> is critical after all :)
>> 
> 


Re: Strange corrupt sstable

Posted by Daniel Doubleday <da...@gmx.net>.
Bad == Broken

That means you cannot rely on 1 == 1. In such a scenario anything can happen, including data loss.
That's why you want ECC memory on production servers. Our cheapo dev boxes don't have it.

On Apr 28, 2011, at 7:46 PM, mcasandra wrote:

> What do you mean by bad memory? Is it less heap size, OOM issues, or
> something else? What happens in such a scenario? Is there data loss?
> 
> Sorry for the many questions; just trying to understand, since the data
> is critical after all :)
> 


Re: Strange corrupt sstable

Posted by mcasandra <mo...@gmail.com>.
What do you mean by bad memory? Is it less heap size, OOM issues, or something
else? What happens in such a scenario? Is there data loss?

Sorry for the many questions; just trying to understand, since the data is
critical after all :)

--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Strange-corrupt-sstable-tp6314052p6314218.html
Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.

Re: Strange corrupt sstable

Posted by Jonathan Ellis <jb...@gmail.com>.
When I have seen this in the past it has been bad memory on the server.

On Thu, Apr 28, 2011 at 11:58 AM, Daniel Doubleday
<da...@gmx.net> wrote:
> Hi all
> on one of our dev machines we ran into this:
> INFO [CompactionExecutor:1] 2011-04-28 15:07:35,174 SSTableWriter.java (line
> 108) Last written key : DecoratedKey(12707736894140473154801792860916528374,
> 74657374)
>  INFO [CompactionExecutor:1] 2011-04-28 15:07:35,174 SSTableWriter.java
> (line 109) Current key : DecoratedKey(9639328335678452015784116654185025244,
> 61c080c080c080c080c080c0803a60)
>  INFO [CompactionExecutor:1] 2011-04-28 15:07:35,175 SSTableWriter.java
> (line 110) Writing into file
> /var/lib/cassandra/data/Smeet/Tracker-tmp-f-752-Data.db
> ERROR [CompactionExecutor:1] 2011-04-28 15:07:35,179
> AbstractCassandraDaemon.java (line 113) Fatal exception in thread
> Thread[CompactionExecutor:1,1,main]
> java.io.IOException: Keys must be written in ascending order.
>         at
> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:111)
>         at
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:128)
>         at
> org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:451)
>         at
> org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:124)
>         at
> org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:94)
>         at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:636)
>
> After looking into this we found that one SSTable was corrupt:
> INFO 18:10:23,767 checking sstable
> /Users/dd/Temp/cassandra/data/Smeet/Tracker-f-21-Data.db
>  INFO 18:10:23,773 DecoratedKey(12707736894140473154801792860916528374,
> 74657374)
>  INFO 18:10:23,777 DecoratedKey(9639328335678452015784116654185025244,
> 61c080c080c080c080c080c0803a60)
>  INFO 18:10:23,781 DecoratedKey(14745077953608664589374317792467471221,
> 7265706f2d333631343039)
>  INFO 18:10:23,783 DecoratedKey(20959981347795136420050187639971803838,
> 6576656e746c6f672d333631383838)
>  INFO 18:10:23,785 DecoratedKey(21549594967496343960898734735584772470,
> 6576656e746c6f672d333631383837)
>  INFO 18:10:23,787 DecoratedKey(21604488955263763122920767208831708343,
> 6576656e746c6f672d333631383835)
>  INFO 18:10:23,790 DecoratedKey(50100329769829417461462475380721306089,
> 61c080c080c080c080c080c0803a5f)
>  INFO 18:10:23,792 DecoratedKey(46089976107344533463199626515706634206,
> 7265706f2d333631303436)
>  INFO 18:10:23,794 DecoratedKey(50825113246224326722027014732583502298,
> 6c6f674576656e742d333631383431)
>  INFO 18:10:23,797 DecoratedKey(146509745693074257726394585872920772980,
> 61c080c080c080c080c080c0803a59)
>  INFO 18:10:23,801 DecoratedKey(63662277073629697060427801149510996510,
> 6c6f674576656e742d333631383432)
>  INFO 18:10:23,803 DecoratedKey(79706531464725069255477768601476202854,
> 6576656e746c6f672d333631383834)
>  INFO 18:10:23,808 DecoratedKey(99662740527281958972877086277408797556,
> 6c6f674576656e742d333631383430)
>  INFO 18:10:23,810 DecoratedKey(24216004522287050570821311008920284627,
> 61c080c080c080c080c080c0803a5a)
>  INFO 18:10:23,816 DecoratedKey(130819691046705791898531759480356609485,
> 61c080c080c080c080c080c0803a5e)
>  INFO 18:10:24,121 DecoratedKey(107312182597869061361893705497084751672,
> 61c080c080c080c080c080c0803a58)
>  INFO 18:10:24,123 DecoratedKey(142873550623537979564886292314205248041,
> 7265706f2d333631363731)
>  INFO 18:10:24,124 DecoratedKey(147476468396567670371575440897348610391,
> 6576656e746c6f672d333631383836)
> I couldn't figure out how SSTableWriter could possibly write such a file,
> so I just wanted to report it. Maybe it rings a bell somewhere.
> It seems really strange that some data corruption would leave the file
> readable for the SSTableReader.
> History of this server is 0.6.5 -> 0.7.4. But this file was obviously
> written by 0.7.4
>
>
> Cheers,
> Daniel
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com