Posted to user@cassandra.apache.org by Rob Coli <rc...@palominodb.com> on 2012/10/30 20:18:58 UTC

Re: Wrong data after rolling restart

On Mon, May 21, 2012 at 7:08 AM, Alain RODRIGUEZ <ar...@gmail.com> wrote:
> Here are my 2 nodes starting logs, I hope it can help...
>
> https://gist.github.com/2762493
> https://gist.github.com/2762495

I see in these logs that you replay 2 mutations per node, despite
doing nodetool drain before restarting. However, 2 replayed mutations
per node are unlikely to corrupt a significant number of counters.
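
To illustrate why a replayed mutation matters for counters at all, here
is a toy sketch. It is not Cassandra code: the dict-based "node", the
apply() helper and the mutation tuples are all hypothetical. It only
shows that replaying an increment adds it again, while replaying a
plain overwrite changes nothing.

```python
# Toy model, not Cassandra code: all names here are hypothetical.

def apply(state, mutation):
    """Apply one (kind, key, value) mutation to a dict-based 'node'."""
    kind, key, value = mutation
    if kind == "set":
        # Overwrite: applying it twice gives the same result.
        state[key] = value
    elif kind == "incr":
        # Increment: applying it twice counts it twice.
        state[key] = state.get(key, 0) + value
    return state

log = [("set", "name", "alain"), ("incr", "hits", 1), ("incr", "hits", 1)]

node = {}
for m in log:
    apply(node, m)
clean = dict(node)  # state after a normal run

# Simulate a restart that replays the last two log entries.
for m in log[-2:]:
    apply(node, m)

print(clean["hits"], node["hits"])  # 2 4
```

Under this model, two replayed increments shift one counter by at most
the sum of those two increments, which is consistent with a small
overcount rather than widespread damage.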

nodetool drain is supposed to drain the commitlog entirely, leaving
nothing to replay on startup, so you are encountering:

https://issues.apache.org/jira/browse/CASSANDRA-4446

I also see that you are running 1.0.7. You are unlikely to receive any
useful response from the project if you file this behavior as a bug
against 1.0.7. If you restore your backup, you might wish to upgrade
to 1.0.12 before doing so, in case this is an issue fixed in the
interim.

>> I wanted to try a new config. After doing a rolling restart I have all
>> my counters false, with wrong values. I stopped my servers with the
>> following :
> [ snip ]
>> And after restarting the second one I have lost all the consistency of
>> my data. All my statistics since September are totally false now in
>> production.

What does "totally false" mean? The most common inaccuracy of
Cassandra Counters is that they slightly overcount, not that they are
"totally false" in other ways.

Did you repair this cluster at any time?

>> 1 - How to fix it ? (I have a backup from this morning, but I will
>> lose all the data after this date if I restore this backup)

Restoring this backup is the only way you are likely to fix this. Once
counters are corrupt/wrong you have no chance to survive make your
time. Restoring this backup may not even fix it permanently, depending
on what unknown cause is to blame.

>> 2 - What happened ? How to avoid it ?

Distributed counting has meaningful edge cases, and Cassandra Counters
do not cover 100% of them. As such, I recommend not using them if
accuracy is critically important.
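
One such edge case, sketched as a toy model: a client cannot tell
whether an increment that timed out was applied or not, so retrying it
risks counting it twice. FlakyCounter and incr_with_retry below are
hypothetical names for illustration only, not any Cassandra client API.

```python
# Toy model, not a Cassandra client: all names here are hypothetical.

class FlakyCounter:
    """Counter whose acknowledgement can be lost after the write lands."""

    def __init__(self, lose_ack_on):
        self.value = 0
        self.calls = 0
        self.lose_ack_on = set(lose_ack_on)  # call numbers whose ack is lost

    def incr(self, n=1):
        self.calls += 1
        self.value += n  # the write is applied...
        if self.calls in self.lose_ack_on:
            raise TimeoutError("ack lost")  # ...but the caller never hears back


def incr_with_retry(counter, n=1, attempts=3):
    """Retry on timeout, as a client might do for an ordinary write."""
    for _ in range(attempts):
        try:
            counter.incr(n)
            return
        except TimeoutError:
            continue
    raise TimeoutError("gave up")


counter = FlakyCounter(lose_ack_on={2})
for _ in range(5):
    incr_with_retry(counter)

print(counter.value)  # 6, although only 5 increments were intended
```

Not retrying avoids the overcount but risks an undercount when the
write really was lost, which is why neither choice is exact.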

=Rob

-- 
=Robert Coli
AIM&GTALK - rcoli@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb