
Posted to user@cassandra.apache.org by Dan Kinder <dk...@turnitin.com> on 2015/06/15 23:52:04 UTC

counters still inconsistent after repair

Currently on 2.1.6 I'm seeing behavior like the following:

cqlsh:walker> select * from counter_table where field = 'test';
 field | value
-------+-------
 test  |    30
(1 rows)
cqlsh:walker> select * from counter_table where field = 'test';
 field | value
-------+-------
 test  |    90
(1 rows)
cqlsh:walker> select * from counter_table where field = 'test';
 field | value
-------+-------
 test  |    30
(1 rows)

Using tracing I can see that one node has wrong data. However, running
repair on this table does not seem to have done anything; I still see the
wrong value returned from this same node.

Potentially relevant facts:
- Recently upgraded to 2.1.6 from 2.0.14
- This table has ~million rows, low contention, and fairly high increment
rate

Mainly wondering:
- Is this known or expected? I know Cassandra counters have had issues but
thought by now it should be able to keep a consistent counter or at least
repair it...
- Any way to "reset" this counter?
- Any other stuff I can check?

Re: counters still inconsistent after repair

Posted by Dan Kinder <dk...@turnitin.com>.

Thanks Rob, this was helpful.

More counters will be added soon; I'll let you know if those have any
problems.

On Mon, Jun 15, 2015 at 4:32 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Mon, Jun 15, 2015 at 2:52 PM, Dan Kinder <dk...@turnitin.com> wrote:
>
>> Potentially relevant facts:
>> - Recently upgraded to 2.1.6 from 2.0.14
>> - This table has ~million rows, low contention, and fairly high increment
>> rate
>>
> Can you repro on a counter that was created after the upgrade?
>
>> Mainly wondering:
>>
>> - Is this known or expected? I know Cassandra counters have had issues
>> but thought by now it should be able to keep a consistent counter or at
>> least repair it...
>>
> All counters which haven't been written to since the upgrade to 2.1 (which
> introduced "new counters") are still on disk as "old counters" and will
> remain that way until they are UPDATEd and then compacted together with all
> old shards. "Old counters" can exhibit this behavior.
>
>> - Any way to "reset" this counter?
>>
> Per Aleksey (in IRC) you can turn a replica for an old counter into a new
> counter by UPDATEing it once.
>
> To do that without modifying the count, you can run [1]:
>
> UPDATE tablename SET countercolumn = countercolumn + 0 WHERE id = 1;
>
> The important caveat is that this must be done at least once per shard,
> with one shard per RF. The only way one can be sure that all shards have
> been UPDATEd is by contacting each replica node and doing the UPDATE + 0
> there, because local writes are preferred.
>
> To summarize, the optimal process to upgrade your pre-existing counters to
> 2.1-era "new counters" :
>
> 1) get a list of all counter keys
> 2) get a list of replicas per counter key
> 3) connect to each replica for each counter key and issue an UPDATE + 0
> for that counter key
> 4) run a major compaction
>
> As an aside, Aleksey suggests that the above process is so heavyweight
> that it may not be worth it. If you just leave them be, all counters you're
> actually using will become progressively more accurate over time.
>
> =Rob
> [1] Special thanks to Jeff Jirsa for verifying that this syntax works.
>



-- 
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkinder@turnitin.com

Re: counters still inconsistent after repair

Posted by Robert Coli <rc...@eventbrite.com>.

On Mon, Jun 15, 2015 at 2:52 PM, Dan Kinder <dk...@turnitin.com> wrote:

> Potentially relevant facts:
> - Recently upgraded to 2.1.6 from 2.0.14
> - This table has ~million rows, low contention, and fairly high increment
> rate
>
Can you repro on a counter that was created after the upgrade?

> Mainly wondering:
>
> - Is this known or expected? I know Cassandra counters have had issues but
> thought by now it should be able to keep a consistent counter or at least
> repair it...
>
All counters which haven't been written to since the upgrade to 2.1 (which
introduced "new counters") are still on disk as "old counters" and will
remain that way until they are UPDATEd and then compacted together with all
old shards. "Old counters" can exhibit this behavior.

> - Any way to "reset" this counter?
>
Per Aleksey (in IRC) you can turn a replica for an old counter into a new
counter by UPDATEing it once.

To do that without modifying the count, you can run [1]:

UPDATE tablename SET countercolumn = countercolumn + 0 WHERE id = 1;

The important caveat is that this must be done at least once per shard, with
one shard per RF. The only way one can be sure that all shards have been
UPDATEd is by contacting each replica node and doing the UPDATE + 0 there,
because local writes are preferred.

To summarize, the optimal process to upgrade your pre-existing counters to
2.1-era "new counters" :

1) get a list of all counter keys
2) get a list of replicas per counter key
3) connect to each replica for each counter key and issue an UPDATE + 0 for
that counter key
4) run a major compaction
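
Steps 1 to 3 above can be sketched as a small planning script. This is only a
minimal sketch and not something from the thread: the function
plan_noop_updates, the replicas_for callback, and the default table and column
names are all hypothetical, and it assumes text partition keys. In practice
the replica list for a key could come from something like
`nodetool getendpoints`, and each statement would have to be sent with the
listed replica acting as coordinator.

```python
from typing import Callable, Iterable, List, Tuple


def plan_noop_updates(
    keys: Iterable[str],
    replicas_for: Callable[[str], List[str]],
    table: str = "tablename",
    counter_column: str = "countercolumn",
    key_column: str = "id",
) -> List[Tuple[str, str]]:
    """Return (replica, CQL statement) pairs: one no-op increment per
    replica per counter key. Nothing is sent to any cluster here."""
    plan = []
    for key in keys:
        # Assumption: keys are text. Double any single quote so the key
        # is a valid CQL string literal.
        literal = "'" + key.replace("'", "''") + "'"
        statement = (
            f"UPDATE {table} SET {counter_column} = {counter_column} + 0 "
            f"WHERE {key_column} = {literal};"
        )
        # One entry per replica, because every replica's shard needs
        # its own UPDATE + 0.
        for replica in replicas_for(key):
            plan.append((replica, statement))
    return plan


if __name__ == "__main__":
    # Hypothetical replica map; real values would come from the cluster.
    ring = {"test": ["10.0.0.1", "10.0.0.2", "10.0.0.3"]}
    plan = plan_noop_updates(
        ring,
        ring.__getitem__,
        table="counter_table",
        counter_column="value",
        key_column="field",
    )
    for replica, statement in plan:
        print(replica, statement)
```

The sketch only builds the list of statements; how each one is delivered to
its replica (cqlsh pointed at that host, a driver pinned to it, etc.) is left
open, and the major compaction in step 4 would be run separately afterwards.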

As an aside, Aleksey suggests that the above process is so heavyweight that
it may not be worth it. If you just leave them be, all counters you're
actually using will become progressively more accurate over time.

=Rob
[1] Special thanks to Jeff Jirsa for verifying that this syntax works.