Posted to user@cassandra.apache.org by Stephane Legay <sl...@looplogic.com> on 2014/11/20 17:36:33 UTC

Fwd: sstable usage doubles after repair

I upgraded a 2-node cluster with RF = 2 from 1.0.9 to 2.0.11. I did
rolling upgrades and ran upgradesstables after each upgrade. We then moved our
data to new hardware by shutting down each node, moving its data to the new
machine, and starting up with auto_bootstrap = false.
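
For reference, the relevant cassandra.yaml lines on each new node looked
roughly like this (a sketch only; each node kept its own original token):

    # cassandra.yaml (sketch: bootstrap disabled, original token preserved)
    auto_bootstrap: false
    initial_token: <the node's original token, as shown by nodetool ring>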

When all was done I ran a repair. Data went from 250 GB to 400 GB per node.
A week later, I am running another repair, and data is filling the 800 GB
drive on each machine, with huge compactions running constantly on each node.

Where should I go from here? Will scrubbing fix the issue?

Thanks

Re: sstable usage doubles after repair

Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Nov 20, 2014 at 10:29 AM, Stephane Legay <sl...@looplogic.com>
wrote:

> How should I go about inspecting SSTables?
>

sstable2json (which generally ships with Cassandra, but is packaged
separately these days)
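
For example, pointed at one of the Data files a repair just wrote (the path
and generation number below are hypothetical):

    sstable2json /var/lib/cassandra/data/MyKeyspace/MyCF/MyKeyspace-MyCF-jb-42-Data.db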

checksstablegarbage (from the Pythian tools, which might help you understand
how much redundant data you have)

As a note, you could try running a major compaction and see how much of the
data is actually redundant or masked... or grep for compaction percentages
in the logs pre- and post-repair.
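
Something like this, for example (keyspace/table names are placeholders, and
the log path assumes a default packaged install):

    nodetool compact MyKeyspace MyCF
    grep 'of original' /var/log/cassandra/system.log

The compaction log lines end with something like "(~53% of original)", which
tells you how much of the data survived each merge.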

=Rob

Re: sstable usage doubles after repair

Posted by Stephane Legay <sl...@looplogic.com>.
Thanks for the response.

Yes, I went through 1.1, 1.2, and 2.0 as rolling upgrades (the entire
cluster at each minor version) and ran upgradesstables each time.
Yes, the nodes are using the same tokens. I can see them when running
nodetool ring, and they're consistent with what we used to have.
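For the record, I compared nodetool ring output captured before and after
the move, roughly like this (file names are just illustrative):

    nodetool ring > ring-after.txt
    diff ring-before.txt ring-after.txt
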
Repairs were very infrequent because we do not delete data: roughly once
every 2 or 3 months, with a forced repair whenever a node went down for
more than a few hours.

I'll keep an eye on the number of repaired rows.

How should I go about inspecting SSTables?

Thanks again.



On Thu, Nov 20, 2014 at 11:15 AM, Robert Coli <rc...@eventbrite.com> wrote:

> On Thu, Nov 20, 2014 at 8:36 AM, Stephane Legay <sl...@looplogic.com>
> wrote:
>
> >> I upgraded a 2-node cluster with RF = 2 from 1.0.9 to 2.0.11. I did
> >> rolling upgrades and ran upgradesstables after each upgrade.
>>
>
> To be clear, did you go through 1.1 and 1.2, or did you go directly from
> 1.0 to 2.0?
>
>
> >> We then moved our data to new hardware by shutting down each node, moving
> >> its data to the new machine, and starting up with auto_bootstrap = false.
>>
>
> This should not be implicated, especially if you verified the upgraded
> nodes came up with the same tokens they had before.
>
>
> >> When all was done I ran a repair. Data went from 250 GB to 400 GB per
> >> node. A week later, I am running another repair, and data is filling the
> >> 800 GB drive on each machine, with huge compactions running constantly on
> >> each node.
>>
>
> How frequently had you been running repair in 1.0.9? How often do you
> DELETE?
>
>
>> Where should I go from here? Will scrubbing fix the issue?
>>
>
> I would inspect the newly created SSTables from a repair and see what they
> contain. I would also look at log lines which indicate how many rows are
> being repaired, with a special eye toward whether the number of rows
> repaired decreases with each successive repair.
>
> Also note that repair in 2.0 is serial by default; you probably want the
> old behavior, which you can get with the "-par" flag.
>
> =Rob
> http://twitter.com/rcolidba
>



-- 
Stephane Legay
Co-founder and CTO
LoopLogic, LLC

slegay@looplogic.com
480-326-4080

Re: sstable usage doubles after repair

Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Nov 20, 2014 at 8:36 AM, Stephane Legay <sl...@looplogic.com>
wrote:

> I upgraded a 2-node cluster with RF = 2 from 1.0.9 to 2.0.11. I did
> rolling upgrades and ran upgradesstables after each upgrade.
>

To be clear, did you go through 1.1 and 1.2, or did you go directly from
1.0 to 2.0?


> We then moved our data to new hardware by shutting down each node, moving
> its data to the new machine, and starting up with auto_bootstrap = false.
>

This should not be implicated, especially if you verified the upgraded
nodes came up with the same tokens they had before.


> When all was done I ran a repair. Data went from 250 GB to 400 GB per node.
> A week later, I am running another repair, and data is filling the 800 GB
> drive on each machine, with huge compactions running constantly on each node.
>

How frequently had you been running repair in 1.0.9? How often do you
DELETE?


> Where should I go from here? Will scrubbing fix the issue?
>

I would inspect the newly created SSTables from a repair and see what they
contain. I would also look at log lines which indicate how many rows are
being repaired, with a special eye toward whether the number of rows
repaired decreases with each successive repair.
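
For example, one such signal (assuming the default packaged log location) is
the out-of-sync range count that repair logs on each node:

    grep 'out of sync' /var/log/cassandra/system.log

Those lines look something like "Endpoints /10.0.0.1 and /10.0.0.2 have 12
range(s) out of sync for MyCF"; the counts should shrink across successive
repairs.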

Also note that repair in 2.0 is serial by default; you probably want the
old behavior, which you can get with the "-par" flag.
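
For example (keyspace name is a placeholder):

    nodetool repair -par MyKeyspace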

=Rob
http://twitter.com/rcolidba