Posted to user@cassandra.apache.org by Maki Watanabe <wa...@gmail.com> on 2011/04/05 07:01:40 UTC

nodetool repair & compact

Hello,
Having read O'Reilly's Cassandra book and the wiki, I'm a bit confused about
nodetool repair and compact.
I believe we need to run nodetool repair regularly, and that it synchronizes
all replica nodes by the time it completes.
According to the documents, "repair" also invokes a major compaction
(as a side effect?).
Will this "major compaction" apply to the replica nodes too?

If I have a 3-node ring and a CF with RF=3, what I should do periodically on
this system is:
- nodetool repair on one of the nodes
or
- nodetool repair on one of the nodes, and nodetool compact on 2 of the nodes
?

Thanks
maki

Re: nodetool repair & compact

Posted by Watanabe Maki <wa...@gmail.com>.
Thanks a lot. It has become clear to me.

From iPhone



Re: nodetool repair & compact

Posted by Sylvain Lebresne <sy...@datastax.com>.
On Tue, Apr 5, 2011 at 9:03 PM, Maki Watanabe <wa...@gmail.com> wrote:
> Thanks Sylvain, it's very clear.
> But do I still need to force a major compaction regularly to clear tombstones?
> I know that minor compaction has cleared tombstones since 0.7, but
> maximumCompactionThreshold limits the maximum number of sstables that
> will be merged at once, so to GC all tombstones in all sstables within
> gc_grace_period, it is safe to run "nodetool compact" at least once per
> gc_grace_period, isn't it?

You don't *need* tombstones to be cleared within gc_grace_period. What you
need is to make sure that, for a given tombstone t, each node will get t within
gc_grace_period. This means that if a node dies, you need it to be up again
and to have nodetool repair run on it before gc_grace_period elapses; otherwise
there may be some tombstones that this node will never see (and thus deleted
data could be resurrected by this node).
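
The resurrection scenario described above can be sketched with a toy model. This is only an illustration under stated assumptions, not how Cassandra is implemented: the `Cell`, `Replica`, `repair` and `run_scenario` names are hypothetical, `GC_GRACE` stands in for gc_grace_period in arbitrary time units, and "repair" is reduced to copying the newest cell for each key to every replica.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

GC_GRACE = 10  # hypothetical grace period, in arbitrary time units


@dataclass(frozen=True)
class Cell:
    value: Optional[str]  # None marks a tombstone (a recorded delete)
    timestamp: int


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.up = True
        self.data: Dict[str, Cell] = {}

    def write(self, key: str, cell: Cell) -> None:
        """Apply a write unless the node is down; the newest timestamp wins."""
        if not self.up:
            return
        current = self.data.get(key)
        if current is None or cell.timestamp > current.timestamp:
            self.data[key] = cell

    def purge_tombstones(self, now: int) -> None:
        """Drop tombstones older than GC_GRACE, as a compaction eventually would."""
        expired = [k for k, c in self.data.items()
                   if c.value is None and now - c.timestamp > GC_GRACE]
        for key in expired:
            del self.data[key]


def repair(replicas: List[Replica]) -> None:
    """Toy anti-entropy: every replica ends up with the newest cell per key."""
    keys = set()
    for r in replicas:
        keys.update(r.data)
    for key in keys:
        newest = max((r.data[key] for r in replicas if key in r.data),
                     key=lambda c: c.timestamp)
        for r in replicas:
            r.data[key] = newest


def run_scenario(repair_within_grace: bool) -> Optional[Cell]:
    a, b, c = Replica("a"), Replica("b"), Replica("c")
    nodes = [a, b, c]
    for n in nodes:
        n.write("k", Cell("v", timestamp=1))
    c.up = False                                # node c dies...
    for n in nodes:
        n.write("k", Cell(None, timestamp=2))   # ...and misses the delete
    c.up = True
    if repair_within_grace:
        repair(nodes)                           # c receives the tombstone in time
    for n in nodes:
        n.purge_tombstones(now=20)              # well past GC_GRACE
    repair(nodes)                               # a later repair
    return a.data.get("k")
```

In this model, `run_scenario(True)` leaves the key deleted on node a, while `run_scenario(False)` brings the old value back: a and b have already purged their tombstones, so the stale copy on c is the newest cell left and the late repair copies it everywhere.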

So repair should be run at least once per gc_grace_period to be on the safe side.
Compact is not necessary, however. The only downside of not running compact
regularly is that some tombstones may take longer to be removed (since minor
compactions are potentially less efficient at removing them), which really only
impacts disk space usage. And given that major compactions are fairly heavy on
resource usage and have that nasty effect of producing only one huge sstable, my
advice would be to not run major compaction unless you have good reason to
suspect you need it.
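
The "at least once per gc_grace_period" rule can be turned into a small check. The sketch below is an assumption-laden illustration, not a Cassandra tool: the `overdue_repairs` function, its parameters and the `safety_margin` idea are hypothetical, and it presumes you record yourself when repair last completed for each node.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def overdue_repairs(last_repair: Dict[str, datetime],
                    gc_grace: timedelta,
                    now: datetime,
                    safety_margin: float = 0.5) -> List[str]:
    """Return the nodes whose last completed repair is too old.

    safety_margin is a hypothetical knob: 0.5 flags a node once half of
    gc_grace has passed since its last repair, leaving time to react
    before tombstones become eligible for removal.
    """
    deadline = gc_grace * safety_margin
    return sorted(node for node, finished in last_repair.items()
                  if now - finished > deadline)


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    history = {
        "node1": datetime(2011, 4, 18),
        "node2": datetime(2011, 4, 12),
        "node3": datetime(2011, 4, 5),
    }
    print(overdue_repairs(history, timedelta(days=10), datetime(2011, 4, 20)))
```

A check like this only reports; actually running nodetool repair on the flagged nodes is left to whatever scheduler you already use.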

--
Sylvain


Re: nodetool repair & compact

Posted by Maki Watanabe <wa...@gmail.com>.
Thanks Sylvain, it's very clear.
But do I still need to force a major compaction regularly to clear tombstones?
I know that minor compaction has cleared tombstones since 0.7, but
maximumCompactionThreshold limits the maximum number of sstables that
will be merged at once, so to GC all tombstones in all sstables within
gc_grace_period, it is safe to run "nodetool compact" at least once per
gc_grace_period, isn't it?

maki


Re: nodetool repair & compact

Posted by Sylvain Lebresne <sy...@datastax.com>.
On Tue, Apr 5, 2011 at 12:01 AM, Maki Watanabe <wa...@gmail.com> wrote:
> Hello,
> Having read O'Reilly's Cassandra book and the wiki, I'm a bit confused about
> nodetool repair and compact.
> I believe we need to run nodetool repair regularly, and that it synchronizes
> all replica nodes by the time it completes.
> According to the documents, "repair" also invokes a major compaction
> (as a side effect?).

Those documents are wrong then. A repair does not trigger a major
compaction. The only thing that makes it similar to a major compaction is
that it will iterate over all the sstables. But for instance, you won't end
up with one big sstable at the end of repair as you would with a major
compaction.

> Will this "major compaction" apply to the replica nodes too?
>
> If I have a 3-node ring and a CF with RF=3, what I should do periodically on
> this system is:
> - nodetool repair on one of the nodes
> or
> - nodetool repair on one of the nodes, and nodetool compact on 2 of the nodes
> ?

So, as I said, repair and compact are independent. You should periodically run
nodetool repair (on one of your nodes in your case, as you said). However, it is
no longer advised to run nodetool compact regularly unless you have a good
reason to.

--
Sylvain