Posted to user@cassandra.apache.org by Paulo Ricardo Motta Gomes <pa...@chaordicsystems.com> on 2014/11/21 13:30:03 UTC

Re: Repair/Compaction Completion Confirmation

Hey guys,

Just reviving this thread. If anyone is using the cassandra_range_repair
tool (https://github.com/BrianGallew/cassandra_range_repair), please sync
your repositories: the tool was previously broken by a critical bug in the
token range definition method. For more information on the bug, see:
https://github.com/BrianGallew/cassandra_range_repair/pull/18
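
For context on why the fix matters: subrange splitting is easy to get wrong
at the ring boundary, where a (start, end] range can wrap through the
minimum token. A minimal sketch of the idea (my own illustration, not the
tool's patched code), assuming Murmur3Partitioner's signed 64-bit token
space; the demo tokens are arbitrary:

    MIN_TOKEN = -2**63
    RING_SIZE = 2**64

    def _wrap(token):
        # Map an arithmetic result back into the signed 64-bit token space.
        return (token - MIN_TOKEN) % RING_SIZE + MIN_TOKEN

    def split_range(start, end, steps):
        """Yield (sub_start, sub_end) pairs covering (start, end]."""
        # A wrapping range (start >= end) passes through the ring minimum;
        # start == end means the full ring.
        span = (end - start) % RING_SIZE or RING_SIZE
        step = span // steps
        sub_start = start
        for i in range(1, steps + 1):
            # Pin the last boundary to `end` so rounding leaves no gap.
            sub_end = end if i == steps else _wrap(start + i * step)
            yield sub_start, sub_end
            sub_start = sub_end

    # Example: a range that wraps through the ring minimum, split in 4.
    for s, e in split_range(9000000000000000000, -9000000000000000000, 4):
        print(s, e)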

Cheers,

On Tue, Oct 28, 2014 at 7:53 AM, Colin <co...@clark.ws> wrote:

> When I use virtual nodes, I typically use a much smaller number - usually
> around 10.  This lets me add nodes more easily without the performance
> hit.
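
For anyone weighing this advice: the vnode count is set per node via
num_tokens in cassandra.yaml, and it has to be chosen before the node first
bootstraps; it cannot be changed on a running node. A minimal example:

    # cassandra.yaml -- set before the node's first start;
    # num_tokens cannot be changed after bootstrap
    num_tokens: 10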
>
>
>
> --
> *Colin Clark*
> +1-320-221-9531
>
>
> On Oct 28, 2014, at 10:46 AM, Alain RODRIGUEZ <ar...@gmail.com> wrote:
>
> I was trying this yesterday too.
>
> https://github.com/BrianGallew/cassandra_range_repair
>
> "Not 100% bullet proof" --> Indeed I found that operations are done
> multiple times, so it is not very optimised. Though it is open sourced so
> I guess you can improve things as much as you want and contribute. Here is
> the issue I raised yesterday
> https://github.com/BrianGallew/cassandra_range_repair/issues/14.
>
> I am also trying to improve our repair automation, since we now have
> multiple DCs and up to 800 GB per node. Repairs are quite heavy right now.
>
> Good luck,
>
> Alain
>
> 2014-10-28 4:59 GMT+01:00 Ben Bromhead <be...@instaclustr.com>:
>
>> https://github.com/BrianGallew/cassandra_range_repair
>>
>> This breaks the repair operation down into very small portions of the
>> ring as a way to work around the currently fragile nature of repair.
>>
>> Leveraging range repair should go some way towards automating repair
>> (this is how the automatic repair service in DataStax OpsCenter works, and
>> it is how we perform repairs).
>>
>> We have had a lot of success running repairs in a similar manner against
>> vnode-enabled clusters. Not 100% bullet proof, but way better than
>> nodetool repair.
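
To make the subrange approach concrete, here is a minimal sketch (my own
illustration, not the tool's actual code) of driving one repair per token
slice; it assumes nodetool is on the PATH and a 2.x-era repair command that
accepts the -st/-et token range flags:

    # Sketch only: repair a keyspace one small token slice at a time, in
    # the spirit of cassandra_range_repair, so a failure costs one slice
    # of work rather than the whole run.
    import subprocess

    def repair_subrange(keyspace, start_token, end_token):
        cmd = ["nodetool", "repair",
               "-st", str(start_token), "-et", str(end_token), keyspace]
        # nodetool blocks until this slice's repair completes (or fails),
        # so the exit code tells us whether the slice needs a retry.
        return subprocess.call(cmd) == 0

    def repair_ring(keyspace, subranges, retries=3):
        for start, end in subranges:
            for _ in range(retries):
                if repair_subrange(keyspace, start, end):
                    break
            else:
                raise RuntimeError(
                    "gave up on subrange (%d, %d]" % (start, end))

Fed with the output of a splitter like the one sketched earlier in the
thread, a cron job can walk a node's whole range one slice at a time.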
>>
>>
>>
>> On 28 October 2014 08:32, Tim Heckman <ti...@pagerduty.com> wrote:
>>
>>> On Mon, Oct 27, 2014 at 1:44 PM, Robert Coli <rc...@eventbrite.com>
>>> wrote:
>>>
>>>> On Mon, Oct 27, 2014 at 1:33 PM, Tim Heckman <ti...@pagerduty.com> wrote:
>>>>
>>>>> I know that when issuing some operations via nodetool, the command
>>>>> blocks until the operation is finished. However, is there a way to reliably
>>>>> determine whether or not the operation has finished without monitoring that
>>>>> invocation of nodetool?
>>>>>
>>>>> In other words, when I run 'nodetool repair' what is the best way to
>>>>> reliably determine that the repair is finished without running something
>>>>> equivalent to a 'pgrep' against the command I invoked? I am curious about
>>>>> trying to do the same for major compactions too.
>>>>>
>>>>
>>>> This is beyond a FAQ at this point, unfortunately; non-incremental
>>>> repair is awkward to deal with and probably impossible to automate.
>>>>
>>>> In The Future [1] the correct solution will be to use incremental
>>>> repair, which mitigates but does not solve this challenge entirely.
>>>>
>>>> As brief meta commentary, it would have been nice if the project had
>>>> spent more time optimizing the operability of the critically important
>>>> thing you must do once a week [2].
>>>>
>>>> https://issues.apache.org/jira/browse/CASSANDRA-5483
>>>>
>>>> =Rob
>>>> [1] http://www.datastax.com/dev/blog/anticompaction-in-cassandra-2-1
>>>> [2] Or, more sensibly, once a month with gc_grace_seconds set to 34
>>>> days.
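
Until then, one pragmatic (if fragile) workaround for the original question
is to watch Cassandra's system.log for repair session completion rather
than the nodetool process. A minimal sketch; the log patterns below match
2.x-era RepairSession messages, but the exact text varies by version, so
verify them against your own logs:

    # Sketch only: infer repair completion from system.log instead of
    # watching the nodetool invocation. Treat both message patterns as
    # assumptions to check against your Cassandra version.
    import re
    import time

    DONE = re.compile(
        r"\[repair #([0-9a-f-]+)\] session completed successfully")
    FAILED = re.compile(
        r"\[repair #([0-9a-f-]+)\] session completed with the following error")

    def watch_repairs(log_path="/var/log/cassandra/system.log"):
        with open(log_path) as log:
            log.seek(0, 2)  # jump to the end of the log, like `tail -f`
            while True:
                line = log.readline()
                if not line:
                    time.sleep(1.0)
                    continue
                if DONE.search(line):
                    print("repair session finished:", line.strip())
                elif FAILED.search(line):
                    print("repair session FAILED:", line.strip())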
>>>>
>>>
>>> Thank you for getting back to me so quickly. Not the answer that I was
>>> secretly hoping for, but it is nice to have confirmation. :)
>>>
>>> Cheers!
>>> -Tim
>>>
>>
>>
>>
>> --
>>
>> Ben Bromhead
>>
>> Instaclustr | www.instaclustr.com | @instaclustr
>> <http://twitter.com/instaclustr> | +61 415 936 359
>>
>
>


-- 
*Paulo Motta*

Chaordic | *Platform*
*www.chaordic.com.br <http://www.chaordic.com.br/>*
+55 48 3232.3200

Re: Repair/Compaction Completion Confirmation

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
I noticed (and reported) a bug that made me drop this tool -->
https://github.com/BrianGallew/cassandra_range_repair/issues/16

Might this be related somehow?

C*heers

Alain
