Posted to dev@cassandra.apache.org by David Capwell <dc...@apple.com.INVALID> on 2021/09/01 22:51:50 UTC

Re: [DISCUSS] Repair Improvement Proposal

Cool, moving this from the dev list to JIRA; I will start breaking down tasks and document my progress there.

https://issues.apache.org/jira/browse/CASSANDRA-16909

> On Aug 27, 2021, at 1:21 PM, David Capwell <dc...@apple.com.INVALID> wrote:
> 
> Push vs pull isn’t too critical, but there is one edge case to consider: if the participant got restarted and we didn’t know it, triggering validation again (which may be what caused the process to end) could be a problem.
> 
>> On Aug 26, 2021, at 9:50 AM, Yifan Cai <yc...@gmail.com> wrote:
>> 
>>> 2. Add retries to specific stages of coordination, such as prepare and
>>> validate. In order to do these retries we first need to know what the
>>> state is for the participant which has yet to reply...
>> 
>> 
>> If I understand it correctly, does it mean retries only happen in the
>> coordinator and the coordinator pulls the states of the participants
>> periodically?
>> If the handling of the requests in the participant is made to be idempotent
>> (which I think is required for retry anyway), pulling the state is
>> unnecessary. For example, the coordinator can just send the PrepareRequest
>> at regular intervals until it receives the PrepareResponse.
>> 
>> - Yifan
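
To make the push-style suggestion above concrete, here is a minimal sketch in Python (not Cassandra code). Every name in it (Participant, LossyLink, prepare_with_retries, the "PREPARED" reply) is hypothetical and only stands in for the PrepareRequest/PrepareResponse exchange; the lossy link is an assumption used to simulate dropped messages. It shows that if the participant's handler is idempotent, the coordinator can resend until it hears back without the prepare work running twice.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical participant whose prepare handler is idempotent."""
    prepared: set = field(default_factory=set)
    prepare_runs: int = 0  # how many times the real prepare work ran

    def on_prepare(self, session_id: str) -> str:
        # A repeated request for an already prepared session does no extra work.
        if session_id not in self.prepared:
            self.prepared.add(session_id)
            self.prepare_runs += 1
        return "PREPARED"


class LossyLink:
    """Simulated network: drops the first N requests and the first M responses."""

    def __init__(self, participant, drop_requests=0, drop_responses=0):
        self.participant = participant
        self.drop_requests = drop_requests
        self.drop_responses = drop_responses

    def send_prepare(self, session_id):
        if self.drop_requests > 0:
            self.drop_requests -= 1
            return None  # request lost before reaching the participant
        reply = self.participant.on_prepare(session_id)
        if self.drop_responses > 0:
            self.drop_responses -= 1
            return None  # response lost on the way back
        return reply


def prepare_with_retries(link, session_id, max_attempts=5):
    """Resend the prepare until a reply arrives; return the attempt count."""
    for attempt in range(1, max_attempts + 1):
        if link.send_prepare(session_id) is not None:
            return attempt
        # A real coordinator would wait for a retry interval here.
    raise TimeoutError(f"no prepare response after {max_attempts} attempts")
```

With one dropped request and two dropped responses, the fourth attempt succeeds and the participant has still prepared only once. This sketch does not cover a participant that restarts between attempts, where the in-memory set of prepared sessions would be lost.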
>> 
>> On Thu, Aug 26, 2021 at 8:56 AM Blake Eggleston
>> <be...@apple.com.invalid> wrote:
>> 
>>> +1 from me, any improvement in this area would be great.
>>> 
>>> It would be nice if this could include visibility into repair streams, but
>>> just exposing the repair state will be a big improvement.
>>> 
>>>> On Aug 25, 2021, at 5:46 PM, David Capwell <dc...@gmail.com> wrote:
>>>> 
>>>> Now that 4.0 is out, I want to bring up improving repair again (earlier
>>>> thread:
>>>> http://mail-archives.apache.org/mod_mbox/cassandra-commits/201911.mbox/%3CJIRA.13266448.1572997299000.99567.1572997440168@Atlassian.JIRA%3E
>>>> ), specifically the following two JIRAs:
>>>> 
>>>> 
>>>> CASSANDRA-15566 - Repair coordinator can hang under some cases
>>>> 
>>>> CASSANDRA-15399 - Add ability to track state in repair
>>>> 
>>>> 
>>>> Right now repair has an issue if any message is lost, which leads to hung
>>>> or timed-out repairs; in addition there is very little visibility into
>>>> what is going on, and it is even harder if you wish to join coordinator
>>>> state with participant state.
>>>> 
>>>> 
>>>> I propose the following changes to improve our current repair subsystem:
>>>> 
>>>> 
>>>> 
>>>> 1. New tracking system for coordinator and participants (covered by
>>>> CASSANDRA-15399).  This system will expose progress on each instance and
>>>> expose this information for internal access as well as external users.
>>>> 2. Add retries to specific stages of coordination, such as prepare and
>>>> validate.  In order to do these retries we first need to know what the
>>>> state is for the participant which has yet to reply; this will leverage
>>>> CASSANDRA-15399 to see what's going on (has the prepare been seen?  Is
>>>> validation running? Did it complete?).  In addition to checking the
>>>> state, we will need to store the validation MerkleTree; this allows the
>>>> coordinator to fetch it again if it goes missing (it can be dropped en
>>>> route to the coordinator or even on the coordinator).
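
The state check described in item 2 could look roughly like the following Python sketch. It is an illustration under assumptions, not the CASSANDRA-15399 design: the State enum, ValidationParticipant, next_action, and the string standing in for a Merkle tree are all hypothetical names. The coordinator asks what the participant knows about the session, and only re-triggers validation when the request was never seen; if validation already finished, it fetches the stored tree instead of recomputing it.

```python
from enum import Enum


class State(Enum):
    UNKNOWN = "unknown"      # participant never saw the request
    RUNNING = "running"      # validation in progress
    COMPLETED = "completed"  # validation done, tree stored


class ValidationParticipant:
    """Hypothetical participant that tracks per-session validation state."""

    def __init__(self):
        self.state = {}   # session id -> State
        self.trees = {}   # session id -> stored validation result
        self.validations_started = 0

    def get_state(self, session_id):
        return self.state.get(session_id, State.UNKNOWN)

    def start_validation(self, session_id):
        self.validations_started += 1
        self.state[session_id] = State.RUNNING

    def finish_validation(self, session_id, tree):
        # Keep the result so the coordinator can fetch it again if the
        # original response is lost.
        self.trees[session_id] = tree
        self.state[session_id] = State.COMPLETED

    def fetch_tree(self, session_id):
        return self.trees[session_id]


def next_action(participant, session_id):
    """Decide what a coordinator does for a participant that has not replied."""
    state = participant.get_state(session_id)
    if state is State.UNKNOWN:
        participant.start_validation(session_id)  # request was lost: resend
        return ("resent", None)
    if state is State.RUNNING:
        return ("wait", None)  # do not trigger a second validation
    return ("fetched", participant.fetch_tree(session_id))
```

Checking state first is what avoids starting a second validation while one is still running; the stored result covers the case where the response was dropped after validation completed.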
>>>> 
>>>> 
>>>> What is not in scope?
>>>> 
>>>> - Rewriting all of Repair; the idea is specific "small" changes can fix
>>>> 80% of the issues
>>>> - Handling coordinator node failure.  Being able to recover from a failed
>>>> coordinator should be possible after the above work is done, so it is
>>>> seen as tangential and can be done later
>>>> - Recovery from a downed participant.  Similar to the previous bullet,
>>>> with the state being tracked this acts as a kind of checkpoint, so future
>>>> work can come in to handle recovery
>>>> - Handling "too large" range.  Ideally we should add an ability to split
>>>> the coordination into sub repairs, but this is not the goal of this work.
>>>> - Overstreaming.  This is a byproduct of the previous "not in scope"
>>>> bullet, and/or large partitions, so it is tangential to this work
>>>> 
>>>> 
>>>> Wanted to share here before starting this work again; let me know if there
>>>> are any concerns or feedback!
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
>>> For additional commands, e-mail: dev-help@cassandra.apache.org
>>> 
>>> 
> 
> 
> 

