Posted to issues@kudu.apache.org by "Will Berkeley (JIRA)" <ji...@apache.org> on 2018/01/03 17:54:04 UTC

[jira] [Commented] (KUDU-2247) Update doc on the 'kudu tablet change_config move_replica' after 3-4-3 enabled by default

    [ https://issues.apache.org/jira/browse/KUDU-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309982#comment-16309982 ] 

Will Berkeley commented on KUDU-2247:
-------------------------------------

bq. Also, it makes sense to update the CLI tool to accept only the source replica as an argument

The purpose of the move_replica tool is to let a user move a replica from one server to another. If we specify only the server the replica is removed from, the tool is no different from remove_replica. I think the situation described here is rare and detectable by the tool, so until a solution is available the tool can run a final check that the end state is the state it set out to achieve and, if not, either try again or give a helpful message about why things went awry.
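
To make the final check concrete, here is a minimal sketch in Python. It is not Kudu code: the function name check_move_result and its arguments are hypothetical, and it assumes the tool can obtain the list of voter replicas in the tablet's final configuration.

```python
def check_move_result(voters, source, target):
    """Compare the final voter set with the requested move.

    Hypothetical helper, not part of the kudu CLI: 'voters' is the list of
    voter replicas in the final config, 'source' is the replica the user
    asked to move away from, 'target' the replica they asked to move to.
    Returns (ok, message).
    """
    voters = set(voters)
    problems = []
    if source in voters:
        problems.append(f"source {source} is still a voter")
    if target not in voters:
        problems.append(f"target {target} is not a voter")
    if problems:
        return False, "move did not reach the requested state: " + "; ".join(problems)
    return True, "move completed as requested"
```

For the scenario in the issue description, check_move_result(["B", "C", "Y"], "A", "X") reports that target X is not a voter, which is the point where the tool could retry or print an explanation.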

> Update doc on the 'kudu tablet change_config move_replica' after 3-4-3 enabled by default
> -----------------------------------------------------------------------------------------
>
>                 Key: KUDU-2247
>                 URL: https://issues.apache.org/jira/browse/KUDU-2247
>             Project: Kudu
>          Issue Type: Task
>          Components: consensus, documentation, ops-tooling
>    Affects Versions: 1.7.0
>            Reporter: Alexey Serbin
>            Assignee: Alexey Serbin
>
> Replica replacement in the 3-4-3 v1 design scheme has a few corner cases in very specific run-time scenarios.  They arise because the {{SUPERCEDES}} attribute, which is part of the full 3-4-3 design proposal (option E), is absent from the 3-4-3 v1 design.
> * Initial configuration is {{[ A(V:\+\:), B(V:\+\:), C(V:\+\:) ]}}
> * A voter replica {{A}} is marked with the {{REPLACE}} attribute: {{[ A(V:\+:REPLACE=true), B(V:\+\:), C(V:\+\:) ]}}
> * A non-voter replica {{X}} is added to replace replica {{A}}: {{[ A(V:\+:REPLACE=true), B(V:\+\:), C(V:\+\:), X(N:\+:PROMOTE=true) ]}}
> * Replica {{B}} fails, so the system adds another non-voter replica {{Y}} to replace the failed replica: {{[ A(V:\+:REPLACE=true), B(V:\-\:), C(V:\+\:), X(N:\+:PROMOTE=true), Y(N:\+:PROMOTE=true) ]}}
> * After some time, before tablet copying is complete for either of the two replicas {{X}} or {{Y}}, replica {{B}} comes back, so the system evicts replica {{X}}: {{[ A(V:\+:REPLACE=true), B(V:\+\:), C(V:\+\:), Y(N:\+:PROMOTE=true) ]}}
> * Eventually, replica {{Y}} completes copying the data, catches up with the leader and is promoted by the leader replica: {{[ A(V:\+:REPLACE=true), B(V:\+\:), C(V:\+\:), Y(V:\+\:) ]}}
> * The next step is removing replica {{A}}, so the resulting configuration is {{[ B(V:\+\:), C(V:\+\:), Y(V:\+\:) ]}} instead of the expected {{[ B(V:\+\:), C(V:\+\:), X(V:\+\:) ]}}
> In this context, it's necessary to document that the 'target' replica is not the guaranteed destination of the replica move process, but just a pivot.  Also, it makes sense to update the CLI tool to accept only the source replica as an argument, where the target replica is selected by the system itself (if it's not so already).
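
The sequence of configuration changes in the issue description can be replayed with a toy model. This is only an illustrative Python sketch, not Kudu code: the dictionary layout and the show helper are assumptions made for the example, and each step is applied by hand rather than by any real consensus logic.

```python
# Toy model of a tablet's Raft configuration; not Kudu code.
# Each replica maps to (member type, health, attribute), mirroring the
# notation in the description, e.g. A(V:+:REPLACE=true).
config = {r: ("V", "+", "") for r in "ABC"}

def show(cfg):
    """Render the config in the bracketed notation used in the description."""
    return "[ " + ", ".join(f"{r}({t}:{h}:{a})" for r, (t, h, a) in cfg.items()) + " ]"

config["A"] = ("V", "+", "REPLACE=true")   # A is marked for replacement
config["X"] = ("N", "+", "PROMOTE=true")   # non-voter X is added to replace A
config["B"] = ("V", "-", "")               # B fails
config["Y"] = ("N", "+", "PROMOTE=true")   # non-voter Y is added to replace B
config["B"] = ("V", "+", "")               # B comes back before copying finishes
del config["X"]                            # the system evicts X
config["Y"] = ("V", "+", "")               # Y catches up and is promoted
del config["A"]                            # A is removed

final_voters = sorted(r for r, (t, _, _) in config.items() if t == "V")
print(show(config))   # prints "[ B(V:+:), C(V:+:), Y(V:+:) ]"
```

The final voters are B, C and Y, so the replica the user named as the target (X) is not part of the resulting configuration.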


