Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/07/16 04:37:34 UTC

[GitHub] [incubator-doris] liutang123 opened a new issue #4104: No available BE to choose when repair

liutang123 opened a new issue #4104:
URL: https://github.com/apache/incubator-doris/issues/4104


   **Describe the bug**
   There are 4 BEs: A, B, C and D.
   Partition p1 in table t has 3 replicas on A (good), B (good) and C (version miss), plus a new replica on D (copied from A).
   If D is marked as decommissioned before the replica on C is deleted, FE will never delete the replicas on C and D, and the decommission will never finish.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] liutang123 commented on issue #4104: No available BE to choose when repair

Posted by GitBox <gi...@apache.org>.
liutang123 commented on issue #4104:
URL: https://github.com/apache/incubator-doris/issues/4104#issuecomment-661555852


   > > There are A, B, C and D 4 BEs.
   > > tablet t has 3 replicas in A(good), B(good) and C(version miss).
   > > If now mark D as decommissioned. FE will never delete replica in C and can not choose D to create new replica.
   > 
   > I think this case will fall into this branch:
   > 
   > https://github.com/apache/incubator-doris/blob/2de4f2471bd729e9f723ca8cdd5d9abfe69bf8eb/fe/src/main/java/org/apache/doris/catalog/Tablet.java#L476-L478
   > 
   > So it can not run to the code you modified in #4105
   
   Sorry, I skipped a step.
   Consider the following:
   There are 4 BEs: A, B, C and D.
   Tablet t has 3 replicas on A (good), B (good) and C (version miss).
   D is chosen to create a new replica, giving A (good), B (good), C (version miss) and D (good).
   Now `aliveAndVersionComplete` is equal to `replicationNum`.
   If D is decommissioned before the replica on C is deleted, the repair will hang.
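   
   The counting argument above can be sketched in Java. This is only an illustration: the `Replica` class and both helper methods are hypothetical and are not the actual Doris FE code. The sketch assumes that all four BEs are alive and that a replica counts as complete when its version is not missing.
   
   ```java
   import java.util.Arrays;
   import java.util.List;

   // Hypothetical model for illustration only, not the Doris FE classes.
   class ReplicaCountSketch {
       // One replica of a tablet: the BE hosting it and whether its version is complete.
       static final class Replica {
           final String backend;
           final boolean versionComplete;

           Replica(String backend, boolean versionComplete) {
               this.backend = backend;
               this.versionComplete = versionComplete;
           }
       }

       // Counts replicas with a complete version (every BE is assumed alive here).
       static int aliveAndVersionComplete(List<Replica> replicas) {
           int count = 0;
           for (Replica r : replicas) {
               if (r.versionComplete) {
                   count++;
               }
           }
           return count;
       }

       // Counts complete replicas hosted on BEs that are not being decommissioned.
       static int completeOnRemainingBackends(List<Replica> replicas, List<String> decommissioned) {
           int count = 0;
           for (Replica r : replicas) {
               if (r.versionComplete && !decommissioned.contains(r.backend)) {
                   count++;
               }
           }
           return count;
       }

       public static void main(String[] args) {
           int replicationNum = 3;
           List<Replica> tablet = Arrays.asList(
                   new Replica("A", true), new Replica("B", true),
                   new Replica("C", false), new Replica("D", true));
           // The raw count already equals replicationNum.
           System.out.println(aliveAndVersionComplete(tablet) == replicationNum);
           // Once D is decommissioned, only two complete replicas sit on remaining BEs.
           System.out.println(completeOnRemainingBackends(tablet, Arrays.asList("D")));
       }
   }
   ```
   
   Under these assumptions the raw count still equals `replicationNum`, while only two complete replicas are on BEs that will stay in the cluster.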


[GitHub] [incubator-doris] liutang123 edited a comment on issue #4104: No available BE to choose when repair

Posted by GitBox <gi...@apache.org>.
liutang123 edited a comment on issue #4104:
URL: https://github.com/apache/incubator-doris/issues/4104#issuecomment-661555852


   > > There are A, B, C and D 4 BEs.
   > > tablet t has 3 replicas in A(good), B(good) and C(version miss).
   > > If now mark D as decommissioned. FE will never delete replica in C and can not choose D to create new replica.
   > 
   > I think this case will fall into this branch:
   > 
   > https://github.com/apache/incubator-doris/blob/2de4f2471bd729e9f723ca8cdd5d9abfe69bf8eb/fe/src/main/java/org/apache/doris/catalog/Tablet.java#L476-L478
   > 
   > So it can not run to the code you modified in #4105
   
   Sorry, decommissioning a single BE may not trigger this bug.


[GitHub] [incubator-doris] liutang123 closed issue #4104: No available BE to choose when repair

Posted by GitBox <gi...@apache.org>.
liutang123 closed issue #4104:
URL: https://github.com/apache/incubator-doris/issues/4104


   


[GitHub] [incubator-doris] morningman edited a comment on issue #4104: No available BE to choose when repair

Posted by GitBox <gi...@apache.org>.
morningman edited a comment on issue #4104:
URL: https://github.com/apache/incubator-doris/issues/4104#issuecomment-661081118


   > There are A, B, C and D 4 BEs.
   > tablet t has 3 replicas in A(good), B(good) and C(version miss).
   > If now mark D as decommissioned. FE will never delete replica in C and can not choose D to create new replica.
   
   I think this case will fall into this branch:
   
   https://github.com/apache/incubator-doris/blob/2de4f2471bd729e9f723ca8cdd5d9abfe69bf8eb/fe/src/main/java/org/apache/doris/catalog/Tablet.java#L476-L478
   
   So it cannot reach the code you modified in #4105.


[GitHub] [incubator-doris] morningman commented on issue #4104: No available BE to choose when repair

Posted by GitBox <gi...@apache.org>.
morningman commented on issue #4104:
URL: https://github.com/apache/incubator-doris/issues/4104#issuecomment-661081118


   > There are A, B, C and D 4 BEs.
   > tablet t has 3 replicas in A(good), B(good) and C(version miss).
   > If now mark D as decommissioned. FE will never delete replica in C and can not choose D to create new replica.
   
   I think this case will fall into this branch:
   
   https://github.com/apache/incubator-doris/blob/2de4f2471bd729e9f723ca8cdd5d9abfe69bf8eb/fe/src/main/java/org/apache/doris/catalog/Tablet.java#L476-L478


[GitHub] [incubator-doris] liutang123 edited a comment on issue #4104: No available BE to choose when repair

Posted by GitBox <gi...@apache.org>.
liutang123 edited a comment on issue #4104:
URL: https://github.com/apache/incubator-doris/issues/4104#issuecomment-661555852


   > > There are A, B, C and D 4 BEs.
   > > tablet t has 3 replicas in A(good), B(good) and C(version miss).
   > > If now mark D as decommissioned. FE will never delete replica in C and can not choose D to create new replica.
   > 
   > I think this case will fall into this branch:
   > 
   > https://github.com/apache/incubator-doris/blob/2de4f2471bd729e9f723ca8cdd5d9abfe69bf8eb/fe/src/main/java/org/apache/doris/catalog/Tablet.java#L476-L478
   > 
   > So it can not run to the code you modified in #4105
   
   Sorry, decommissioning a single BE may not trigger this bug.
   Please refer to the more complicated situation below.
   
   > The complex situation is:
   > There are 5 BEs: A, B, C, D and E.
   > tabletA has replicas a1(A)(good), a2(B)(good), a3(C)(version miss), a4(D)(good)
   > tabletB has replicas b1(A)(good), b2(B)(good), b3(C)(version miss), b4(E)(good)
   > tabletA and tabletB are both `REPLICA_RELOCATING`, but neither can find a BE on which to create a new replica.
   > a4 prevents D from being deleted and b4 prevents E from being deleted.
   > The repair and the decommission will both hang.
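   
   A small Java sketch of why neither tablet finds a target BE in the situation above. It is not Doris code: the `candidates` helper is hypothetical, and the sketch assumes that D and E are the BEs being decommissioned and that a BE already hosting a replica of a tablet cannot host a second one.
   
   ```java
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.List;

   // Hypothetical model for illustration only, not the Doris FE scheduler.
   class RelocationCandidateSketch {
       // Returns the BEs that are neither decommissioned nor already hosting
       // a replica of the tablet (both conditions are assumptions of this sketch).
       static List<String> candidates(List<String> allBackends,
                                      List<String> decommissioned,
                                      List<String> replicaHosts) {
           List<String> result = new ArrayList<>();
           for (String be : allBackends) {
               if (!decommissioned.contains(be) && !replicaHosts.contains(be)) {
                   result.add(be);
               }
           }
           return result;
       }

       public static void main(String[] args) {
           List<String> all = Arrays.asList("A", "B", "C", "D", "E");
           // Assumption: D and E are the decommissioned BEs.
           List<String> decommissioned = Arrays.asList("D", "E");
           List<String> tabletAHosts = Arrays.asList("A", "B", "C", "D");
           List<String> tabletBHosts = Arrays.asList("A", "B", "C", "E");

           // Both lists come out empty: A, B and C already host a replica,
           // and the only other BEs are themselves being decommissioned.
           System.out.println(candidates(all, decommissioned, tabletAHosts)); // prints []
           System.out.println(candidates(all, decommissioned, tabletBHosts)); // prints []
       }
   }
   ```
   
   With no candidate BE for either tablet, a4 and b4 stay where they are, which matches the hang described above.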
   
   

