Posted to issues@ozone.apache.org by "Xu Shao Hong (Jira)" <ji...@apache.org> on 2021/11/02 03:05:00 UTC

[jira] [Updated] (HDDS-5916) DNs in pipeline raft group get stuck in infinite leader election in Kubernetes env

     [ https://issues.apache.org/jira/browse/HDDS-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xu Shao Hong updated HDDS-5916:
-------------------------------
    Attachment: wecom-temp-5c5afba22bfcf188415ad622f82f66af.png

> DNs in pipeline raft group get stuck in infinite leader election in Kubernetes env
> ----------------------------------------------------------------------------------
>
>                 Key: HDDS-5916
>                 URL: https://issues.apache.org/jira/browse/HDDS-5916
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Xu Shao Hong
>            Priority: Critical
>         Attachments: wecom-temp-096bc77af479d5e6c280bbcaa35b7fe5.png, wecom-temp-56d8d0bcd030797a228dbb32e0dfa0f1.png, wecom-temp-5c5afba22bfcf188415ad622f82f66af.png
>
>
> During a chaos test, 10% of the DNs were killed to simulate an accidental failure.
> Env:
> Kubernetes + PV
>  
> Phenomenon:
> The key write rate dropped sharply and then flattened into a horizontal line.
> Even after the chaos injection ended, the rate did not recover.
> In addition, the scm_pipeline_metrics_num_pipeline_allocated metric showed new pipelines being created periodically, without end.
> Datanodes held leader elections continuously, and the raft group could not stabilize even after a leader was elected.
>  
> Reason:
> The DN pods were killed once, and each revived pod might come back with a different IP address than before. SCM still receives heartbeats from them and treats them as healthy, because the PV keeps the DN UUID unchanged. SCM currently does not update the IP in the DatanodeDetails, so it passes stale address information for these datanodes into newly allocated pipelines.
> In the raft group, suppose the three raft peers are A, B, and C. A was revived and has a new IP address. A can contact B and C, but B and C cannot contact A. A therefore never receives heartbeats from the leader (B or C) and keeps cycling between follower and candidate. Each time A becomes a candidate, it increments the term, starts a leader election, and successfully sends the requestVote to B and C. Once the leader receives the requestVote, it steps down and a re-election follows. This explains why the raft group in the pipeline never stabilizes.
> Meanwhile, the short-lived leader can still send the ready message to the SCM, and the SCM mistakenly concludes that the pipeline is ready for chunk writes, which causes blocking issues.
>  
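The election churn described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Ratis or Ozone code: the class ElectionChurnSketch and its fields are hypothetical, A is assumed to learn the group's current term from the reply to its own requestVote, and A is assumed never to win the election itself. It only shows that every election timeout on A forces the current leader to step down and pushes the term higher.

```java
// Toy model of the asymmetric-reachability scenario; all names are hypothetical.
final class ElectionChurnSketch {

    /** Outcome of a simulation run. */
    static final class Result {
        final int stepDowns;   // how many times a sitting leader was forced to step down
        final long finalTerm;  // term shared by B and C at the end
        Result(int stepDowns, long finalTerm) {
            this.stepDowns = stepDowns;
            this.finalTerm = finalTerm;
        }
    }

    /** Simulates the given number of election timeouts on the revived peer A. */
    static Result simulate(int timeoutsOnA) {
        long groupTerm = 1; // term known to B and C, one of which is the leader
        long termOfA = 1;   // term known to A
        int stepDowns = 0;

        for (int i = 0; i < timeoutsOnA; i++) {
            // Heartbeats go to A's old IP and are lost, so A times out
            // and becomes a candidate in a term above anything it has seen.
            termOfA = Math.max(termOfA, groupTerm) + 1;

            // A -> B and A -> C still work: the requestVote carries a higher
            // term, so the sitting leader steps down.
            if (termOfA > groupTerm) {
                groupTerm = termOfA;
                stepDowns++;
            }

            // Assumption: B and C then elect a leader between themselves
            // in the next term, and the cycle can start again.
            groupTerm++;
        }
        return new Result(stepDowns, groupTerm);
    }

    public static void main(String[] args) {
        Result r = simulate(5);
        System.out.println("stepDowns=" + r.stepDowns + " finalTerm=" + r.finalTerm);
    }
}
```

In this model the leader never survives a single timeout interval on A, which matches the report that the group never stabilizes while the term keeps growing.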
> Possible solution:
> Have either the DN itself or the SCM check the datanodeDetails and update the IP if necessary.
>  
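One way the SCM-side variant of the proposed fix might look is sketched below. It is a minimal illustration, not the actual Ozone API: DatanodeAddressRegistry, onHeartbeat, and ipOf are hypothetical names, and the real DatanodeDetails carries far more than an IP string. The idea is to key the record by the stable DN UUID, overwrite the address with the one reported in each heartbeat, and signal when it changed so that dependent pipelines can be refreshed.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the SCM's view of datanode addresses.
final class DatanodeAddressRegistry {
    private final Map<UUID, String> ipByUuid = new ConcurrentHashMap<>();

    /**
     * Records the IP reported in a heartbeat.
     * Returns true only when a known datanode reported a different IP
     * than the one stored, i.e. the pod was revived at a new address.
     */
    boolean onHeartbeat(UUID datanodeUuid, String reportedIp) {
        String previous = ipByUuid.put(datanodeUuid, reportedIp);
        return previous != null && !previous.equals(reportedIp);
    }

    /** Returns the most recently reported IP, or null for an unknown datanode. */
    String ipOf(UUID datanodeUuid) {
        return ipByUuid.get(datanodeUuid);
    }

    public static void main(String[] args) {
        DatanodeAddressRegistry registry = new DatanodeAddressRegistry();
        UUID dn = UUID.randomUUID();
        registry.onHeartbeat(dn, "10.0.0.1");                   // first registration
        boolean changed = registry.onHeartbeat(dn, "10.0.0.9"); // pod revived with a new IP
        System.out.println("changed=" + changed + " ip=" + registry.ipOf(dn));
    }
}
```

A caller could use the true return value as the trigger to stop handing out the stale address in newly allocated pipelines.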



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org