Posted to issues@ozone.apache.org by "Ethan Rose (Jira)" <ji...@apache.org> on 2023/08/08 19:43:00 UTC
[jira] [Resolved] (HDDS-3214) Unhealthy datanodes repeatedly participate in pipeline creation
[ https://issues.apache.org/jira/browse/HDDS-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ethan Rose resolved HDDS-3214.
------------------------------
Resolution: Fixed
Resolving this based on a few things:
1. Volume failure should not cause a datanode to be unhealthy, so the premise of the test is incorrect.
2. Volume failure detection has been greatly improved in the 3 years since this jira was filed, which would remove the unhealthy volume from the datanode so it could continue functioning with existing volumes.
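To illustrate point 2, here is a minimal sketch of a datanode that drops a failed volume and keeps serving from the remaining ones. It is not Ozone's implementation: the `Datanode` class, its methods, and the rule that one usable volume is enough are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Datanode:
    """Hypothetical model of a datanode and its data volumes (not an Ozone API)."""
    name: str
    volumes: set = field(default_factory=set)
    failed_volumes: set = field(default_factory=set)

    def report_volume_failure(self, volume: str) -> None:
        # Move the failed volume out of the usable set; the node itself stays up.
        if volume in self.volumes:
            self.volumes.remove(volume)
            self.failed_volumes.add(volume)

    def can_serve_writes(self) -> bool:
        # Assumption for this sketch: one usable volume is enough to keep serving.
        return len(self.volumes) > 0
```

Under this model, a write failure on a single volume takes that volume out of service but does not make the whole datanode unhealthy.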
> Unhealthy datanodes repeatedly participate in pipeline creation
> ---------------------------------------------------------------
>
> Key: HDDS-3214
> URL: https://issues.apache.org/jira/browse/HDDS-3214
> Project: Apache Ozone
> Issue Type: Bug
> Components: SCM
> Reporter: Nilotpal Nandi
> Assignee: Ethan Rose
> Priority: Blocker
> Labels: TriagePending, fault_injection
>
> Steps taken:
> 1) Mounted noise injection FUSE on all datanodes.
> 2) Selected 1 datanode from each open pipeline (factor=3).
> 3) Injected WRITE FAILURE noise with error code ENOENT on the "hdds.datanode.dir" path of each datanode selected in step 2.
> 4) Started a PUT key operation of size 32 MB.
>
> Observation:
> ----------------
> # Commit failed, and the pipelines were moved to the exclusion list.
> # The client retries, and a new pipeline is created with the same set of datanodes. Container creation fails because the WRITE FAILURE injection is still present.
> # The pipeline is closed and the process repeats for up to "ozone.client.max.retries" retries.
> Every time, the same set of datanodes is selected for pipeline creation, including 1 unhealthy datanode.
> Expectation: the pipeline should have been created from 3 of the available healthy datanodes.
>
> cc - [~ljain]
>
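The behavior the reporter expected can be sketched as a retry loop that excludes datanodes, rather than only pipelines, after a failed write. This is an illustrative model only, not SCM's placement policy or the Ozone client's retry logic. The names `choose_pipeline_nodes` and `write_with_retries`, and the idea of tracking failures per datanode, are assumptions for the example.

```python
def choose_pipeline_nodes(datanodes, excluded, factor=3):
    """Pick `factor` datanodes, skipping any in `excluded`.

    Hypothetical helper for illustration; returns None when too few
    candidates remain.
    """
    candidates = [dn for dn in datanodes if dn not in excluded]
    if len(candidates) < factor:
        return None
    return candidates[:factor]


def write_with_retries(datanodes, faulty, max_retries, factor=3):
    """Retry pipeline creation, excluding nodes whose write failed.

    `faulty` stands in for the datanodes with injected write failures.
    Returns (chosen_nodes_or_None, attempts_used).
    """
    excluded = set()
    for attempt in range(1, max_retries + 1):
        nodes = choose_pipeline_nodes(datanodes, excluded, factor)
        if nodes is None:
            return None, attempt
        failed = [dn for dn in nodes if dn in faulty]
        if not failed:
            return nodes, attempt
        # Remember the failing datanodes so the next attempt avoids them.
        excluded.update(failed)
    return None, max_retries
```

With four datanodes and one of them faulty, this model succeeds on the second attempt instead of reselecting the same three nodes on every retry.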