Posted to issues@ozone.apache.org by "Zita Dombi (Jira)" <ji...@apache.org> on 2022/02/25 12:27:00 UTC

[jira] [Created] (HDDS-6379) Not deducting the STANDALONE pipelines when counting pipelines on each datanode to check the pipeline limit.

Zita Dombi created HDDS-6379:
--------------------------------

             Summary: Not deducting the STANDALONE pipelines when counting pipelines on each datanode to check the pipeline limit.
                 Key: HDDS-6379
                 URL: https://issues.apache.org/jira/browse/HDDS-6379
             Project: Apache Ozone
          Issue Type: Bug
          Components: Ozone Datanode, SCM
    Affects Versions: 1.2.0
            Reporter: Zita Dombi


I found this bug while adding robot tests for the ozone debug CLI, and I was able to reproduce it locally. I had three datanodes and created a new pipeline with the ozone admin pipeline create command, which chose a datanode and created a STANDALONE/ONE pipeline on it. I then stopped a datanode and waited until it reached the DEAD state. After I started it again, no RATIS/THREE pipeline was created, even though there were three healthy datanodes and no existing RATIS/THREE pipeline.
In the docker-config the ozone.scm.datanode.pipeline.limit property is set to 1 (the default is 2) due to the multi raft support. When we try to create the pipeline, we build a list of healthy datanodes and filter it based on the pipeline limit. The current pipeline count on a datanode is calculated like this:
{code:java}
  int currentPipelineCount(DatanodeDetails datanodeDetails, int nodesRequired) {

    // Datanodes from pipeline in some states can also be considered available
    // for pipeline allocation. Thus the number of these pipeline shall be
    // deducted from total heaviness calculation.
    int pipelineNumDeductable = 0;
    Set<PipelineID> pipelines = nodeManager.getPipelines(datanodeDetails);
    for (PipelineID pid : pipelines) {
      Pipeline pipeline;
      try {
        pipeline = stateManager.getPipeline(pid);
      } catch (PipelineNotFoundException e) {
        LOG.debug("Pipeline not found in pipeline state manager during" +
            " pipeline creation. PipelineID: {}", pid, e);
        continue;
      }
      if (pipeline != null &&
          // single node pipeline are not accounted for while determining
          // the pipeline limit for dn
          pipeline.getType() == HddsProtos.ReplicationType.RATIS &&
          (RatisReplicationConfig
              .hasFactor(pipeline.getReplicationConfig(), ReplicationFactor.ONE)
              ||
              pipeline.getReplicationConfig().getRequiredNodes()
                  == nodesRequired &&
                  pipeline.getPipelineState()
                      == Pipeline.PipelineState.CLOSED)) {
        pipelineNumDeductable++;
      }
    }
    return pipelines.size() - pipelineNumDeductable;
  }
{code}
Only pipelines with the RATIS replication type are deducted (because of the condition pipeline.getType() == HddsProtos.ReplicationType.RATIS), so the STANDALONE/ONE pipeline is still counted. That datanode therefore reaches the pipeline limit and is filtered out of the healthy datanode list, so no RATIS/THREE pipeline is created.
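
To make the arithmetic concrete, here is a minimal, self-contained sketch of the counting logic. It does not use the real Ozone classes: PipelineInfo, Type, State and both method names are hypothetical stand-ins, the "factor ONE" check is simplified to requiredNodes == 1, and the eligibility check count < limit is an assumption about how the filter uses the count. The second method is only one possible adjustment that also deducts single-node pipelines of other types; it is not necessarily the fix that should be applied.

```java
import java.util.List;

public class PipelineLimitSketch {
    enum Type { RATIS, STANDALONE }
    enum State { OPEN, CLOSED }

    // Hypothetical, simplified stand-in for Ozone's Pipeline class.
    static final class PipelineInfo {
        final Type type;
        final int requiredNodes;
        final State state;

        PipelineInfo(Type type, int requiredNodes, State state) {
            this.type = type;
            this.requiredNodes = requiredNodes;
            this.state = state;
        }
    }

    // Mirrors the quoted condition: only RATIS pipelines can be deducted.
    static int countAsQuoted(List<PipelineInfo> pipelines, int nodesRequired) {
        int deductible = 0;
        for (PipelineInfo p : pipelines) {
            if (p.type == Type.RATIS
                    && (p.requiredNodes == 1
                        || (p.requiredNodes == nodesRequired
                            && p.state == State.CLOSED))) {
                deductible++;
            }
        }
        return pipelines.size() - deductible;
    }

    // Assumed variant: single-node pipelines are deducted regardless of type.
    static int countDeductingSingleNode(List<PipelineInfo> pipelines,
                                        int nodesRequired) {
        int deductible = 0;
        for (PipelineInfo p : pipelines) {
            boolean singleNode = p.requiredNodes == 1;
            boolean closedRatisOfSameSize = p.type == Type.RATIS
                    && p.requiredNodes == nodesRequired
                    && p.state == State.CLOSED;
            if (singleNode || closedRatisOfSameSize) {
                deductible++;
            }
        }
        return pipelines.size() - deductible;
    }

    public static void main(String[] args) {
        int limit = 1; // ozone.scm.datanode.pipeline.limit in the docker-config
        // The datanode from the report: it holds one STANDALONE/ONE pipeline.
        List<PipelineInfo> onDatanode =
                List.of(new PipelineInfo(Type.STANDALONE, 1, State.OPEN));

        int quoted = countAsQuoted(onDatanode, 3);
        int variant = countDeductingSingleNode(onDatanode, 3);

        // Assumption: a datanode is eligible while its count is below the limit.
        System.out.println("quoted logic: count=" + quoted
                + ", eligible=" + (quoted < limit));
        System.out.println("variant:      count=" + variant
                + ", eligible=" + (variant < limit));
    }
}
```

With the limit set to 1, the quoted logic yields a count of 1 for that datanode, so it is excluded and only two candidates remain for a RATIS/THREE pipeline. The variant yields 0, which would leave all three datanodes eligible.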


