Posted to issues@ozone.apache.org by "Attila Doroszlai (Jira)" <ji...@apache.org> on 2020/11/17 08:04:00 UTC
[jira] [Created] (HDDS-4473) Reduce number of sortDatanodes RPC calls

Attila Doroszlai created HDDS-4473:
--------------------------------------

             Summary: Reduce number of sortDatanodes RPC calls
                 Key: HDDS-4473
                 URL: https://issues.apache.org/jira/browse/HDDS-4473
             Project: Hadoop Distributed Data Store
          Issue Type: Improvement
          Components: OM
            Reporter: Attila Doroszlai


{{KeyManagerImpl#listStatus}} and the {{sortDatanodeInPipeline}} helper method sort datanodes using a separate RPC call for each key location info.

{code:title=https://github.com/apache/ozone/blob/d0aa34c4afae21538c6c6225f029c1d1c4c4bafd/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java#L2312-L2328}
  private void sortDatanodeInPipeline(OmKeyInfo keyInfo, String clientMachine) {
    if (keyInfo != null && clientMachine != null && !clientMachine.isEmpty()) {
      for (OmKeyLocationInfoGroup key : keyInfo.getKeyLocationVersions()) {
        key.getLocationList().forEach(k -> {
          List<DatanodeDetails> nodes = k.getPipeline().getNodes();
          if (nodes == null || nodes.isEmpty()) {
            LOG.warn("Datanodes for pipeline {} is empty",
                k.getPipeline().getId().toString());
            return;
          }
          List<String> nodeList = new ArrayList<>();
          nodes.stream().forEach(node ->
              nodeList.add(node.getUuidString()));
          try {
            List<DatanodeDetails> sortedNodes = scmClient.getBlockClient()
                .sortDatanodes(nodeList, clientMachine);
            k.getPipeline().setNodesInOrder(sortedNodes);
{code}

Problems, possible improvements:

# All location versions are processed.  Would it be enough to process only the "latest" version, which is used for read?
# Each key location is queried separately, even if the same pipeline was already updated in a previous request.  Could be improved by keeping track of processed pipelines.
# Further improvement may be possible by sending a single {{sortDatanodes}} request for all datanodes in all relevant pipelines, then creating the per-pipeline lists locally.
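
A rough sketch of ideas 2 and 3, to make the discussion concrete. It does not use the real Ozone classes: {{NodeSorter}} is a hypothetical stand-in for the {{sortDatanodes}} RPC, and {{Location}} is a hypothetical, simplified key location holding only a pipeline ID and node UUIDs. The single-request variant additionally assumes that the order SCM returns for the combined node list is also a valid order for each pipeline's subset, which would need to be verified.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SortDatanodesSketch {

  /** Hypothetical stand-in for the sortDatanodes RPC (not the real Ozone API). */
  public interface NodeSorter {
    List<String> sort(List<String> nodeUuids, String clientMachine);
  }

  /** Hypothetical, simplified key location: a pipeline ID plus its node UUIDs. */
  public static final class Location {
    public final String pipelineId;
    public List<String> nodes;

    public Location(String pipelineId, List<String> nodes) {
      this.pipelineId = pipelineId;
      this.nodes = nodes;
    }
  }

  /** Idea 2: remember pipelines already sorted. Returns the number of RPCs made. */
  public static int sortOncePerPipeline(
      List<Location> locations, String clientMachine, NodeSorter sorter) {
    Map<String, List<String>> sortedByPipeline = new HashMap<>();
    int calls = 0;
    for (Location loc : locations) {
      List<String> sorted = sortedByPipeline.get(loc.pipelineId);
      if (sorted == null) {
        // First time this pipeline is seen: one RPC, result cached by pipeline ID.
        sorted = sorter.sort(loc.nodes, clientMachine);
        sortedByPipeline.put(loc.pipelineId, sorted);
        calls++;
      }
      loc.nodes = new ArrayList<>(sorted);
    }
    return calls;
  }

  /**
   * Idea 3: one RPC for the union of all nodes, then build per-pipeline lists locally.
   * Assumes the order of the combined result is also valid for each subset.
   */
  public static void sortWithSingleRequest(
      List<Location> locations, String clientMachine, NodeSorter sorter) {
    Set<String> allNodes = new LinkedHashSet<>();
    for (Location loc : locations) {
      allNodes.addAll(loc.nodes);
    }
    if (allNodes.isEmpty()) {
      return; // nothing to sort, so no RPC at all
    }
    List<String> globalOrder = sorter.sort(new ArrayList<>(allNodes), clientMachine);
    for (Location loc : locations) {
      Set<String> members = new HashSet<>(loc.nodes);
      List<String> sorted = new ArrayList<>();
      for (String node : globalOrder) {
        if (members.contains(node)) {
          sorted.add(node);
        }
      }
      loc.nodes = sorted;
    }
  }
}
```

With many key locations sharing a few pipelines, the first variant issues one request per distinct pipeline and the second issues one request per call, instead of one per key location.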

CC [~bharat]


