Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/04/01 16:08:48 UTC

[GitHub] [hadoop-ozone] sodonnel commented on issue #668: HDDS-3139 Pipeline placement should max out pipeline usage

sodonnel commented on issue #668: HDDS-3139 Pipeline placement should max out pipeline usage
URL: https://github.com/apache/hadoop-ozone/pull/668#issuecomment-607342420
 
 
   > @sodonnel I rebase the code with minor conflicts in test class, but the test won't pass. I took a close look and made some change. But I realize the issue that I mention in the last comment about how to leverage with chooseNodeFromTopology. Wanna learn your thoughts.
   
   I think one problem is this line:
   
   ```
   datanodeDetails = nodes.stream().findAny().get();
   ```
   The `findAny()` method does not return a random entry - on a sequential stream it tends to return the first element every time - so the same node keeps being returned until it uses up its pipeline allocation.
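   A minimal sketch of the behaviour (node names hypothetical): `findAny()` is allowed to return any element, but on a sequential stream it typically returns the first one on every call, whereas an explicit random index gives the spread that was presumably intended:
   
   ```java
   import java.util.List;
   import java.util.Random;
   
   public class FindAnyDemo {
       public static void main(String[] args) {
           List<String> nodes = List.of("dn1", "dn2", "dn3");
   
           // findAny() on a sequential stream: in practice the first
           // element comes back on every invocation.
           for (int i = 0; i < 3; i++) {
               System.out.println(nodes.stream().findAny().get()); // "dn1" each time
           }
   
           // Picking an explicit random index spreads the load instead:
           Random random = new Random();
           String picked = nodes.get(random.nextInt(nodes.size()));
           System.out.println(picked);
       }
   }
   ```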
   
   I am also not sure about the limit calculation in `getLowerLoadNodes()`:
   
   ```
    int limit = nodes.size() * heavyNodeCriteria
           / HddsProtos.ReplicationFactor.THREE.getNumber();
   ```
   
   With some debugging added, I found that this method starts to return an empty list while there are still nodes available to take the pipeline.
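   A concrete illustration of why the list can come back short (the values are hypothetical, not taken from the test): the integer division truncates, so with 5 healthy nodes and a heavy-node criterion of 2 the limit is 3, even though all 5 nodes may still be below their pipeline cap:
   
   ```java
   public class LimitDemo {
       public static void main(String[] args) {
           int nodeCount = 5;          // hypothetical healthy node count
           int heavyNodeCriteria = 2;  // hypothetical max pipelines per node
           int replicationFactor = 3;  // HddsProtos.ReplicationFactor.THREE.getNumber()
   
           // Same formula as in getLowerLoadNodes(); integer division truncates.
           int limit = nodeCount * heavyNodeCriteria / replicationFactor;
           System.out.println(limit); // 5 * 2 / 3 == 3, capping the returned list
       }
   }
   ```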
   
   Also, in `filterViableNodes()` via the `meetCriteria()` method, nodes at or above the heavy load limit are already filtered out, so the healthy node list is guaranteed to contain only nodes with the capacity to take another pipeline. So I wonder why we need to filter the nodes further.
   
   > But I realize the issue that I mention in the last comment about how to leverage with chooseNodeFromTopology.
   
   There seems to be some inconsistency in how we pick the nodes (not just in this PR, but in the wider code). E.g. in `chooseNodeBasedOnRackAwareness()` we don't call into `NetworkTopology`, but instead use the `getNetworkLocation()` method on the `DatanodeDetails` object to find nodes that do not match the anchor's location.
   
   Then later, in `chooseNodeFromNetworkTopology()`, we try to find a node whose location is equal to the anchor's, and that is where we call into `networkTopology.chooseRandom()`. Could we not avoid that call - and avoid generating a new list of nodes - by doing something similar to `chooseNodeBasedOnRackAwareness()`, using the `getNetworkLocation()` method to find matching nodes? That would probably be more efficient than the current implementation.
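   A rough sketch of the suggested approach (the `Node` record is a hypothetical stand-in for `DatanodeDetails`, keeping only the one accessor that matters here): compare network locations directly instead of going through `networkTopology.chooseRandom()`:
   
   ```java
   import java.util.List;
   import java.util.stream.Collectors;
   
   public class LocationFilterDemo {
       // Hypothetical stand-in for DatanodeDetails.
       record Node(String name, String location) {
           String getNetworkLocation() { return location; }
       }
   
       public static void main(String[] args) {
           Node anchor = new Node("dn1", "/rack1");
           List<Node> healthy = List.of(
               new Node("dn2", "/rack1"),
               new Node("dn3", "/rack2"),
               new Node("dn4", "/rack1"));
   
           // Mirror chooseNodeBasedOnRackAwareness(): a plain location
           // comparison, no call into the topology object.
           List<Node> sameRack = healthy.stream()
               .filter(n -> n.getNetworkLocation()
                             .equals(anchor.getNetworkLocation()))
               .collect(Collectors.toList());
           System.out.println(sameRack.size()); // dn2 and dn4 match, so 2
       }
   }
   ```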
   
   As we would then be able to re-use the same list of healthy nodes everywhere without further filtering, maybe we could sort that list once by pipeline count in `filterViableNodes()` or `meetCriteria()`, and thereafter always pick the node with the lowest load, filling the nodes up that way.
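   The sort-once idea could look something like this (pipeline counts and node names are hypothetical; in the real code the count would come from the node manager):
   
   ```java
   import java.util.ArrayList;
   import java.util.Comparator;
   import java.util.List;
   import java.util.Map;
   
   public class SortOnceDemo {
       public static void main(String[] args) {
           // Hypothetical current pipeline count per healthy node.
           Map<String, Integer> pipelineCount = Map.of("dn1", 3, "dn2", 1, "dn3", 2);
           List<String> healthy = new ArrayList<>(pipelineCount.keySet());
   
           // Sort once, ascending by pipeline count, then always take the
           // head of the list: the least-loaded node fills up first.
           healthy.sort(Comparator.comparingInt(pipelineCount::get));
           System.out.println(healthy.get(0)); // "dn2", the least-loaded node
       }
   }
   ```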
   
   I hope this comment makes sense as it is very long.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org