Posted to issues@ozone.apache.org by "Wei-Chiu Chuang (Jira)" <ji...@apache.org> on 2021/03/12 01:30:00 UTC
[jira] [Created] (HDDS-4970) Significant overhead when DataNode is over-subscribed
Wei-Chiu Chuang created HDDS-4970:
-------------------------------------
Summary: Significant overhead when DataNode is over-subscribed
Key: HDDS-4970
URL: https://issues.apache.org/jira/browse/HDDS-4970
Project: Apache Ozone
Issue Type: Bug
Components: Ozone Datanode
Affects Versions: 1.0.0
Reporter: Wei-Chiu Chuang
Attachments: Screen Shot 2021-03-11 at 11.58.23 PM.png
Ran a microbenchmark in which concurrent clients read chunks from a DataNode.
As the number of clients grows, a significant amount of time is spent accessing a concurrent hash map, and this overhead grows exponentially with the number of clients.
{code:java|title=ChunkUtils#processFileExclusively}
@VisibleForTesting
static <T> T processFileExclusively(Path path, Supplier<T> op) {
  // Busy-wait (spin) until this thread wins the right to process the path.
  for (;;) {
    if (LOCKS.add(path)) {
      break;
    }
  }
  try {
    return op.get();
  } finally {
    LOCKS.remove(path);
  }
}
{code}
In my test, having 64 concurrent clients reading chunks from a 1-disk DataNode caused the DN to spend nearly half of its time adding entries to the LOCKS object (a concurrent hash map).
!Screen Shot 2021-03-11 at 11.58.23 PM.png|width=640!
Given that it is not uncommon to find HDFS DataNodes with tens of thousands of incoming client connections, I expect to see similar traffic to an Ozone DataNode at scale.
We should fix this code.
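One possible direction (a hedged sketch, not Ozone's actual fix): replace the spin loop with a blocking per-path lock so contended threads park instead of burning CPU re-probing the map. The class and field names below (ChunkLockSketch, FILE_LOCKS) are illustrative, not from the Ozone codebase.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

public class ChunkLockSketch {

  // One ReentrantLock per path; hypothetical replacement for the LOCKS set.
  private static final ConcurrentHashMap<Path, ReentrantLock> FILE_LOCKS =
      new ConcurrentHashMap<>();

  static <T> T processFileExclusively(Path path, Supplier<T> op) {
    // computeIfAbsent creates at most one lock per path; lock() blocks
    // (parks the thread) rather than spinning on the shared map.
    ReentrantLock lock = FILE_LOCKS.computeIfAbsent(path, p -> new ReentrantLock());
    lock.lock();
    try {
      return op.get();
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) {
    Path p = Paths.get("/tmp/chunk");
    int result = processFileExclusively(p, () -> 42);
    System.out.println(result); // prints 42
  }
}
```

A trade-off of this sketch is that lock objects accumulate in the map unless they are evicted; a striped-lock scheme (a fixed array of locks indexed by the path's hash) would bound memory at the cost of occasional false sharing between paths.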
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org