Posted to hdfs-issues@hadoop.apache.org by "BugFinder (Jira)" <ji...@apache.org> on 2022/06/22 00:07:00 UTC
[jira] [Created] (HDFS-16639) LightWeightHashSet.resize possibly quadratic behavior could affect performance
BugFinder created HDFS-16639:
--------------------------------
Summary: LightWeightHashSet.resize possibly quadratic behavior could affect performance
Key: HDFS-16639
URL: https://issues.apache.org/jira/browse/HDFS-16639
Project: Hadoop HDFS
Issue Type: Test
Components: hdfs
Affects Versions: 3.0.0
Reporter: BugFinder
Hi,
We have been profiling the performance of several HDFS versions (including 3.0.0) with our in-house tools, and we have noticed some places where optimization may be possible. From what we have seen, the method
org.apache.hadoop.hdfs.util.LightWeightHashSet.resize
has possibly quadratic behavior (linear at the least), which could have an impact depending on what data is stored in the instance (e.g. too many blocks to be removed, as described [here|https://issues.apache.org/jira/browse/HDFS-16574]). This behavior may be reasonable or even unnoticeable in some cases, but it could become a concern when resize runs under a wide lock, as in the following call chain:
FSNamesystem.reportBadBlocks *// Holding the write lock*
BlockManager.findAndMarkBlockAsCorrupt
BlockManager.markBlockAsCorrupt
BlockManager.addToInvalidates
InvalidateBlocks.add
LightWeightHashSet.add
LightWeightHashSet.expandIfNecessary
LightWeightHashSet.resize
Several call trees appear to end in resize while holding locks. Not all of them are problematic in every workload. We do not have a proposed solution yet, as this is exploratory work with our in-house tools. We believe this issue is present not only in 3.0.0 but also in more recent versions.
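
To illustrate the concern, here is a minimal sketch of a chained hash set whose resize rehashes every stored element. It is not the Hadoop LightWeightHashSet source: the class name ResizeCostSketch, the initial capacity of 16, the 3/4 load factor, the doubling growth policy and the counters are all assumptions made for illustration. It only shows that a single add which triggers a resize does work proportional to the number of elements already stored, which is the cost that would be paid while a caller holds a lock.

```java
/**
 * Hypothetical, simplified chained hash set used only to illustrate resize cost.
 * This is NOT the Hadoop LightWeightHashSet implementation; capacity, load
 * factor and growth policy below are assumptions.
 */
final class ResizeCostSketch {
    /** Singly linked bucket entry. */
    private static final class Node {
        final long value;
        Node next;
        Node(long value, Node next) { this.value = value; this.next = next; }
    }

    private Node[] buckets = new Node[16]; // assumed initial capacity
    private int size = 0;

    int resizeCalls = 0;           // number of resize invocations
    long rehashedElements = 0;     // elements moved across all resizes
    long largestSingleResize = 0;  // elements moved by the costliest single add

    private static int indexFor(long value, int capacity) {
        return (Long.hashCode(value) & 0x7fffffff) % capacity;
    }

    int size() { return size; }

    /** Adds a value; returns false if it was already present. */
    boolean add(long value) {
        int i = indexFor(value, buckets.length);
        for (Node n = buckets[i]; n != null; n = n.next) {
            if (n.value == value) {
                return false;
            }
        }
        buckets[i] = new Node(value, buckets[i]);
        size++;
        // Assumed policy: double the table once the load exceeds 3/4.
        if (size * 4 > buckets.length * 3) {
            resize(buckets.length * 2);
        }
        return true;
    }

    /** Rehashes every element into a new table: linear in the current size. */
    private void resize(int newCapacity) {
        Node[] fresh = new Node[newCapacity];
        long moved = 0;
        for (Node head : buckets) {
            Node n = head;
            while (n != null) {
                Node next = n.next;
                int i = indexFor(n.value, newCapacity);
                n.next = fresh[i];
                fresh[i] = n;
                moved++;
                n = next;
            }
        }
        buckets = fresh;
        resizeCalls++;
        rehashedElements += moved;
        largestSingleResize = Math.max(largestSingleResize, moved);
    }

    public static void main(String[] args) {
        ResizeCostSketch set = new ResizeCostSketch();
        for (long v = 0; v < 1_000_000; v++) {
            set.add(v);
        }
        System.out.println("resize calls:          " + set.resizeCalls);
        System.out.println("elements rehashed:     " + set.rehashedElements);
        System.out.println("largest single resize: " + set.largestSingleResize);
    }
}
```

With doubling, the total rehash work in this sketch stays linear in the number of adds, but the worst single add still moves every stored element. A growth policy that adds a fixed increment instead, or repeated grow and shrink cycles, would push the total toward quadratic. Whether either applies to the real class would need to be checked against its source.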
--
This message was sent by Atlassian Jira
(v8.20.7#820007)