Posted to hdfs-issues@hadoop.apache.org by "BugFinder (Jira)" <ji...@apache.org> on 2022/06/22 15:14:00 UTC

[jira] [Commented] (HDFS-16639) LightWeightHashSet.resize possibly quadratic behavior could affect performance

    [ https://issues.apache.org/jira/browse/HDFS-16639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17557512#comment-17557512 ] 

BugFinder commented on HDFS-16639:
----------------------------------

Looking at our in-house tool logs and doing our best to correlate them with existing reports, we have seen that this operation *might be a contributing factor* (only a factor; we cannot claim that the whole problem is due to it) in the cases below. All of them are call trees that start by taking a wide lock and end in this resize operation. I believe the operation contributed to the associated issues, {*}perhaps not as the main cause, but by adding a few seconds{*}:

"Ops" paths:

method (lock) (possibly related report) (status)
 * org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication (write lock) (HDFS-16070) (open)
 * org.apache.hadoop.hdfs.server.namenode.FSDirDeleteOp.delete (write lock) (HDFS-13831) (resolved)
 * org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile (write lock) (HDFS-14366) (resolved)
 * org.apache.hadoop.hdfs.server.namenode.FSDirConcatOp.concat (write lock) (None) (none)
 * and, in general, any Op that goes through BlockManager.setReplication, BlockManager.removeBlockFromMap, InvalidateBlocks.remove or BlockManager.removeStaleReplicas

Other, non-"Ops" paths include:
 * org.apache.hadoop.hdfs.server.namenode.FSNamesystem.handleHeartbeat (read lock) (HDFS-16613) (resolved)
 * org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode (write lock) (HDFS-14186) (reopened)
 * org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits (write lock) (None) (none)

> LightWeightHashSet.resize possibly quadratic behavior could affect performance
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-16639
>                 URL: https://issues.apache.org/jira/browse/HDFS-16639
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.0.0, 3.3.3
>            Reporter: BugFinder
>            Priority: Major
>
> Hi,
> We have been profiling a few versions of HDFS (including 3.0.0 and 3.3.3) for performance and scale with our in-house tools, and we have noticed some places for possible optimization. According to what we have seen, the method
> org.apache.hadoop.hdfs.util.LightWeightHashSet.resize
> has possibly quadratic behavior (linear at the least), which might be impactful depending on the data stored in the instance (e.g. too many blocks to be removed, as in https://issues.apache.org/jira/browse/HDFS-16574). Although this behavior might be reasonable or even unnoticeable in some cases, when it runs under wide locks, as in
> FSNamesystem.reportBadBlocks *// Holding the write lock*
>   BlockManager.findAndMarkBlockAsCorrupt
>     BlockManager.markBlockAsCorrupt
>      BlockManager.addToInvalidates
>         InvalidateBlocks.add
>           LightWeightHashSet.add
>             LightWeightHashSet.expandIfNecessary
>               LightWeightHashSet.resize
> it could become an issue and a possible source of performance degradation.
> There are several call trees that seem to end in resize while holding locks, so an improvement there could improve NN performance in many cases. Of course, not all of these are bad, or better said, not all of these are problematic in every workload. We do not have a proposal for a solution yet, as we are doing exploratory work with our in-house tools. We believe this issue is present not only in 3.0.0 and 3.3.3 but also in other versions.
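
To make the "possibly quadratic (linear at the least)" wording concrete, here is a minimal sketch of how the cost of rehashing a chained hash set can depend on the data. It is NOT the LightWeightHashSet code: the class ResizeCostSketch, its Node type and all method names are hypothetical, and the two strategies (re-adding with a duplicate check versus relinking at the bucket head) are assumptions chosen only to contrast the two growth rates. It counts chain steps instead of measuring time.

```java
public class ResizeCostSketch {

    /** Singly linked bucket entry (hypothetical, for illustration only). */
    static final class Node {
        final long value;
        final int hash;
        Node next;
        Node(long value, int hash) { this.value = value; this.hash = hash; }
    }

    /** Builds a table of n unique values; if collide is true, every hash is 0. */
    static Node[] build(int n, int capacity, boolean collide) {
        Node[] table = new Node[capacity]; // capacity must be a power of two
        for (int i = 0; i < n; i++) {
            Node node = new Node(i, collide ? 0 : i);
            int idx = node.hash & (capacity - 1);
            node.next = table[idx];
            table[idx] = node;
        }
        return table;
    }

    /** Rehash that walks the target chain for duplicates; returns chain steps. */
    static long rehashWithDuplicateCheck(Node[] old, int newCapacity) {
        Node[] fresh = new Node[newCapacity];
        long steps = 0;
        for (Node head : old) {
            Node curr = head;
            while (curr != null) {
                Node next = curr.next; // save before relinking curr
                int idx = curr.hash & (newCapacity - 1);
                boolean present = false;
                for (Node n = fresh[idx]; n != null && !present; n = n.next) {
                    steps++;
                    present = (n.value == curr.value);
                }
                if (!present) {
                    curr.next = fresh[idx];
                    fresh[idx] = curr;
                }
                curr = next;
            }
        }
        return steps;
    }

    /** Rehash that relinks each node at the bucket head; returns nodes moved. */
    static long rehashByRelinking(Node[] old, int newCapacity) {
        Node[] fresh = new Node[newCapacity];
        long steps = 0;
        for (Node head : old) {
            Node curr = head;
            while (curr != null) {
                Node next = curr.next; // save before relinking curr
                int idx = curr.hash & (newCapacity - 1);
                curr.next = fresh[idx];
                fresh[idx] = curr;
                steps++;
                curr = next;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        int n = 1000, capacity = 1024;
        System.out.println("colliding, duplicate check: "
            + rehashWithDuplicateCheck(build(n, capacity, true), 2 * capacity));
        System.out.println("colliding, relinking:       "
            + rehashByRelinking(build(n, capacity, true), 2 * capacity));
        System.out.println("spread, duplicate check:    "
            + rehashWithDuplicateCheck(build(n, capacity, false), 2 * capacity));
    }
}
```

In this sketch, with n elements that all land in one bucket, the duplicate-check variant takes n(n-1)/2 chain steps, while relinking takes n. With well-spread hashes both stay linear. Whether the real resize behaves like either variant would need to be confirmed against the HDFS source.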


