Posted to hdfs-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/08/03 07:18:00 UTC
[jira] [Work logged] (HDFS-16631) Enable dfs.datanode.lockmanager.trace In Test
[ https://issues.apache.org/jira/browse/HDFS-16631?focusedWorklogId=797522&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-797522 ]
ASF GitHub Bot logged work on HDFS-16631:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 03/Aug/22 07:17
Start Date: 03/Aug/22 07:17
Worklog Time Spent: 10m
Work Description: slfan1989 closed pull request #4438: HDFS-16631. Enable dfs.datanode.lockmanager.trace In Test.
URL: https://github.com/apache/hadoop/pull/4438
Issue Time Tracking
-------------------
Worklog Id: (was: 797522)
Time Spent: 2h 40m (was: 2.5h)
> Enable dfs.datanode.lockmanager.trace In Test
> ---------------------------------------------
>
> Key: HDFS-16631
> URL: https://issues.apache.org/jira/browse/HDFS-16631
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Reporter: fanshilun
> Assignee: fanshilun
> Priority: Minor
> Labels: pull-request-available
> Attachments: image-2022-06-18-09-49-28-725.png
>
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
> In HDFS-16600 (Fix deadlock on DataNode side), we had a very useful discussion about a DataNode deadlock. While reading the log, I found the following:
> {code:java}
> 2022-05-27 07:39:47,890 [Listener at localhost/36941] WARN datanode.DataSetLockManager (DataSetLockManager.java:lockLeakCheck(261)) -
> not open lock leak check func.{code}
> Looking at the code, I found that there is such a parameter:
> {code:java}
> <property>
>   <name>dfs.datanode.lockmanager.trace</name>
>   <value>false</value>
>   <description>
>     If this is true, after shut down datanode lock Manager will print all leak
>     thread that not release by lock Manager. Only used for test or trace dead lock
>     problem. In produce default set false, because it's have little performance loss.
>   </description>
> </property> {code}
> I think this parameter should be enabled in the test environment, so that if a DataNode deadlock occurs, its cause can be located quickly.
> Following the review suggestions, these changes are made:
> 1. Add an operation name to the read-lock and write-lock methods of DataSetLockManager, so that the source of each lock is clearly identified and the methods are easier for callers to use.
> 2. Make the monitoring metrics finer-grained, covering the number of locks, the lock time, and early warnings for locks.
>
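The idea behind the two changes above can be illustrated with a minimal, self-contained sketch. This is not Hadoop's DataSetLockManager: the class TracingLockManager, its Handle interface, the acquire counter, and the operation names used in main are all hypothetical, chosen only to show how tagging each lock acquisition with an operation name lets a leak check report which operation never released its lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of lock tracing; NOT Hadoop's DataSetLockManager. */
class TracingLockManager {
  /** Releases the underlying lock when closed (usable in try-with-resources). */
  interface Handle extends AutoCloseable {
    @Override
    void close();
  }

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final boolean traceEnabled;
  // "threadName:opName" -> number of acquisitions not yet released.
  private final Map<String, Integer> held = new ConcurrentHashMap<>();
  // Simple metric: total number of lock acquisitions.
  private final AtomicLong acquireCount = new AtomicLong();

  TracingLockManager(boolean traceEnabled) {
    this.traceEnabled = traceEnabled;
  }

  Handle readLock(String opName) {
    return acquire(lock.readLock(), opName);
  }

  Handle writeLock(String opName) {
    return acquire(lock.writeLock(), opName);
  }

  private Handle acquire(Lock l, String opName) {
    l.lock();
    acquireCount.incrementAndGet();
    String key = Thread.currentThread().getName() + ":" + opName;
    if (traceEnabled) {
      held.merge(key, 1, Integer::sum);
    }
    return () -> {
      if (traceEnabled) {
        // Drop the entry once its outstanding count reaches zero.
        held.computeIfPresent(key, (k, n) -> n == 1 ? null : n - 1);
      }
      l.unlock();
    };
  }

  /** Lists "threadName:opName" entries still held; always empty if tracing is off. */
  List<String> lockLeakCheck() {
    return new ArrayList<>(held.keySet());
  }

  long getAcquireCount() {
    return acquireCount.get();
  }

  public static void main(String[] args) {
    TracingLockManager manager = new TracingLockManager(true);
    // Properly released: try-with-resources closes the handle.
    try (Handle h = manager.readLock("getBlockReports")) {
      // protected work would go here
    }
    // Deliberate leak: the handle is never closed.
    manager.readLock("leakyOperation");
    System.out.println("Leaked locks: " + manager.lockLeakCheck());
    System.out.println("Total acquisitions: " + manager.getAcquireCount());
  }
}
```

With tracing disabled the bookkeeping map is never touched, which mirrors the reason the property description gives for defaulting to false in production: tracing carries a small performance cost.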
--
This message was sent by Atlassian Jira
(v8.20.10#820010)