Posted to hdfs-dev@hadoop.apache.org by "Sean Chow (Jira)" <ji...@apache.org> on 2020/06/09 14:02:00 UTC

[jira] [Created] (HDFS-15402) Requesting http jmx metrics leads to too much CLOSE-WAIT on datanode

Sean Chow created HDFS-15402:
--------------------------------

             Summary: Requesting http jmx metrics leads to too much CLOSE-WAIT on datanode
                 Key: HDFS-15402
                 URL: https://issues.apache.org/jira/browse/HDFS-15402
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: metrics
    Affects Versions: 3.1.3
            Reporter: Sean Chow


We periodically access {{http://127.0.0.1:50075/jmx}} to collect datanode metrics. However, this leaves too many sockets in the CLOSE-WAIT state, which causes normal webhdfs requests to fail.

 
{code:bash}
$ ss -ant|grep 127.0.0.1:50075 |grep CLOSE-WAIT |head -10
CLOSE-WAIT 122 0 127.0.0.1:50075 127.0.0.1:37296 
CLOSE-WAIT 122 0 127.0.0.1:50075 127.0.0.1:26499 
CLOSE-WAIT 122 0 127.0.0.1:50075 127.0.0.1:47470 
CLOSE-WAIT 122 0 127.0.0.1:50075 127.0.0.1:42852 
CLOSE-WAIT 122 0 127.0.0.1:50075 127.0.0.1:40281
$ ss -ant|grep 127.0.0.1:50075 |grep CLOSE-WAIT | wc -l 
6729
$ lsof -i:37296
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 101015 hdfs 3044u IPv4 271157177 0t0 TCP localhost:50075->localhost:37296 (CLOSE_WAIT)
{code}
 

PID 101015 is the datanode's process ID.
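
To track how the CLOSE-WAIT count changes over time, the {{ss -ant | grep ... | wc -l}} pipeline above could be replaced by a small parser. The following is only a minimal sketch: the function names {{count_close_wait}} and {{read_ss_output}} are hypothetical, the column layout is assumed to match the {{ss -ant}} output shown above, and the ESTAB row in the sample is made up for illustration.

```python
import subprocess


def count_close_wait(ss_output: str, local_addr: str) -> int:
    """Count CLOSE-WAIT sockets whose local address equals local_addr.

    Assumes `ss -ant` rows of the form:
    State Recv-Q Send-Q Local:Port Peer:Port
    """
    count = 0
    for line in ss_output.splitlines():
        fields = line.split()
        # Header and malformed rows never have "CLOSE-WAIT" in column 0.
        if len(fields) >= 5 and fields[0] == "CLOSE-WAIT" and fields[3] == local_addr:
            count += 1
    return count


def read_ss_output() -> str:
    """Run `ss -ant` and return its standard output (hypothetical helper)."""
    return subprocess.run(
        ["ss", "-ant"], capture_output=True, text=True, check=True
    ).stdout


# Two rows copied from the report plus one invented ESTAB row.
SAMPLE = """\
CLOSE-WAIT 122 0 127.0.0.1:50075 127.0.0.1:37296
CLOSE-WAIT 122 0 127.0.0.1:50075 127.0.0.1:26499
ESTAB 0 0 127.0.0.1:50075 127.0.0.1:40000
"""

if __name__ == "__main__":
    print(count_close_wait(SAMPLE, "127.0.0.1:50075"))
```

On a datanode host, {{count_close_wait(read_ss_output(), "127.0.0.1:50075")}} could be called at a fixed interval to see whether the count keeps growing while the metrics script runs.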

I run {{cdh6.1.1}} and {{apache-hadoop-3.1.3}} in production, and both have the same issue. When the metric-retrieving script stops, the number of CLOSE-WAIT sockets stops increasing.

apache-hadoop-2.9.2 does not have this issue with the same metric-retrieving script.

 


