Posted to hdfs-issues@hadoop.apache.org by "Wei-Chiu Chuang (Jira)" <ji...@apache.org> on 2022/07/12 04:04:00 UTC

[jira] [Commented] (HDFS-16565) DataNode holds a large number of CLOSE_WAIT connections that are not released

    [ https://issues.apache.org/jira/browse/HDFS-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17565256#comment-17565256 ] 

Wei-Chiu Chuang commented on HDFS-16565:
----------------------------------------

For future reference, I highly recommend this tool for detecting file descriptor and socket leaks: https://file-leak-detector.kohsuke.org/
More generally, a tracing tool such as Byteman or BTrace can be used.

> DataNode holds a large number of CLOSE_WAIT connections that are not released
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-16565
>                 URL: https://issues.apache.org/jira/browse/HDFS-16565
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, ec
>    Affects Versions: 3.3.0
>         Environment: CentOS Linux release 7.5.1804 (Core)
>            Reporter: JiangHua Zhu
>            Priority: Major
>         Attachments: screenshot-1.png, screenshot-2.png
>
>
> There is a strange phenomenon here: the DataNode holds a large number of connections in the CLOSE_WAIT state and does not release them.
> netstat -na | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
> LISTEN 20
> CLOSE_WAIT 17707
> ESTABLISHED 1450
> TIME_WAIT 12
> The number of connections in the CLOSE_WAIT state has passed 17k and is still growing. Listing these connections with the lsof command shows the following:
> lsof -i tcp | grep -E 'CLOSE_WAIT|COMMAND'
>  !screenshot-1.png! 
> It can be seen that the cause is that Socket#close() is not called correctly on connections where the DataNode interacts with other nodes as a client.
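
For readers unfamiliar with the symptom: a socket enters CLOSE_WAIT when the remote peer has closed its side and the local process has not yet called close(). The following is a minimal, generic Java sketch of the usual remedy (try-with-resources), not the actual HDFS fix; the class name CloseWaitSketch and the helpers readUntilPeerCloses and demo are hypothetical names made up for illustration, and the loopback server only stands in for a remote node.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class CloseWaitSketch {

    // Hypothetical helper: connects as a client, reads until the peer closes,
    // and closes the socket on every exit path.
    static int readUntilPeerCloses(String host, int port) throws IOException {
        // try-with-resources calls close() on both resources, even if read() throws.
        try (Socket socket = new Socket(host, port);
             InputStream in = socket.getInputStream()) {
            int total = 0;
            byte[] buf = new byte[4096];
            int n;
            // read() returns -1 once the peer has closed its side. From then on the
            // local socket stays in CLOSE_WAIT until close() is called.
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
            return total;
        }
    }

    // Runs the client against a throwaway loopback server that sends 3 bytes and closes.
    static int demo() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread peer = new Thread(() -> {
                try (Socket s = server.accept()) {
                    s.getOutputStream().write(new byte[] {1, 2, 3});
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            peer.start();
            int n = readUntilPeerCloses(
                    server.getInetAddress().getHostAddress(), server.getLocalPort());
            peer.join();
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes read: " + demo());
    }
}
```

If the client code instead returned or threw before reaching close(), each such connection would remain in CLOSE_WAIT for as long as the process keeps the descriptor open, which matches the growing count shown in the netstat output above.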


