Posted to hdfs-dev@hadoop.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2018/01/11 10:22:00 UTC
[jira] [Created] (HDFS-13010) DataNode: Listen queue is always 128
Gopal V created HDFS-13010:
------------------------------
Summary: DataNode: Listen queue is always 128
Key: HDFS-13010
URL: https://issues.apache.org/jira/browse/HDFS-13010
Project: Hadoop HDFS
Issue Type: Bug
Components: datanode
Affects Versions: 3.0.0
Reporter: Gopal V
DFS write-heavy workloads are failing with the following error:
{code}
18/01/11 05:02:34 INFO mapreduce.Job: Task Id : attempt_1515660475578_0007_m_000387_0, Status : FAILED
Error: java.io.IOException: Could not get block locations. Source file "/tmp/tpcds-generate/10000/_temporary/1/_temporary/attempt_1515660475578_0007_m_000387_0/inventory/data-m-00387" - Aborting...block==null
at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1477)
at org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1256)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:667)
{code}
This was traced to a refused connection while setting up the write pipeline:
{code}
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.hdfs.DataStreamer.createSocketForPipeline(DataStreamer.java:253)
at org.apache.hadoop.hdfs.DataStreamer$StreamerStreams.<init>(DataStreamer.java:162)
at org.apache.hadoop.hdfs.DataStreamer.transfer(DataStreamer.java:1450)
at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1407)
at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1598)
at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1499)
at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1481)
at org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1256)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:667)
{code}
The DataNode's listening socket on port 50010 reports a listen queue of 128:
{code}
# ss -tl | grep 50010
LISTEN 0 128 *:50010 *:*
{code}
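For readers less familiar with {{ss}}: for a socket in the LISTEN state, the second column (Recv-Q) is the current accept-queue length and the third column (Send-Q) is the backlog limit, so the 128 above is the limit in effect for port 50010. A minimal sketch of extracting that value from such a line, assuming the column layout shown above; the class and method names are illustrative only and are not part of Hadoop:

```java
public class SsListenLine {
    /**
     * Returns the third column (Send-Q) of one `ss -tl` output line,
     * which for a LISTEN socket is the backlog limit.
     * Assumes whitespace-separated columns: State Recv-Q Send-Q Local Peer.
     */
    static int backlogLimit(String ssLine) {
        String[] cols = ssLine.trim().split("\\s+");
        if (cols.length < 3 || !cols[0].equals("LISTEN")) {
            throw new IllegalArgumentException("not a LISTEN line: " + ssLine);
        }
        return Integer.parseInt(cols[2]);
    }

    public static void main(String[] args) {
        // Line copied from the ss output in this report.
        System.out.println(backlogLimit("LISTEN 0 128 *:50010 *:*"));
    }
}
```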
However, the system is configured with a much higher somaxconn:
{code}
# sysctl -a | grep somaxconn
net.core.somaxconn = 16000
{code}
Yet the SNMP counters show connections being refused: {{127 times the listen queue of a socket overflowed}}.
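As background, Linux caps the backlog an application passes to listen(2) at net.core.somaxconn, so the limit in effect is the smaller of the two values. A limit of 128 with somaxconn at 16000 would therefore be consistent with the listening socket having been opened with a backlog of 128 (assuming it was opened after somaxconn was raised). The sketch below illustrates that relationship and how a Java server can request a larger backlog explicitly. The {{effectiveBacklog}} and {{listen}} helpers are hypothetical, and this is not the DataNode's actual bind code; only the {{ServerSocket(int port, int backlog)}} constructor comes from the JDK.

```java
import java.io.IOException;
import java.net.ServerSocket;

public class ListenBacklogSketch {
    /** Backlog in effect on Linux: the requested value, capped at somaxconn. */
    static int effectiveBacklog(int requested, int somaxconn) {
        return Math.min(requested, somaxconn);
    }

    /** Opens a listening socket that asks the kernel for the given backlog. */
    static ServerSocket listen(int port, int backlog) throws IOException {
        return new ServerSocket(port, backlog);
    }

    public static void main(String[] args) throws IOException {
        // Values from this report: a request of 128 stays at 128
        // no matter how high somaxconn is set.
        System.out.println(effectiveBacklog(128, 16000));

        // Port 0 lets the OS pick a free port; 4096 is an arbitrary example value.
        try (ServerSocket server = listen(0, 4096)) {
            System.out.println("listening on port " + server.getLocalPort());
        }
    }
}
```

With both values from this report, the helper returns 128, which matches the {{ss}} output; only a larger request from the application side would let the higher somaxconn take effect.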