Posted to user@hadoop.apache.org by omprakash <om...@cdac.in> on 2017/07/26 12:46:34 UTC

Lots of exceptions for "cannot assign requested address" in datanode logs

Hi all,

 

I am running a 4-node cluster with 2 master nodes (NN1 and NN2, with HA using
QJM) and 2 slave nodes (DN1 and DN2). I am seeing lots of exceptions in the
DataNode logs, as shown below:

 

2017-07-26 17:56:00,703 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.9.132:50010, datanodeUuid=5a2e6721-3a9a-43f1-94cc-f58f24b5a15b, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-57;cid=CID-7aa9fcd4-36fc-4e7b-87cd-d20594774b85;nsid=1753301932;c=1500696043365):Failed to transfer BP-1085904515-192.168.9.116-1500696043365:blk_1078544770_4804082 to 192.168.9.116:50010 got
java.net.BindException: Cannot assign requested address
        at sun.nio.ch.Net.connect0(Native Method)
        at sun.nio.ch.Net.connect(Net.java:465)
        at sun.nio.ch.Net.connect(Net.java:457)
        at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:670)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
        at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2312)
        at java.lang.Thread.run(Thread.java:745)

 

 

I have 10 million files in HDFS. All the nodes have the same configuration.
The exception above started occurring when I changed the parameters below in
the hdfs-site.xml file (a sketch of the entries follows the list). I made
these changes to increase the replication rate for under-replicated blocks.

 

dfs.namenode.handler.count=5000

dfs.namenode.replication.work.multiplier.per.iteration=1000

dfs.namenode.replication.max-streams=2000  --> not documented in hdfs-default.xml

dfs.namenode.replication.max-streams-hard-limit=4000  --> not documented in hdfs-default.xml
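
For reference, this is a minimal sketch of how these entries look in my
hdfs-site.xml (values as listed above):

<!-- Replication-tuning entries added to hdfs-site.xml -->
<property>
  <name>dfs.namenode.handler.count</name>
  <value>5000</value>
</property>
<property>
  <name>dfs.namenode.replication.work.multiplier.per.iteration</name>
  <value>1000</value>
</property>
<property>
  <name>dfs.namenode.replication.max-streams</name>
  <value>2000</value>
</property>
<property>
  <name>dfs.namenode.replication.max-streams-hard-limit</name>
  <value>4000</value>
</property>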

 

 

The block replication rate did increase, but then the exception above started
to appear.

 

Can anybody explain this behavior?

 

 

Regards

Omprakash Paliwal

 




Re: Lots of exceptions for "cannot assign requested address" in datanode logs

Posted by Ravi Prakash <ra...@gmail.com>.
Your replication numbers do seem to be on the high side. How did you arrive at
those numbers? If you swamp the datanodes with more replication work than they
can do in an iteration (every 3 seconds), things will go bad.
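
For a rough sense of scale (a back-of-the-envelope sketch, assuming the usual
semantics of these settings): with
dfs.namenode.replication.work.multiplier.per.iteration=1000 and 2 live
datanodes, the NameNode can schedule on the order of 2 x 1000 = 2000 block
transfers every 3-second iteration, and the max-streams settings allow each
datanode to attempt thousands of transfer streams at once, each opening its
own outbound socket. That is a lot of concurrent connection setup for a
2-datanode cluster.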

I often check all the running Java processes with `ps aux | grep java` rather
than relying on `service status datanode` or other scripts.


RE: Lots of exceptions for "cannot assign requested address" in datanode logs

Posted by omprakash <om...@cdac.in>.
Hi Ravi,

 

The two datanodes are on different machines. At the time these errors were being generated, I could see that DN1 was replicating under-replicated blocks to DN2.

 

Can this be related to the properties I added to increase the replication rate?

 

Regards

Om Prakash

 



Re: Lots of exceptions for "cannot assign requested address" in datanode logs

Posted by Ravi Prakash <ra...@gmail.com>.
Hi Omprakash!

DatanodeRegistration happens when the Datanode first heartbeats to the
Namenode. In your case, it seems some other application has acquired port
50010. You can check this with the command "netstat -anp | grep 50010". Are
you trying to run 2 datanode processes on the same machine?

HTH
Ravi
