Posted to hdfs-user@hadoop.apache.org by Jianhui Zhang <jh...@gmail.com> on 2012/10/08 23:58:33 UTC

NN uses up all ports (java.net.BindException: Cannot assign requested address)

version: hadoop-0.20.205.0

This is the second time we've seen this. The JobTracker (JT) logged this error:

2012-10-08 11:44:03,928 WARN org.apache.hadoop.hdfs.DFSClient: Problem renewing lease for DFSClient_1416124356
java.io.IOException: Call to nn-virtual.x.y.z/1.2.3.4:8020 failed on local exception: java.net.BindException: Cannot assign requested address
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)
        at org.apache.hadoop.ipc.Client.call(Client.java:1071)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
        at $Proxy5.renewLease(Unknown Source)

and it keeps trying to reconnect to the NameNode (NN):

2012-10-08 11:44:03,927 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn-virtual.x.y.z/1.2.3.4:8020. Already tried 9 time(s).

The problem persisted. Some of our MR jobs succeeded; others failed
with the error above.

So I listed all the local addresses in use and got about 24K of them. The
ip_local_port_range is set to:

32768 61000

We have not reached the limit, but we are very close. What's strange is that
almost all of the local ports are used by the NN process. There may be some
gaps in the list, but overall the NN seems to be using up nearly all of the
ephemeral ports available in the range.
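
To put a number on "very close", here is a minimal Python sketch of the arithmetic. It assumes both bounds in ip_local_port_range are inclusive, and it treats the 24K figure as a rough count rather than an exact measurement. The function names are made up for this illustration.

```python
def parse_port_range(range_text):
    """Parse the two integers found in ip_local_port_range."""
    low, high = (int(field) for field in range_text.split())
    if low > high:
        raise ValueError("low bound exceeds high bound")
    return low, high


def ephemeral_capacity(range_text):
    """Number of ports in the range, assuming both bounds are inclusive."""
    low, high = parse_port_range(range_text)
    return high - low + 1


def ephemeral_utilization(used_count, range_text):
    """Fraction of the ephemeral range that used_count ports would occupy."""
    return used_count / ephemeral_capacity(range_text)


if __name__ == "__main__":
    configured = "32768 61000"   # value quoted in this post
    approx_used = 24000          # "about 24K", an approximation
    print(ephemeral_capacity(configured))  # 28233
    print(round(ephemeral_utilization(approx_used, configured), 2))
```

With the values above, roughly 24K ports in use would occupy about 85% of a 28233-port range, which is consistent with being close to the limit but not at it.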

Right now, I strongly suspect that "Cannot assign requested address" is due
to a lack of free ports, although I'm not 100% sure, since the set of
ephemeral ports in use changes all the time.
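
Because the set of ports in use keeps changing, a repeated snapshot may be more informative than a single listing. The sketch below is one possible way to take one on Linux, not something we have run on the affected hosts: it tallies sockets whose local port falls in the ephemeral range, grouped by TCP state, from text in the /proc/net/tcp layout. The function name and the default bounds (taken from the range quoted above) are assumptions for illustration.

```python
from collections import Counter

# TCP state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED",
    "02": "SYN_SENT",
    "03": "SYN_RECV",
    "04": "FIN_WAIT1",
    "05": "FIN_WAIT2",
    "06": "TIME_WAIT",
    "07": "CLOSE",
    "08": "CLOSE_WAIT",
    "09": "LAST_ACK",
    "0A": "LISTEN",
    "0B": "CLOSING",
}


def count_ephemeral_by_state(proc_net_tcp_text, low=32768, high=61000):
    """Count sockets with a local port in [low, high], grouped by state."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated rows
        # fields[1] is "HEXADDR:HEXPORT"; the port is after the last colon.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if low <= local_port <= high:
            state = fields[3].upper()
            counts[TCP_STATES.get(state, state)] += 1
    return counts
```

On a Linux host this could be called as count_ephemeral_by_state(open("/proc/net/tcp").read()); sockets opened on IPv6 are listed in /proc/net/tcp6, which uses the same column layout. A tally dominated by ESTABLISHED or CLOSE_WAIT would point toward connections being held open, while one dominated by TIME_WAIT would point toward rapid connection churn.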

Here are my questions:

1. Has anybody seen this before?  Any pointers would be appreciated.

2. We are using a virtual IP for the NN. Could it be related to the
problem?

Thanks for your help,
James
