Posted to user@hadoop.apache.org by Krishna Rao <kr...@gmail.com> on 2015/02/26 17:18:22 UTC

Intermittent BindException during long MR jobs

Hi,

We occasionally run into a BindException that causes long-running jobs to
fail.

The stack trace is below.

Any ideas what this could be caused by?

Cheers,

Krishna


Stacktrace:
379969 [Thread-980] ERROR org.apache.hadoop.hive.ql.exec.Task  - Job Submission failed with exception 'java.net.BindException(Problem binding to [back10/10.4.2.10:0] java.net.BindException: Cannot assign requested address; For more details see: http://wiki.apache.org/hadoop/BindException)'
java.net.BindException: Problem binding to [back10/10.4.2.10:0] java.net.BindException: Cannot assign requested address; For more details see: http://wiki.apache.org/hadoop/BindException
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:718)
        at org.apache.hadoop.ipc.Client.call(Client.java:1242)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at com.sun.proxy.$Proxy10.create(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:193)
        at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
        at com.sun.proxy.$Proxy11.create(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1376)
        at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1395)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1255)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1212)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:276)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:265)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:82)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:888)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:869)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:768)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:757)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:558)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createFile(JobSplitWriter.java:96)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:85)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:517)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:487)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:369)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1286)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1283)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1283)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:586)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:448)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:56)
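For reference, the failing call is the client-side bind that Hadoop's IPC
layer performs before connecting out: the local address is pinned to back10's
IP and the port is 0, so the OS is free to pick any ephemeral port. A minimal
sketch of that shape using plain JDK sockets (the NameNode hostname below is a
hypothetical placeholder):

    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class BindProbe {
        public static void main(String[] args) throws Exception {
            try (Socket s = new Socket()) {
                // Bind the local end first, as the IPC client does: a fixed
                // local IP with port 0, meaning "any free ephemeral port".
                s.bind(new InetSocketAddress("10.4.2.10", 0));
                // If the bind above throws "Cannot assign requested address",
                // the local IP is unusable or no ephemeral port was available;
                // it is not a collision on a specific port.
                s.connect(new InetSocketAddress("namenode.example.com", 8020), 20000);
            }
        }
    }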

Re: Intermittent BindException during long MR jobs

Posted by daemeon reiydelle <da...@gmail.com>.
When the access fails, do you have a way to check the utilization on the
target node, i.e. was the target node utilization at 100% at the time?
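
(One cheap way to capture a snapshot from the submitting JVM at the moment of
failure is the JDK's OperatingSystemMXBean; a sketch, noting that per-node
cluster utilization still needs your monitoring tools:)

    import java.lang.management.ManagementFactory;
    import java.lang.management.OperatingSystemMXBean;

    public class LoadSnapshot {
        public static void main(String[] args) {
            OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
            // One-minute system load average of the host running this JVM,
            // or -1.0 if the platform cannot report it.
            System.out.println("processors=" + os.getAvailableProcessors()
                    + " loadAvg=" + os.getSystemLoadAverage());
        }
    }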



On Thu, Feb 26, 2015 at 10:30 PM, <ha...@visolve.com> wrote:

> Hello Krishna,
>
>
>
> The exception seems to be IP-specific. It may have occurred because no IP
> address was available on the system to assign. Double-check IP address
> availability and re-run the job.
>
>
>
> *Thanks,*
>
> *S.RagavendraGanesh*
>
> ViSolve Hadoop Support Team
> ViSolve Inc. | San Jose, California
> Website: www.visolve.com
>
> email: services@visolve.com | Phone: 408-850-2243
>
>
>
>
>
> *From:* Krishna Rao [mailto:krishnanjrao@gmail.com]
> *Sent:* Thursday, February 26, 2015 9:48 PM
> *To:* user@hive.apache.org; user@hadoop.apache.org
> *Subject:* Intermittent BindException during long MR jobs
>
> [original message and stack trace snipped; see above]
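
The quoted suggestion to double-check IP address availability can be made
concrete: if back10's address (10.4.2.10 in the log) is not actually
configured on a local interface at the moment of the bind, the JDK raises
exactly this "Cannot assign requested address". A small check, as a sketch
with plain JDK APIs:

    import java.net.InetAddress;
    import java.net.NetworkInterface;

    public class LocalAddressCheck {
        public static void main(String[] args) throws Exception {
            InetAddress addr = InetAddress.getByName("10.4.2.10"); // IP from the log
            // Null means no local interface currently carries this address, so
            // a bind to it would fail with "Cannot assign requested address".
            NetworkInterface nic = NetworkInterface.getByInetAddress(addr);
            System.out.println(addr + " -> " + (nic == null ? "not local" : nic.getName()));
        }
    }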

Re: Intermittent BindException during long MR jobs

Posted by Krishna Rao <kr...@gmail.com>.
Thanks for the responses. In our case the port is 0, and the link Ted
mentioned (<http://wiki.apache.org/hadoop/BindException>) says that a port
collision is therefore highly unlikely:

"If the port is "0", then the OS is looking for any free port -so the
port-in-use and port-below-1024 problems are highly unlikely to be the
cause of the problem."

I think load may be the culprit since the nodes will be heavily used during
the times that the exception occurs.
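
If load is the trigger, one concrete mechanism would fit: every submission
opens short-lived client connections, and under heavy churn the ephemeral
port range can be momentarily exhausted, which surfaces as "Cannot assign
requested address" on a port-0 bind. A quick way to see the range the kernel
hands out (Linux-specific, a sketch):

    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class EphemeralPortRange {
        public static void main(String[] args) throws Exception {
            // Linux only: the port range used for port-0 binds. Comparing its
            // size with the number of sockets in TIME_WAIT (e.g. from netstat)
            // shows whether the range can run dry under load.
            System.out.print("ip_local_port_range: " + new String(
                    Files.readAllBytes(Paths.get("/proc/sys/net/ipv4/ip_local_port_range"))));
        }
    }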

Is there any way to set/increase the timeout for the call/connection
attempt? In all cases so far it seems to be on a call to delete a file in
HDFS. I searched through the HDFS code base but couldn't see an obvious way
to set a timeout, nor could I see it being set.
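
For what it's worth, the connect attempt is governed by client-side IPC
settings rather than anything HDFS-specific. The keys below existed in
core-default.xml for Hadoop 2.x (worth verifying against your version), and a
sketch of raising them looks like this, though whether a longer timeout helps
with a bind failure, as opposed to a connect timeout, is an open question:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class IpcTimeoutTuning {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Per-attempt connect timeout in milliseconds (default 20000).
            conf.setInt("ipc.client.connect.timeout", 60000);
            // Retries after a connect timeout before the call fails (default 45).
            conf.setInt("ipc.client.connect.max.retries.on.timeouts", 45);
            FileSystem fs = FileSystem.get(conf); // client picks up the tuning
            System.out.println("fs=" + fs.getUri());
        }
    }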

Krishna

On 28 February 2015 at 15:20, Ted Yu <yu...@gmail.com> wrote:

> Krishna:
> Please take a look at:
> http://wiki.apache.org/hadoop/BindException
>
> Cheers
>
> [earlier quoted messages and stack trace snipped; see above]

Re: Intermittent BindException during long MR jobs

Posted by Krishna Rao <kr...@blinkbox.com>.
Thanks for the responses. In our case the port is 0, and so from the link<http://wiki.apache.org/hadoop/BindException> Ted mentioned it says that a collision is highly unlikely:

"If the port is "0", then the OS is looking for any free port -so the port-in-use and port-below-1024 problems are highly unlikely to be the cause of the problem."

I think load may be the culprit since the nodes will be heavily used during the times that the exception occurs.

Is there anyway to set/increase the timeout for the call/connection attempt? In all cases so far it seems to be on a call to delete a file in HDFS. I had a search through the HDFS code base but couldn't see an obvious way to set a timeout, and couldn't see it being set.


Krishna


On 28 February 2015 at 15:20, Ted Yu <yu...@gmail.com>> wrote:
Krishna:
Please take a look at:
http://wiki.apache.org/hadoop/BindException

Cheers

On Thu, Feb 26, 2015 at 10:30 PM, <ha...@visolve.com>> wrote:
Hello Krishna,

Exception seems to be IP specific. It might be occurred due to unavailability of IP address in the system to assign. Double check the IP address availability and run the job.

Thanks,
S.RagavendraGanesh
ViSolve Hadoop Support Team
ViSolve Inc. | San Jose, California
Website: www.visolve.com<http://www.visolve.com>
email: services@visolve.com<ma...@visolve.com> | Phone: 408-850-2243<tel:408-850-2243>


From: Krishna Rao [mailto:krishnanjrao@gmail.com<ma...@gmail.com>]
Sent: Thursday, February 26, 2015 9:48 PM
To: user@hive.apache.org<ma...@hive.apache.org>; user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Intermittent BindException during long MR jobs

Hi,

we occasionally run into a BindException causing long running jobs to occasionally fail.

The stacktrace is below.

Any ideas what this could be caused by?

Cheers,

Krishna


Stacktrace:
379969 [Thread-980] ERROR org.apache.hadoop.hive.ql.exec.Task  - Job Submission failed with exception 'java.net.BindException(Problem binding to [back10/10.4.2.10:0<http://10.4.2.10:0>] java.net.BindException: Cann
ot assign requested address; For more details see:  http://wiki.apache.org/hadoop/BindException)'
java.net.BindException: Problem binding to [back10/10.4.2.10:0<http://10.4.2.10:0>] java.net.BindException: Cannot assign requested address; For more details see:  http://wiki.apache.org/hadoop/BindException
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:718)
        at org.apache.hadoop.ipc.Client.call(Client.java:1242)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at com.sun.proxy.$Proxy10.create(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:193)
        at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
        at com.sun.proxy.$Proxy11.create(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1376)
        at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1395)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1255)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1212)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:276)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:265)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:82)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:888)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:869)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:768)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:757)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:558)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createFile(JobSplitWriter.java:96)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:85)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:517)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:487)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:369)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1286)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1283)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1283)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:586)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:448)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:56)





Krishna Rao
Senior Development Engineer Lead
t: +44 (0)1865 747960
m:
blinkbox music - the easiest way to listen to the music you love, for free
www.blinkboxmusic.com


Re: Intermittent BindException during long MR jobs

Posted by Krishna Rao <kr...@blinkbox.com>.
Thanks for the responses. In our case the port is 0, and so from the link<http://wiki.apache.org/hadoop/BindException> Ted mentioned it says that a collision is highly unlikely:

"If the port is "0", then the OS is looking for any free port -so the port-in-use and port-below-1024 problems are highly unlikely to be the cause of the problem."

I think load may be the culprit since the nodes will be heavily used during the times that the exception occurs.

Is there anyway to set/increase the timeout for the call/connection attempt? In all cases so far it seems to be on a call to delete a file in HDFS. I had a search through the HDFS code base but couldn't see an obvious way to set a timeout, and couldn't see it being set.


Krishna


On 28 February 2015 at 15:20, Ted Yu <yu...@gmail.com>> wrote:
Krishna:
Please take a look at:
http://wiki.apache.org/hadoop/BindException

Cheers

On Thu, Feb 26, 2015 at 10:30 PM, <ha...@visolve.com>> wrote:
Hello Krishna,

Exception seems to be IP specific. It might be occurred due to unavailability of IP address in the system to assign. Double check the IP address availability and run the job.

Thanks,
S.RagavendraGanesh
ViSolve Hadoop Support Team
ViSolve Inc. | San Jose, California
Website: www.visolve.com<http://www.visolve.com>
email: services@visolve.com<ma...@visolve.com> | Phone: 408-850-2243<tel:408-850-2243>


From: Krishna Rao [mailto:krishnanjrao@gmail.com<ma...@gmail.com>]
Sent: Thursday, February 26, 2015 9:48 PM
To: user@hive.apache.org<ma...@hive.apache.org>; user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Intermittent BindException during long MR jobs

Hi,

we occasionally run into a BindException causing long running jobs to occasionally fail.

The stacktrace is below.

Any ideas what this could be caused by?

Cheers,

Krishna


Stacktrace:
379969 [Thread-980] ERROR org.apache.hadoop.hive.ql.exec.Task  - Job Submission failed with exception 'java.net.BindException(Problem binding to [back10/10.4.2.10:0<http://10.4.2.10:0>] java.net.BindException: Cann
ot assign requested address; For more details see:  http://wiki.apache.org/hadoop/BindException)'
java.net.BindException: Problem binding to [back10/10.4.2.10:0<http://10.4.2.10:0>] java.net.BindException: Cannot assign requested address; For more details see:  http://wiki.apache.org/hadoop/BindException
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:718)
        at org.apache.hadoop.ipc.Client.call(Client.java:1242)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at com.sun.proxy.$Proxy10.create(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:193)
        at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
        at com.sun.proxy.$Proxy11.create(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1376)
        at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1395)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1255)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1212)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:276)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:265)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:82)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:888)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:869)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:768)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:757)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:558)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createFile(JobSplitWriter.java:96)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:85)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:517)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:487)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:369)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1286)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1283)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1283)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:586)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:448)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:56)





Krishna Rao
Senior Development Engineer Lead
t: +44 (0)1865 747960
m:
blinkbox music - the easiest way to listen to the music you love, for free
www.blinkboxmusic.com


Re: Intermittent BindException during long MR jobs

Posted by Krishna Rao <kr...@blinkbox.com>.
Thanks for the responses. In our case the port is 0, and so from the link<http://wiki.apache.org/hadoop/BindException> Ted mentioned it says that a collision is highly unlikely:

"If the port is "0", then the OS is looking for any free port -so the port-in-use and port-below-1024 problems are highly unlikely to be the cause of the problem."

I think load may be the culprit since the nodes will be heavily used during the times that the exception occurs.

Is there anyway to set/increase the timeout for the call/connection attempt? In all cases so far it seems to be on a call to delete a file in HDFS. I had a search through the HDFS code base but couldn't see an obvious way to set a timeout, and couldn't see it being set.


Krishna


On 28 February 2015 at 15:20, Ted Yu <yu...@gmail.com>> wrote:
Krishna:
Please take a look at:
http://wiki.apache.org/hadoop/BindException

Cheers

On Thu, Feb 26, 2015 at 10:30 PM, <ha...@visolve.com>> wrote:
Hello Krishna,

Exception seems to be IP specific. It might be occurred due to unavailability of IP address in the system to assign. Double check the IP address availability and run the job.

Thanks,
S.RagavendraGanesh
ViSolve Hadoop Support Team
ViSolve Inc. | San Jose, California
Website: www.visolve.com<http://www.visolve.com>
email: services@visolve.com<ma...@visolve.com> | Phone: 408-850-2243<tel:408-850-2243>


From: Krishna Rao [mailto:krishnanjrao@gmail.com<ma...@gmail.com>]
Sent: Thursday, February 26, 2015 9:48 PM
To: user@hive.apache.org<ma...@hive.apache.org>; user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Intermittent BindException during long MR jobs

Hi,

we occasionally run into a BindException causing long running jobs to occasionally fail.

The stacktrace is below.

Any ideas what this could be caused by?

Cheers,

Krishna


Stacktrace:
379969 [Thread-980] ERROR org.apache.hadoop.hive.ql.exec.Task  - Job Submission failed with exception 'java.net.BindException(Problem binding to [back10/10.4.2.10:0<http://10.4.2.10:0>] java.net.BindException: Cann
ot assign requested address; For more details see:  http://wiki.apache.org/hadoop/BindException)'
java.net.BindException: Problem binding to [back10/10.4.2.10:0<http://10.4.2.10:0>] java.net.BindException: Cannot assign requested address; For more details see:  http://wiki.apache.org/hadoop/BindException
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:718)
        at org.apache.hadoop.ipc.Client.call(Client.java:1242)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at com.sun.proxy.$Proxy10.create(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:193)
        at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
        at com.sun.proxy.$Proxy11.create(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1376)
        at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1395)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1255)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1212)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:276)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:265)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:82)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:888)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:869)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:768)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:757)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:558)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createFile(JobSplitWriter.java:96)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:85)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:517)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:487)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:369)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1286)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1283)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1283)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:586)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:448)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:56)





Krishna Rao
Senior Development Engineer Lead
t: +44 (0)1865 747960
m:
blinkbox music - the easiest way to listen to the music you love, for free
www.blinkboxmusic.com


Re: Intermittent BindException during long MR jobs

Posted by Krishna Rao <kr...@blinkbox.com>.
Thanks for the responses. In our case the port is 0, and so from the link<http://wiki.apache.org/hadoop/BindException> Ted mentioned it says that a collision is highly unlikely:

"If the port is "0", then the OS is looking for any free port -so the port-in-use and port-below-1024 problems are highly unlikely to be the cause of the problem."

I think load may be the culprit since the nodes will be heavily used during the times that the exception occurs.

Is there anyway to set/increase the timeout for the call/connection attempt? In all cases so far it seems to be on a call to delete a file in HDFS. I had a search through the HDFS code base but couldn't see an obvious way to set a timeout, and couldn't see it being set.


Krishna


On 28 February 2015 at 15:20, Ted Yu <yu...@gmail.com>> wrote:
Krishna:
Please take a look at:
http://wiki.apache.org/hadoop/BindException

Cheers

On Thu, Feb 26, 2015 at 10:30 PM, <ha...@visolve.com>> wrote:
Hello Krishna,

Exception seems to be IP specific. It might be occurred due to unavailability of IP address in the system to assign. Double check the IP address availability and run the job.

Thanks,
S.RagavendraGanesh
ViSolve Hadoop Support Team
ViSolve Inc. | San Jose, California
Website: www.visolve.com<http://www.visolve.com>
email: services@visolve.com<ma...@visolve.com> | Phone: 408-850-2243<tel:408-850-2243>


From: Krishna Rao [mailto:krishnanjrao@gmail.com<ma...@gmail.com>]
Sent: Thursday, February 26, 2015 9:48 PM
To: user@hive.apache.org<ma...@hive.apache.org>; user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Intermittent BindException during long MR jobs

Hi,

we occasionally run into a BindException causing long running jobs to occasionally fail.

The stacktrace is below.

Any ideas what this could be caused by?

Cheers,

Krishna


Stacktrace:
379969 [Thread-980] ERROR org.apache.hadoop.hive.ql.exec.Task  - Job Submission failed with exception 'java.net.BindException(Problem binding to [back10/10.4.2.10:0<http://10.4.2.10:0>] java.net.BindException: Cann
ot assign requested address; For more details see:  http://wiki.apache.org/hadoop/BindException)'
java.net.BindException: Problem binding to [back10/10.4.2.10:0<http://10.4.2.10:0>] java.net.BindException: Cannot assign requested address; For more details see:  http://wiki.apache.org/hadoop/BindException
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:718)
        at org.apache.hadoop.ipc.Client.call(Client.java:1242)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at com.sun.proxy.$Proxy10.create(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:193)
        at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
        at com.sun.proxy.$Proxy11.create(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1376)
        at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1395)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1255)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1212)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:276)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:265)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:82)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:888)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:869)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:768)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:757)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:558)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createFile(JobSplitWriter.java:96)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:85)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:517)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:487)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:369)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1286)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1283)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1283)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:586)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:448)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:56)





Krishna Rao
Senior Development Engineer Lead
t: +44 (0)1865 747960
m:
blinkbox music - the easiest way to listen to the music you love, for free
www.blinkboxmusic.com


Re: Intermittent BindException during long MR jobs

Posted by Krishna Rao <kr...@gmail.com>.
Thanks for the responses. In our case the port is 0, and so from the link
<http://wiki.apache.org/hadoop/BindException> Ted mentioned it says that a
collision is highly unlikely:

"If the port is "0", then the OS is looking for any free port -so the
port-in-use and port-below-1024 problems are highly unlikely to be the
cause of the problem."

I think load may be the culprit since the nodes will be heavily used during
the times that the exception occurs.

Is there anyway to set/increase the timeout for the call/connection
attempt? In all cases so far it seems to be on a call to delete a file in
HDFS. I had a search through the HDFS code base but couldn't see an obvious
way to set a timeout, and couldn't see it being set.

Krishna

On 28 February 2015 at 15:20, Ted Yu <yu...@gmail.com> wrote:

> Krishna:
> Please take a look at:
> http://wiki.apache.org/hadoop/BindException
>
> Cheers

Re: Intermittent BindException during long MR jobs

Posted by Ted Yu <yu...@gmail.com>.
Krishna:
Please take a look at:
http://wiki.apache.org/hadoop/BindException

Cheers

On Thu, Feb 26, 2015 at 10:30 PM, <ha...@visolve.com> wrote:

> Hello Krishna,
>
>
>
> The exception seems to be IP specific. It might occur when no IP address is
> available in the system to assign. Double-check the IP address availability
> and re-run the job.
>
>
>
> *Thanks,*
>
> *S.RagavendraGanesh*
>
> ViSolve Hadoop Support Team
> ViSolve Inc. | San Jose, California
> Website: www.visolve.com
>
> email: services@visolve.com | Phone: 408-850-2243

Re: Intermittent BindException during long MR jobs

Posted by daemeon reiydelle <da...@gmail.com>.
When the access fails, do you have a way to check the utilization on the
target node, i.e. whether the target node utilization was at 100%?
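
Something like the following, run on the submitting node (back10 in your
trace) when a job fails, would show both. This is a sketch assuming a Linux
host with the sysstat and iproute2 tools installed; since the failing bind
uses port 0, socket state is worth capturing alongside CPU:

    # Load and CPU utilization at the moment of failure
    uptime
    mpstat 1 5

    # Ephemeral port range the kernel can hand out for port-0 binds
    cat /proc/sys/net/ipv4/ip_local_port_range

    # Socket summary, plus a count of TCP connections per state;
    # a very large TIME_WAIT count points at ephemeral-port exhaustion
    ss -s
    netstat -ant | awk 'NR > 2 {print $6}' | sort | uniq -c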



On Thu, Feb 26, 2015 at 10:30 PM, <ha...@visolve.com> wrote:

> Hello Krishna,
>
>
>
> The exception seems to be IP specific. It might occur when no IP address is
> available in the system to assign. Double-check the IP address availability
> and re-run the job.
>
>
>
> *Thanks,*
>
> *S.RagavendraGanesh*
>
> ViSolve Hadoop Support Team
> ViSolve Inc. | San Jose, California
> Website: www.visolve.com
>
> email: services@visolve.com | Phone: 408-850-2243

RE: Intermittent BindException during long MR jobs

Posted by ha...@visolve.com.
Hello Krishna,

 

The exception seems to be IP specific. It might occur when no IP address is available in the system to assign. Double-check the IP address availability and re-run the job.
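
For example, on the affected host (back10/10.4.2.10, taken from your stack
trace) you could run checks along these lines:

    # Is the address assigned to a local interface?
    ip addr show | grep 10.4.2.10

    # Does the hostname resolve to the address the client expects?
    getent hosts back10

    # Is the address locally reachable?
    ping -c 1 10.4.2.10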

 

Thanks,

S.RagavendraGanesh

ViSolve Hadoop Support Team
ViSolve Inc. | San Jose, California
Website: www.visolve.com

email: services@visolve.com | Phone: 408-850-2243

 

 


