Posted to user@crunch.apache.org by Narlin M <hp...@gmail.com> on 2013/08/31 04:52:04 UTC

UnknownHostException while submitting job to remote cluster

Hi,

 

I am getting following exception while trying to submit a crunch job to a
remote hadoop cluster:

 

2880 [Thread-15] INFO org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob - java.lang.IllegalArgumentException: java.net.UnknownHostException: bdatadev
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:389)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:356)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:124)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2218)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:902)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:896)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:896)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:531)
    at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob.submit(CrunchControlledJob.java:305)
    at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.startReadyJobs(CrunchJobControl.java:180)
    at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.pollJobStatusAndStartNewOnes(CrunchJobControl.java:209)
    at org.apache.crunch.impl.mr.exec.MRExecutor.monitorLoop(MRExecutor.java:100)
    at org.apache.crunch.impl.mr.exec.MRExecutor.access$000(MRExecutor.java:51)
    at org.apache.crunch.impl.mr.exec.MRExecutor$1.run(MRExecutor.java:75)
    at java.lang.Thread.run(Thread.java:680)
Caused by: java.net.UnknownHostException: bdatadev
    ... 27 more

 

However, a host named "bdatadev" is not mentioned anywhere in my code, and I
also cannot ping this host.
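For what it's worth, the underlying failure is plain hostname resolution. A minimal, self-contained sketch (class name and fallback host are only illustrative) reproduces the same UnknownHostException for any name that has no DNS or /etc/hosts entry:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostCheck {
    /** Returns true if the given name resolves via DNS or /etc/hosts. */
    static boolean resolves(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "bdatadev" stands in for whatever unresolvable name the
        // Hadoop client picked up from its configuration.
        String host = args.length > 0 ? args[0] : "bdatadev";
        System.out.println(host + (resolves(host) ? " resolves" : " does not resolve"));
    }
}
```

Running it with an unresolvable name should print "... does not resolve", which is the same condition the DFS client hits when it treats a stray configuration value as a hostname.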

 

The section of the code where I am setting up the MRPipeline is as follows:

 

Configuration conf = HBaseConfiguration.create();

conf.set("fs.defaultFS", "hdfs://<server_address>:8020");
conf.set("mapred.job.tracker", "<server_address>:8021");

System.out.println("Hadoop configuration created.");
System.out.println("Initializing crunch pipeline ...");

conf.set("mapred.jar", "<path_to_jar_file>");

pipeline = new MRPipeline(getClass(), "crunchjobtest", conf);

 

Has anyone faced this issue before? Any pointers on how to resolve it, or on
what I might be missing, would be appreciated.

 

Thanks,

Narlin.


Re: UnknownHostException while submitting job to remote cluster

Posted by Narlin M <hp...@gmail.com>.
Hello Micah, thanks for replying.

I am not sure, but I am probably targeting YARN; I got a warning to use
fs.defaultFS instead of fs.default.name when I ran my application. I will
try to confirm this.
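For reference, the two keys name the same setting across MapReduce generations; a hypothetical core-site.xml sketch (host and port are placeholders, not from this thread):

```xml
<!-- fs.defaultFS is the current key; fs.default.name is its deprecated
     MRv1-era equivalent. Either points the client at the namenode. -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>
```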


On Sat, Aug 31, 2013 at 11:48 AM, Micah Whitacre <mk...@gmail.com> wrote:

> It sounds like you are reading configuration files that are set up for HDFS
> HA.  This is done by HBaseConfiguration.create() reading files such as
> hbase-site.xml, core-site.xml, or hdfs-site.xml.
>
> Are you targeting YARN or MRv1?  If MRv1, according to the
> documentation[1] you should be setting "fs.default.name" instead.
>
> [1] -
> http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-High-Availability-Guide/cdh4hag_topic_2_3.html

Re: UnknownHostException while submitting job to remote cluster

Posted by Micah Whitacre <mk...@gmail.com>.
It sounds like you are reading configuration files that are set up for HDFS
HA.  This is done by HBaseConfiguration.create() reading files such as
hbase-site.xml, core-site.xml, or hdfs-site.xml.

Are you targeting YARN or MRv1?  If MRv1, according to the documentation[1]
you should be setting "fs.default.name" instead.

[1] -
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-High-Availability-Guide/cdh4hag_topic_2_3.html
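To illustrate the HA point above: an HDFS HA deployment defines a logical nameservice in hdfs-site.xml, and if a file like the sketch below sits on the client classpath, the logical name is used as the filesystem authority. The nameservice name "bdatadev" and the namenode hosts here are hypothetical, chosen only to show the shape; without the accompanying mappings and failover provider, a client ends up trying to resolve the logical name as a real hostname:

```xml
<!-- Hypothetical hdfs-site.xml fragment for an HA nameservice. -->
<property>
  <name>dfs.nameservices</name>
  <value>bdatadev</value>
</property>
<property>
  <name>dfs.ha.namenodes.bdatadev</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.bdatadev.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.bdatadev.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.bdatadev</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```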

