Posted to common-user@hadoop.apache.org by praveenesh kumar <pr...@gmail.com> on 2011/10/05 07:15:55 UTC
Error using hadoop distcp
I am trying to use distcp to copy a file from one HDFS to another.
While copying, I am getting the following exception:
hadoop distcp hdfs://ub13:54310/user/hadoop/weblog hdfs://ub16:54310/user/hadoop/weblog
11/10/05 10:41:01 INFO mapred.JobClient: Task Id : attempt_201110031447_0005_m_000007_0, Status : FAILED
java.net.UnknownHostException: unknown host: ub16
        at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:195)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:850)
        at org.apache.hadoop.ipc.Client.call(Client.java:720)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy1.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
        at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:113)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:215)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:177)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
        at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:48)
        at org.apache.hadoop.mapred.OutputCommitter.setupJob(OutputCommitter.java:124)
        at org.apache.hadoop.mapred.Task.runJobSetupTask(Task.java:835)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:296)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
It's saying it can't find ub16, but the entry is there in the /etc/hosts files.
I am able to ssh into both machines. Do I need passwordless ssh between
these two NameNodes?
What could be the issue? Is there anything I am missing before using distcp?
Thanks,
Praveenesh
Re: Error using hadoop distcp
Posted by praveenesh kumar <pr...@gmail.com>.
I tried that as well. When I use the IP address, it says I should use the hostname.
hadoop@ub13:~$ hadoop distcp hdfs://162.192.100.53:54310/user/hadoop/weblog hdfs://162.192.100.16:54310/user/hadoop/weblog
11/10/05 14:53:50 INFO tools.DistCp: srcPaths=[hdfs://162.192.100.53:54310/user/hadoop/weblog]
11/10/05 14:53:50 INFO tools.DistCp: destPath=hdfs://162.192.100.16:54310/user/hadoop/weblog
java.lang.IllegalArgumentException: Wrong FS: hdfs://162.192.100.53:54310/user/hadoop/weblog, expected: hdfs://ub13:54310
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:310)
        at org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:99)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:155)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:464)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:648)
        at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:621)
        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:638)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:857)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:884)
I have the entries of both machines in /etc/hosts...
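One way to double-check that the resolver the Hadoop daemons use actually sees those entries (a sketch; `ub16` is the hostname from the error above):

```shell
# Does the system resolver actually return the /etc/hosts entry?
# (On a correctly configured node this should print the IP and "ub16".)
getent hosts ub16 || echo "ub16 not resolvable here"
# If it is not resolvable despite the /etc/hosts entry, check the
# "hosts:" line in /etc/nsswitch.conf -- it should include "files".
```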
Re: Error using hadoop distcp
Posted by be...@gmail.com.
Hi praveenesh
Can you try repeating the distcp using the IP instead of the host name? From the error, it looks like an RPC exception where the host can't be identified, so I believe it can't be due to missing passwordless ssh. Just try it out.
Regards
Bejoy K S
Re: Error using hadoop distcp
Posted by Uma Maheswara Rao G 72686 <ma...@huawei.com>.
Distcp runs as a MapReduce job.
Here the TaskTrackers require the hostname mappings to contact the other nodes.
Please configure the mapping correctly on both machines and try again.
Regards,
Uma
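A minimal sketch of the mapping Uma describes (the IP/hostname pairs are taken from the log output earlier in this thread; adjust them for your cluster). The same entries need to exist on every TaskTracker node of both clusters, not just on the NameNodes:

```
# /etc/hosts on every node of both clusters
# (IPs taken from the distcp log earlier in this thread)
162.192.100.53   ub13
162.192.100.16   ub16
```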
Re: Error using hadoop distcp
Posted by trang van anh <an...@vtc.vn>.
Which host runs the task that throws the exception? Ensure that each
data node knows the other data nodes in the Hadoop cluster: add a "ub16" entry
to /etc/hosts on the node where the task is running.
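To find the offending node quickly, a hypothetical sweep like the following can help. It assumes passwordless ssh to the worker nodes and that `getent` is available on them; the worker host names in the example call are placeholders, so substitute your real ones:

```shell
# check_resolves HOST NODE... : ask each worker node (via ssh) whether
# it can resolve HOST; print only the nodes where resolution fails.
check_resolves() {
  target=$1; shift
  for node in "$@"; do
    ssh "$node" getent hosts "$target" >/dev/null 2>&1 \
      || echo "$target does not resolve on $node"
  done
}

# Example (placeholder worker names; use your real hosts):
# check_resolves ub16 ub13 ub14 ub15
```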