Posted to hdfs-user@hadoop.apache.org by Ravi Phulari <ip...@gmail.com> on 2011/01/04 23:10:55 UTC
HFTP based distcp failing.
Hello Hadoopers,
I need to distcp data across two clusters. For security reasons I cannot
use HDFS-based distcp.
HFTP-based distcp is failing with the following IOException.
Stack trace.
Copy failed: java.io.IOException: Not supported
at org.apache.hadoop.hdfs.HftpFileSystem.delete(HftpFileSystem.java:360)
at org.apache.hadoop.tools.DistCp.fullyDelete(DistCp.java:939)
at org.apache.hadoop.tools.DistCp.copy(DistCp.java:655)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:857)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:884)
I am using the following commands for distcp:
hadoop distcp hftp://nn1.hadoop1:50070/data
hftp://nn2.hadoop2:50070/user/hadoop/
hadoop distcp /data/logs hftp://nn2.hadoop2:50070/user/hadoop/
Any idea why this distcp could be failing?
I don't see any logs in the JT or NN.
Any help will be greatly appreciated.
-
Thanks,
Ravi
Re: HFTP based distcp failing.
Posted by Lars George <la...@gmail.com>.
Hi Ravi,
With distcp you usually run it on the target cluster and specify the
source with an "hftp://" URI and the target with "hdfs://". HFTP is
read-only, HTTP-based access to the data.
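For example, a corrected invocation might look like the sketch below (using the cluster names and paths from your message; the destination port 8020 is an assumption, so substitute whatever port your fs.default.name uses):

```shell
# Run this on the DESTINATION cluster (nn2.hadoop2).
# Source: read-only over HFTP via the NameNode HTTP port (50070).
# Target: written over HDFS via the NameNode RPC port (8020 assumed here).
hadoop distcp \
  hftp://nn1.hadoop1:50070/data \
  hdfs://nn2.hadoop2:8020/user/hadoop/
```

The key point is that the hftp:// URI may only appear on the source side; the delete in your stack trace happens because DistCp tries to write through the HFTP filesystem, which does not support it.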
Lars
Re: HFTP based distcp failing.
Posted by Ravi Phulari <ip...@gmail.com>.
Thanks, Kan.
Then I think the DistCp documentation needs to be corrected:
http://hadoop.apache.org/common/docs/r0.20.2/distcp.html
For copying between two different versions of Hadoop, one will usually use
HftpFileSystem. This is a read-only FileSystem, so DistCp must be run on the
destination cluster (more specifically, on TaskTrackers that can write to
the destination cluster). Each source is specified as
hftp://<dfs.http.address>/<path> (the default dfs.http.address is
<namenode>:50070).
-
Ravi
On Tue, Jan 4, 2011 at 2:50 PM, Kan Zhang <ka...@yahoo-inc.com> wrote:
> Ravi, HFTP only supports READ operation for now.
Re: HFTP based distcp failing.
Posted by Kan Zhang <ka...@yahoo-inc.com>.
Ravi, HFTP only supports the READ operation for now.