Posted to hdfs-user@hadoop.apache.org by Ravi Phulari <ip...@gmail.com> on 2011/01/04 23:10:55 UTC

HFTP based distcp failing.

Hello Hadoopers,
  I need to distcp data across two clusters. For security reasons I cannot
use HDFS-based distcp.
HFTP-based distcp is failing with the following IOException.

Stack trace:

Copy failed: java.io.IOException: Not supported
    at org.apache.hadoop.hdfs.HftpFileSystem.delete(HftpFileSystem.java:360)
    at org.apache.hadoop.tools.DistCp.fullyDelete(DistCp.java:939)
    at org.apache.hadoop.tools.DistCp.copy(DistCp.java:655)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:857)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:884)

I am using the following commands for distcp:

hadoop distcp hftp://nn1.hadoop1:50070/data hftp://nn2.hadoop2:50070/user/hadoop/
hadoop distcp /data/logs hftp://nn2.hadoop2:50070/user/hadoop/

Any idea why this distcp could be failing?
I don't see any related log entries in the JT or NN.

Any help will be greatly appreciated.

-
Thanks,
Ravi

Re: HFTP based distcp failing.

Posted by Lars George <la...@gmail.com>.
Hi Ravi,

With distcp you usually run the job on the target cluster, giving the
source as an "hftp://" URI and the target as an "hdfs://" URI. HFTP is
read-only, HTTP-based access to the data, so it cannot be used as the
destination.

Lars
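
As a rough sketch of the advice above, the shell function below assembles a
distcp command and refuses an hftp:// destination before anything is
submitted. The function name build_distcp_cmd is hypothetical, and the
hdfs:// port 8020 in the usage note is an assumption (the thread does not
give it); substitute the destination NameNode's actual RPC address.

```shell
#!/bin/sh
# Hypothetical helper: print a distcp command line without running it.
# It rejects hftp:// as a destination, because HFTP is read-only and
# DistCp needs to write to (and delete from) the destination.
build_distcp_cmd() {
    src="$1"
    dst="$2"
    case "$dst" in
        hftp://*)
            echo "error: hftp:// is read-only, cannot be a destination: $dst" >&2
            return 1
            ;;
    esac
    # Only print the command; the caller decides whether to run it.
    printf 'hadoop distcp %s %s\n' "$src" "$dst"
}
```

Called on the destination cluster with the source from the original post,
for example

  build_distcp_cmd hftp://nn1.hadoop1:50070/data hdfs://nn2.hadoop2:8020/user/hadoop/

it prints the command to run, whereas passing either of the original
hftp:// destinations makes it return a non-zero status instead.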


Re: HFTP based distcp failing.

Posted by Ravi Phulari <ip...@gmail.com>.
Thanks Kan,

Then I think the DistCp documentation needs to be corrected.

http://hadoop.apache.org/common/docs/r0.20.2/distcp.html

 For copying between two different versions of Hadoop, one will usually use
HftpFileSystem. This is a read-only FileSystem, so DistCp must be run on the
destination cluster (more specifically, on TaskTrackers that can write to
the destination cluster). Each source is specified as
hftp://<dfs.http.address>/<path> (the default dfs.http.address is
<namenode>:50070).
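
The source URI format quoted above can be illustrated with a small shell
sketch. The function name hftp_source_uri is hypothetical; the only facts
taken from the documentation excerpt are the hftp://<dfs.http.address>/<path>
layout and the default port 50070. A cluster that overrides dfs.http.address
would need its own port passed as the third argument.

```shell
#!/bin/sh
# Hypothetical helper: build an HFTP source URI for DistCp.
#   $1 = NameNode host, $2 = absolute path, $3 = HTTP port (default 50070)
hftp_source_uri() {
    namenode="$1"
    path="$2"
    http_port="${3:-50070}"
    # Strip one leading slash from the path so the URI has exactly one
    # slash between the authority and the path.
    printf 'hftp://%s:%s/%s\n' "$namenode" "$http_port" "${path#/}"
}
```

For instance, hftp_source_uri nn1.hadoop1 /data yields the source URI used
in the first command of the original post.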

-
Ravi

On Tue, Jan 4, 2011 at 2:50 PM, Kan Zhang <ka...@yahoo-inc.com> wrote:

>  Ravi, HFTP only supports READ operations for now.

Re: HFTP based distcp failing.

Posted by Kan Zhang <ka...@yahoo-inc.com>.
Ravi, HFTP only supports READ operations for now.

