Posted to user@hadoop.apache.org by Jérôme BAROTIN <je...@barotin.fr> on 2016/06/28 16:36:22 UTC

cp command in webhdfs (and Filesystem Java Object)

Hello,

I'm writing because I spent an hour looking for a cp command
in the WebHDFS API (I'm actually using HttpFS, but I think it's the same).

This command is implemented in the "hdfs dfs" command-line client (which I
use), but I can't find it in the WebHDFS REST API. I thought that WebHDFS
was an implementation of the FileSystem object (
https://hadoop.apache.org/docs/r2.6.1/api/org/apache/hadoop/fs/FileSystem.html).
I checked the Java API and didn't find any cp method there either. The only
Java copy method is on the FileUtil class (
https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileUtil.html)
and I'm not sure it works the same way as the "hdfs dfs -cp" command.

I also searched the Hadoop JIRA and found nothing:
https://issues.apache.org/jira/browse/HADOOP-9417?jql=project%20%3D%20HADOOP%20AND%20(text%20~%20%22webhdfs%20copy%22%20OR%20text%20~%20%22webhdfs%20cp%22)

Is there a way to execute a cp command through a REST API?

All my best,


Jérôme

Re: cp command in webhdfs (and Filesystem Java Object)

Posted by Jérôme BAROTIN <je...@barotin.fr>.
Thanks for your response, Chris. So I understand that there is no standard
implementation of cp as a REST API?

You mention that cp is a combination of "open, create and rename"; all of
these methods are available through WebHDFS. Do you think we can reproduce
it remotely by executing several REST calls? (I mean without transferring
the data to the client side.)

Otherwise, if I want to build my own HDFS cp REST API (in Java), do you
think I should use the copy method of the FileUtil class (
https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileUtil.html)?

Best regards,

Jérôme

2016-06-29 17:36 GMT+02:00 Chris Nauroth <cn...@hortonworks.com>:

> Hello Jérôme,
>
> WebHDFS provides an HTTP binding to the FileSystem API, which defines the
> primitive operations offered by the file system.  The FileSystem Shell
> builds on top of the FileSystem API to provide higher-level workflows,
> implemented using the FileSystem primitives.  In the case of "cp", copy is
> not a primitive operation defined by the FileSystem API.  Instead, the
> FileSystem Shell implements it by composing a few different FileSystem API
> primitives: open, create and rename.
>
> Due to this separation, you won't find a "cp" operation directly in the
> WebHDFS REST API (or HTTPFS).  However, it is possible for the FileSystem
> shell to reference paths as URIs using the "webhdfs" scheme.  For example:
>
> > hadoop fs -cp webhdfs://localhost:9870/hello1 webhdfs://localhost:9870/hello2
>
> > hadoop fs -cat webhdfs://localhost:9870/hello2
> hello
>
> --Chris Nauroth

Re: cp command in webhdfs (and Filesystem Java Object)

Posted by Chris Nauroth <cn...@hortonworks.com>.
Hello Jérôme,

WebHDFS provides an HTTP binding to the FileSystem API, which defines the primitive operations offered by the file system.  The FileSystem Shell builds on top of the FileSystem API to provide higher-level workflows, implemented using the FileSystem primitives.  In the case of "cp", copy is not a primitive operation defined by the FileSystem API.  Instead, the FileSystem Shell implements it by composing a few different FileSystem API primitives: open, create and rename.

Due to this separation, you won't find a "cp" operation directly in the WebHDFS REST API (or HTTPFS).  However, it is possible for the FileSystem shell to reference paths as URIs using the "webhdfs" scheme.  For example:

> hadoop fs -cp webhdfs://localhost:9870/hello1 webhdfs://localhost:9870/hello2

> hadoop fs -cat webhdfs://localhost:9870/hello2
hello
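
As a rough illustration of the composition described above, the following Python sketch lists the WebHDFS REST calls a client could chain together to emulate "cp". It is only a sketch under assumptions, not how the FileSystem Shell is implemented: the helper names (webhdfs_url, plan_copy) and the temporary-file suffix are hypothetical, and the host and port are taken from the example above. Note that with this approach the bytes returned by OPEN have to be sent by the client as the body of the CREATE request, so the data passes through the client.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(base, path, op, **params):
    """Build a WebHDFS REST URL of the form <base>/webhdfs/v1/<path>?op=<OP>&..."""
    query = urlencode({"op": op, **params})
    return f"{base}/webhdfs/v1{quote(path)}?{query}"


def plan_copy(base, src, dst, tmp_suffix="._COPYING_"):
    """Return the (HTTP method, URL) pairs a client-side copy would issue.

    The temporary suffix is an assumption made for this illustration.
    """
    tmp = dst + tmp_suffix
    return [
        # 1. open: read the source file (the server answers with the file bytes)
        ("GET", webhdfs_url(base, src, "OPEN")),
        # 2. create: write those bytes to a temporary destination file
        ("PUT", webhdfs_url(base, tmp, "CREATE", overwrite="false")),
        # 3. rename: move the temporary file to its final name
        ("PUT", webhdfs_url(base, tmp, "RENAME", destination=dst)),
    ]


if __name__ == "__main__":
    for method, url in plan_copy("http://localhost:9870", "/hello1", "/hello2"):
        print(method, url)
```

On a secured or multi-user cluster, extra query parameters (for example user.name) would typically be needed; they are left out here to keep the sketch short.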

--Chris Nauroth

From: Jérôme BAROTIN <je...@barotin.fr>
Date: Wednesday, June 29, 2016 at 12:44 AM
To: Rohan Rajeevan <ro...@gmail.com>
Cc: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: cp command in webhdfs (and Filesystem Java Object)

I don't think that's the same:
- CREATE is for uploading a local file; in my case, I just want to copy one HDFS path to another on the same cluster.
- DistCp is for copying files between two different clusters.

I'm using the HttpFS/WebHDFS REST API to access my cluster, and I need to execute a "cp" command. How can I do that?

Do I need to develop this service?

Jérôme


Re: cp command in webhdfs (and Filesystem Java Object)

Posted by Jérôme BAROTIN <je...@barotin.fr>.
I don't think that's the same:
- CREATE is for uploading a local file; in my case, I just want to copy one
HDFS path to another on the same cluster.
- DistCp is for copying files between two different clusters.

I'm using the HttpFS/WebHDFS REST API to access my cluster, and I need to
execute a "cp" command. How can I do that?

Do I need to develop this service?

Jérôme

2016-06-29 8:17 GMT+02:00 Rohan Rajeevan <ro...@gmail.com>:

>
> Maybe look at this?
> https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#CREATE
> If you are interested in intra-cluster copy, maybe look at DistCp
> <https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html>?

Re: cp command in webhdfs (and Filesystem Java Object)

Posted by Rohan Rajeevan <ro...@gmail.com>.
Maybe look at this?
https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#CREATE
If you are interested in intra-cluster copy, maybe look at DistCp
<https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html>?

On Tue, Jun 28, 2016 at 9:36 AM, Jérôme BAROTIN <je...@barotin.fr> wrote:

> Hello,
>
> I'm writing because I spent an hour looking for a cp command
> in the WebHDFS API (I'm actually using HttpFS, but I think it's the same).
>
> This command is implemented in the "hdfs dfs" command-line client (which I
> use), but I can't find it in the WebHDFS REST API. I thought that WebHDFS
> was an implementation of the FileSystem object (
> https://hadoop.apache.org/docs/r2.6.1/api/org/apache/hadoop/fs/FileSystem.html).
> I checked the Java API and didn't find any cp method there either. The only
> Java copy method is on the FileUtil class (
> https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileUtil.html)
> and I'm not sure it works the same way as the "hdfs dfs -cp" command.
>
> I also searched the Hadoop JIRA and found nothing:
> https://issues.apache.org/jira/browse/HADOOP-9417?jql=project%20%3D%20HADOOP%20AND%20(text%20~%20%22webhdfs%20copy%22%20OR%20text%20~%20%22webhdfs%20cp%22)
>
> Is there a way to execute a cp command through a REST API?
>
> All my best,
>
>
> Jérôme
>