Posted to hdfs-dev@hadoop.apache.org by Akira AJISAKA <aj...@oss.nttdata.co.jp> on 2014/04/25 10:17:11 UTC

Re: Text cmd in webhdfs

# Added hdfs-dev@

Hi Nikita,

I'm personally very interested in this functionality!
Please create an issue on ASF JIRA and attach your patch.

Here is the documentation on creating and attaching a patch:
http://wiki.apache.org/hadoop/HowToContribute

Thanks,
Akira

(2014/04/25 2:09), Nikita Makeev wrote:
> In my company we use webhdfs a lot. One of our usage patterns is exporting
> files from hdfs to external systems. Before webhdfs, we did this by running
> 'hadoop fs -text' to write to the local filesystem and then transferring the
> result to its final destination. We use 'text' because we store data on hdfs
> in sequence files, which our external systems can't read. We find webhdfs
> very handy, particularly because of its very low startup overhead
> compared to 'hadoop fs', so we made a patch that makes the datanode capable
> of doing 'text' via webhdfs. Though I think our usage pattern is hardly
> common, I'm still asking the community: is there any interest in such
> functionality?
>
> SY,
>
>    Nikita Makeev
>
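
The export pattern described above (fetching a file over webhdfs rather
than starting 'hadoop fs') can be sketched in Python as follows. This is only
an illustration: the function names, the NameNode address and the file path
are hypothetical, and it assumes an unsecured cluster where the NameNode
redirects OPEN requests to a DataNode. It returns the raw file bytes, so a
sequence file would still need converting afterwards.

```python
import shutil
import sys
import urllib.parse
import urllib.request


def webhdfs_open_url(namenode, hdfs_path, user=None):
    """Build a WebHDFS OPEN URL: http://<host:port>/webhdfs/v1/<path>?op=OPEN."""
    # Percent-encode the path but keep its slashes.
    quoted = urllib.parse.quote(hdfs_path.lstrip("/"), safe="/")
    params = {"op": "OPEN"}
    if user is not None:
        # user.name is the simple-auth query parameter (assumes no Kerberos).
        params["user.name"] = user
    return "http://{}/webhdfs/v1/{}?{}".format(
        namenode, quoted, urllib.parse.urlencode(params)
    )


def export_file(namenode, hdfs_path, out=None, user=None):
    """Stream the raw bytes of an HDFS file to `out` (stdout by default)."""
    out = out if out is not None else sys.stdout.buffer
    url = webhdfs_open_url(namenode, hdfs_path, user)
    # urlopen follows the NameNode's redirect to a DataNode automatically.
    with urllib.request.urlopen(url) as resp:
        shutil.copyfileobj(resp, out)
```

For example, export_file("nn.example.com:50070", "/data/part-00000") would
copy the raw bytes of that file to stdout; both values are placeholders.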


Re: Text cmd in webhdfs

Posted by Nikita Makeev <wh...@gmail.com>.
Any other thoughts on this?

A separate tool for converting sequence files to plain text would do, but it
would probably have to be re-implemented in C to be as efficient as our current
approach. Some time ago I tried to find such an implementation but failed.

Nikita


On Fri, Apr 25, 2014 at 11:00 PM, Haohui Mai <hm...@hortonworks.com> wrote:

> I suggest building a separate tool that streams from a fs instead of baking
> it into webhdfs. One goal of webhdfs is to minimize the dependencies of both
> the webhdfs server and the client so that other projects can easily adopt it.
>
> If hadoop fs is too slow, it might be a good idea to build a new tool that
> reads from stdin and performs the 'text' conversion. For example, you could
> use it like
>
> curl -L "http://foo/webhdfs/v1/<path>?op=OPEN" | parse-text
>
> ~Haohui
>
> --
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to
> which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.
>

Re: Text cmd in webhdfs

Posted by Haohui Mai <hm...@hortonworks.com>.
I suggest building a separate tool that streams from a fs instead of baking
it into webhdfs. One goal of webhdfs is to minimize the dependencies of both
the webhdfs server and the client so that other projects can easily adopt it.

If hadoop fs is too slow, it might be a good idea to build a new tool that
reads from stdin and performs the 'text' conversion. For example, you could
use it like

curl -L "http://foo/webhdfs/v1/<path>?op=OPEN" | parse-text

~Haohui
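
A rough idea of what such a parse-text filter could look like, in Python.
This is a sketch under assumptions, not an existing tool: it handles only
uncompressed, version 6 sequence files whose keys and values are both
org.apache.hadoop.io.Text, and the function names are made up for
illustration. It prints each record as key, tab, value, which is meant to
resemble the output of 'hadoop fs -text'.

```python
import io
import struct
import sys

SYNC_ESCAPE = -1  # record-length value that announces a sync marker
SYNC_SIZE = 16    # length of the sync marker in bytes
TEXT_CLASS = "org.apache.hadoop.io.Text"


def read_exact(stream, n):
    """Read exactly n bytes or fail."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("truncated input")
    return data


def read_vint(stream):
    """Decode Hadoop's variable-length integer encoding."""
    first = struct.unpack("b", read_exact(stream, 1))[0]
    if first >= -112:
        return first  # small values are stored in a single byte
    negative = first < -120
    size = (-120 - first) if negative else (-112 - first)
    value = int.from_bytes(read_exact(stream, size), "big")
    return ~value if negative else value


def read_text(stream):
    """Read a Text-serialized string: vint byte length, then UTF-8 bytes."""
    return read_exact(stream, read_vint(stream)).decode("utf-8")


def read_header(stream):
    """Parse the file header and return (key class, value class, sync marker)."""
    magic = read_exact(stream, 4)
    if magic[:3] != b"SEQ" or magic[3] != 6:
        raise ValueError("expected a version 6 SequenceFile")
    key_class, value_class = read_text(stream), read_text(stream)
    compressed, block_compressed = struct.unpack("??", read_exact(stream, 2))
    if compressed or block_compressed:
        raise ValueError("compressed files are not handled in this sketch")
    (meta_count,) = struct.unpack(">i", read_exact(stream, 4))
    for _ in range(meta_count):  # skip the metadata key/value pairs
        read_text(stream)
        read_text(stream)
    return key_class, value_class, read_exact(stream, SYNC_SIZE)


def read_records(stream):
    """Yield (key, value) string pairs from a binary stream."""
    key_class, value_class, sync = read_header(stream)
    if key_class != TEXT_CLASS or value_class != TEXT_CLASS:
        raise ValueError("only Text keys and values are handled in this sketch")
    while True:
        head = stream.read(4)
        if not head:
            return  # clean end of input
        if len(head) != 4:
            raise EOFError("truncated record header")
        (record_len,) = struct.unpack(">i", head)
        if record_len == SYNC_ESCAPE:
            if read_exact(stream, SYNC_SIZE) != sync:
                raise ValueError("corrupt sync marker")
            continue
        (key_len,) = struct.unpack(">i", read_exact(stream, 4))
        record = read_exact(stream, record_len)  # key bytes followed by value bytes
        key = read_text(io.BytesIO(record[:key_len]))
        value = read_text(io.BytesIO(record[key_len:]))
        yield key, value


def dump(stream, out):
    """Write each record as 'key<TAB>value' on its own line."""
    for key, value in read_records(stream):
        out.write(key + "\t" + value + "\n")
```

Calling dump(sys.stdin.buffer, sys.stdout) from a small script named
parse-text would let it sit at the end of the curl pipeline above.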

On Fri, Apr 25, 2014 at 3:16 AM, Nikita Makeev <wh...@gmail.com> wrote:

> Hi.
>
> Sure I will. It will just take some time, as the current patch is against
> 0.20 and 2.0.0, and I'm going to add some more unit tests and make sure the
> patch conforms to the requirements.
> Thanks.
>
> Nikita
>


Re: Text cmd in webhdfs

Posted by Nikita Makeev <wh...@gmail.com>.
Hi.

Sure I will. It will just take some time, as the current patch is against
0.20 and 2.0.0, and I'm going to add some more unit tests and make sure the
patch conforms to the requirements.
Thanks.

Nikita


On Fri, Apr 25, 2014 at 12:17 PM, Akira AJISAKA
<aj...@oss.nttdata.co.jp> wrote:

> # Added hdfs-dev@
>
> Hi Nikita,
>
> I'm personally very interested in this functionality!
> Please create an issue on ASF JIRA and attach your patch.
>
> Here is the documentation on creating and attaching a patch:
> http://wiki.apache.org/hadoop/HowToContribute
>
> Thanks,
> Akira
>