Posted to dev@tajo.apache.org by Hyunsik Choi <hy...@apache.org> on 2014/02/25 02:19:41 UTC

[FYI] some remarkable features in hadoop 2.3.0

Hi folks,

As you already know, Hadoop 2.3.0 has been released. While reading the changes, I
noted some new features that Tajo should consider.

Centralized cache management in HDFS
  - https://issues.apache.org/jira/browse/HDFS-4949

Earlier, Min mentioned cached tables. I discussed HDFS-4949 with him offline.
It may be a candidate feature for our goal.

Enable support for heterogeneous storages in HDFS - DN as a collection of
storages
 - https://issues.apache.org/jira/browse/HDFS-2832

It's for different storage media, such as SSD and HDD.

Add a directbuffer Decompressor API to hadoop
  - https://issues.apache.org/jira/browse/HADOOP-10047

We already use compression/decompression for text files. We should also apply
compression/decompression to other file formats. For that, HADOOP-10047 may be a
good candidate feature to use.
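
To illustrate the idea behind a direct-buffer decompressor, here is a minimal
sketch. It does not use the Hadoop API from HADOOP-10047; as a stand-in, it
relies on the JDK's java.util.zip classes (Java 11 or later), which can read
from and write to direct ByteBuffers without staging the data in a byte[].
The class and method names are hypothetical and chosen only for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration class: NOT part of Hadoop or Tajo.
public class DirectBufferInflateSketch {

    /** Compresses the remaining bytes of src into a new direct buffer (DEFLATE). */
    public static ByteBuffer compress(ByteBuffer src) {
        // Generous size bound for this sketch; real code would grow the buffer.
        ByteBuffer dst = ByteBuffer.allocateDirect(src.remaining() * 2 + 64);
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(src);   // ByteBuffer overload, available since Java 11
            deflater.finish();
            while (!deflater.finished()) {
                deflater.deflate(dst);
                if (!dst.hasRemaining() && !deflater.finished()) {
                    throw new IllegalStateException("output buffer too small");
                }
            }
        } finally {
            deflater.end();           // release native zlib resources
        }
        dst.flip();
        return dst;
    }

    /** Decompresses src into dst, buffer to buffer, with no intermediate byte[]. */
    public static void decompress(ByteBuffer src, ByteBuffer dst) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(src);
            while (!inflater.finished()) {
                int written = inflater.inflate(dst);
                if (written == 0 && !inflater.finished()) {
                    // Stuck: input is truncated or dst has no room left.
                    throw new IllegalArgumentException(
                        "truncated input or output buffer too small");
                }
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] text = "Tajo reads compressed tuples from HDFS. ".repeat(8)
            .getBytes(StandardCharsets.UTF_8);

        ByteBuffer raw = ByteBuffer.allocateDirect(text.length);
        raw.put(text);
        raw.flip();

        ByteBuffer compressed = compress(raw);
        ByteBuffer restored = ByteBuffer.allocateDirect(text.length);
        decompress(compressed, restored);
        restored.flip();

        byte[] out = new byte[restored.remaining()];
        restored.get(out);
        System.out.println(new String(out, StandardCharsets.UTF_8)
            .equals(new String(text, StandardCharsets.UTF_8)));
    }
}
```

A storage layer for another file format could follow the same shape: keep the
compressed block and the output block in direct buffers and hand both to the
decompressor, so no extra copy through the Java heap is needed.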

- hyunsik

Re: [FYI] some remarkable features in hadoop 2.3.0

Posted by CharSyam <ch...@gmail.com>.

+1


2014-02-25 10:48 GMT+09:00 Min Zhou <co...@gmail.com>:

> Hi,
>
> +1.
> It's great to see 2.3.0 released. Actually, I did some development on
> zero-copy tuple processing based on that version. It isn't finished
> yet, but I can create a ticket for it. Anyway, Hadoop currently has only two
> zero-copy decompressors, default and snappy. Do you plan
> to support lzo or any other compression codecs?
>
> Regards,
> Min

Re: [FYI] some remarkable features in hadoop 2.3.0

Posted by Jihoon Son <ji...@apache.org>.

Thanks for sharing, Hyunsik!

Jihoon


2014-02-25 11:26 GMT+09:00 Jinho Kim <ji...@gmail.com>:

> Interesting news
>
> I'm looking forward to the De/Compression
>
>
>
> https://issues.apache.org/jira/browse/HADOOP-10047?focusedCommentId=13815175&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13815175
>
> --Jinho
> Best regards

Re: [FYI] some remarkable features in hadoop 2.3.0

Posted by Jinho Kim <ji...@gmail.com>.

Interesting news

I'm looking forward to the De/Compression


https://issues.apache.org/jira/browse/HADOOP-10047?focusedCommentId=13815175&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13815175

--Jinho
Best regards


2014-02-25 11:13 GMT+09:00 Hyunsik Choi <hy...@apache.org>:

> Hi Min,
>
> HADOOP-10047 does not seem to support other codecs like LZO. Although we
> will first consider snappy and deflate, which Hadoop currently supports, there
> is no reason not to support other codecs if users want them :)
>
> In addition, the zero-copy tuple work looks very interesting. I'd appreciate it
> if you created the issue.
>
> - hyunsik

Re: [FYI] some remarkable features in hadoop 2.3.0

Posted by Hyunsik Choi <hy...@apache.org>.

Hi Min,

HADOOP-10047 does not seem to support other codecs like LZO. Although we
will first consider snappy and deflate, which Hadoop currently supports, there
is no reason not to support other codecs if users want them :)

In addition, the zero-copy tuple work looks very interesting. I'd appreciate it
if you created the issue.

- hyunsik



On Tue, Feb 25, 2014 at 10:48 AM, Min Zhou <co...@gmail.com> wrote:

> Hi,
>
> +1.
> It's great to see 2.3.0 released. Actually, I did some development on
> zero-copy tuple processing based on that version. It isn't finished
> yet, but I can create a ticket for it. Anyway, Hadoop currently has only two
> zero-copy decompressors, default and snappy. Do you plan
> to support lzo or any other compression codecs?
>
> Regards,
> Min

Re: [FYI] some remarkable features in hadoop 2.3.0

Posted by Min Zhou <co...@gmail.com>.

Hi,

+1.
It's great to see 2.3.0 released. Actually, I did some development on
zero-copy tuple processing based on that version. It isn't finished
yet, but I can create a ticket for it. Anyway, Hadoop currently has only two
zero-copy decompressors, default and snappy. Do you plan
to support lzo or any other compression codecs?

Regards,
Min


On Mon, Feb 24, 2014 at 5:23 PM, JaeHwa Jung <jh...@gruter.com> wrote:

> +1.
>
> Thanks Hyunsik, I also agree with you.
> We need to bump up hadoop to 2.3.0.
>
> Cheers



-- 
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com

Re: [FYI] some remarkable features in hadoop 2.3.0

Posted by JaeHwa Jung <jh...@gruter.com>.

+1.

Thanks Hyunsik, I also agree with you.
We need to bump up hadoop to 2.3.0.

Cheers


2014-02-25 10:19 GMT+09:00 Hyunsik Choi <hy...@apache.org>:

> Hi folks,
>
> As you already know, Hadoop 2.3.0 has been released. While reading the changes, I
> noted some new features that Tajo should consider.
>
> Centralized cache management in HDFS
>   - https://issues.apache.org/jira/browse/HDFS-4949
>
> Earlier, Min mentioned cached tables. I discussed HDFS-4949 with him offline.
> It may be a candidate feature for our goal.
>
> Enable support for heterogeneous storages in HDFS - DN as a collection of
> storages
>  - https://issues.apache.org/jira/browse/HDFS-2832
>
> It's for different storage media, such as SSD and HDD.
>
> Add a directbuffer Decompressor API to hadoop
>   - https://issues.apache.org/jira/browse/HADOOP-10047
>
> We already use compression/decompression for text files. We should also apply
> compression/decompression to other file formats. For that, HADOOP-10047 may be a
> good candidate feature to use.
>
> - hyunsik
>



-- 
Thanks,
Jaehwa Jung
Bigdata Platform Team
Gruter