Posted to user@drill.apache.org by Saurabh Mahapatra <sa...@gmail.com> on 2018/02/07 19:58:11 UTC
Which Hadoop File Format Should I Use?
Originally shared with me by Kuna Khatua, and it is a good read:
https://www.jowanza.com/blog/which-hadoop-file-format-should-i-use
The Carbondata project looks quite promising.
Any thoughts on what file format you prefer?
Thanks,
Saurabh
Re: Which Hadoop File Format Should I Use?
Posted by Aman Sinha <am...@apache.org>.
The multi-level indexing feature in Carbondata seems very interesting. It
will allow persisting OLAP cubes with efficient access, providing
virtually the same capability as specialized OLAP engines. The ORC format
also provides indexing, but apparently not multi-level indexing.
Another promising use is secondary indexing, which would make the file
format competitive with NoSQL systems that support secondary indexes.
On Wed, Feb 7, 2018 at 2:08 PM, Ted Dunning <te...@gmail.com> wrote:
> Carbondata does look very cool, but I haven't seen any significant user
> adoption which means that I haven't heard very many war stories.
>
> On Wed, Feb 7, 2018 at 11:58 AM, Saurabh Mahapatra <
> saurabhmahapatra94@gmail.com> wrote:
>
> > ...
> > The Carbondata project looks quite promising.
> >
> > Any thoughts on what file format you prefer?
> >
>
Re: Which Hadoop File Format Should I Use?
Posted by Ted Dunning <te...@gmail.com>.
Carbondata does look very cool, but I haven't seen any significant user
adoption which means that I haven't heard very many war stories.
On Wed, Feb 7, 2018 at 11:58 AM, Saurabh Mahapatra <
saurabhmahapatra94@gmail.com> wrote:
> ...
> The Carbondata project looks quite promising.
>
> Any thoughts on what file format you prefer?
>