Posted to dev@hudi.apache.org by nishith agarwal <na...@apache.org> on 2019/05/09 00:19:30 UTC

Supporting Collapse type operation for better data layout

High-level requirements:

1. Write larger files while keeping the ingestion & query latencies low
2. Better data layout. For example, when rewriting smaller files into larger
ones, piggyback on the I/O to move records around and group them based on
some pattern, for better query performance, compression, etc.
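
To make the second requirement concrete, here is a minimal sketch of what a
collapse pass could look like. It is not Hudi code and not a proposed design:
SmallFile, plan_collapse, collapse_group, the byte-size target, and the
"city" sort key are all hypothetical names chosen for illustration. It uses a
simple first-fit-decreasing packing to pick which small files get rewritten
together, then sorts the merged records by a key during that same rewrite.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SmallFile:
    """Hypothetical stand-in for a small data file and the records it holds."""
    name: str
    size_bytes: int
    records: List[Dict[str, str]]


def plan_collapse(files: List[SmallFile], target_bytes: int) -> List[List[SmallFile]]:
    """Group files, largest first, so each group stays within target_bytes."""
    groups: List[List[SmallFile]] = []
    remaining: List[int] = []  # free bytes left in each group
    for f in sorted(files, key=lambda x: x.size_bytes, reverse=True):
        for i, free in enumerate(remaining):
            if f.size_bytes <= free:
                groups[i].append(f)
                remaining[i] -= f.size_bytes
                break
        else:
            # No existing group has room (or the file alone exceeds the target).
            groups.append([f])
            remaining.append(max(target_bytes - f.size_bytes, 0))
    return groups


def collapse_group(group: List[SmallFile], sort_key: str) -> List[Dict[str, str]]:
    """Merge one group's records and cluster them by sort_key in the same pass."""
    merged = [record for f in group for record in f.records]
    return sorted(merged, key=lambda record: record[sort_key])


# Assumed example input: four small files and a 100-byte target file size.
files = [
    SmallFile("a", 40, [{"city": "sf"}, {"city": "nyc"}]),
    SmallFile("b", 70, [{"city": "la"}]),
    SmallFile("c", 50, [{"city": "nyc"}]),
    SmallFile("d", 30, [{"city": "la"}]),
]
groups = plan_collapse(files, target_bytes=100)
rewritten = [collapse_group(g, sort_key="city") for g in groups]
```

Each entry in rewritten corresponds to one larger output file whose records
are already grouped by the chosen key, so the layout improvement comes from
the same read and write that the file-size fix needs anyway.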

I created an issue for this:
https://issues.apache.org/jira/browse/HUDI-112

Let's discuss there and then we can follow it up with a HIP.

Thanks,
Nishith

Re: Supporting Collapse type operation for better data layout

Posted by Vinoth Chandar <vi...@apache.org>.
This would be an exciting project!

On Wed, May 8, 2019 at 5:19 PM nishith agarwal <na...@apache.org> wrote:

> High-level requirements:
>
> 1. Write larger files while keeping the ingestion & query latencies low
> 2. Better data layout. For example, when rewriting smaller files into larger
> ones, piggyback on the I/O to move records around and group them based on
> some pattern, for better query performance, compression, etc.
>
> I created an issue for this:
> https://issues.apache.org/jira/browse/HUDI-112
>
> Let's discuss there and then we can follow it up with a HIP.
>
> Thanks,
> Nishith
>