
Posted to dev@drill.apache.org by Azuryy Yu <az...@gmail.com> on 2012/09/23 03:37:37 UTC

Record assemble or not?

Hi all,

I am raising this question because some discussions on this list have
mentioned the issue.

I don't think record assembly can improve performance; we could instead
just consume the column-oriented data directly. Record assembly needs to
parse and generate the data required for the result, and it cannot avoid
having to skip over other columns that may not be used in the query.

In the Dremel paper, the authors also wrote in the Observations section that
they should try to consume column data directly.


Thanks.

Re: Record assemble or not?

Posted by Ted Dunning <te...@gmail.com>.

I am not quite sure what you mean here, but it is quite clear that in a
column-oriented world it is important to delay record assembly and do only
as much of it as necessary, to avoid the overhead of constructing
unnecessary data structures. If you can determine that you don't want to
consider a record at all, then eliminating it without reading all of its
other fields is a big win. If you don't need certain fields in your output
at all, then avoiding an assembly that includes those fields is a big win.
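
The two wins described above can be illustrated with a minimal Python sketch.
It assumes a flat, in-memory column store (one list per column, aligned by row
index), which is far simpler than Dremel's nested encoding. The names
`columns`, `read_column`, `select`, and `sum_where` are hypothetical and exist
only for this illustration.

```python
# Hypothetical column store: one list per column, aligned by row index.
columns = {
    "id": [1, 2, 3, 4],
    "country": ["US", "DE", "US", "FR"],
    "clicks": [10, 0, 7, 3],
    "payload": ["a", "b", "c", "d"],  # not needed by the queries below
}

# Track which columns were actually touched, to show what gets skipped.
columns_read = set()


def read_column(name):
    """Return one column and record that it was read."""
    columns_read.add(name)
    return columns[name]


def select(filter_col, predicate, output_cols):
    """Filter on one column first, then assemble only the surviving rows."""
    # Step 1: evaluate the predicate on the filter column alone.
    matching = [i for i, v in enumerate(read_column(filter_col)) if predicate(v)]
    # Step 2: assemble records late, for matching rows and requested fields only.
    cols = {name: read_column(name) for name in output_cols}
    return [{name: cols[name][i] for name in output_cols} for i in matching]


def sum_where(filter_col, predicate, value_col):
    """Aggregate straight from the columns, with no record assembly at all."""
    filt = read_column(filter_col)
    vals = read_column(value_col)
    return sum(v for f, v in zip(filt, vals) if predicate(f))


rows = select("country", lambda c: c == "US", ["id", "clicks"])
total = sum_where("country", lambda c: c == "US", "clicks")
```

In this sketch `select` builds records only for rows that pass the filter, and
the `payload` column is never read. `sum_where` consumes the column data
directly, which is the case where assembly can be skipped entirely.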

On Sat, Sep 22, 2012 at 8:37 PM, Azuryy Yu <az...@gmail.com> wrote:

> Hi all,
>
> I just put forward this question, because some discussion in this list
> mentioned this issue.
>
> but I don't think record assemble can improve the performance, actually we
> just consume the column-oriented record directly. because record assemble
> need to parse and generate the necessary data for the result. and it cannot
> avoid skip other columns it maybe not used in the query.
>
> and in the Dremel paper, they also wrote in the Observation section: they
> should try to consume column data directly.
>
>
> Thanks.
>