Posted to dev@drill.apache.org by rahul challapalli <ch...@gmail.com> on 2017/06/20 20:36:31 UTC

Performance issue with 2 phase hash-agg design

During the first phase, the hash-agg operator is not protected from skew in
the data (e.g., the data consists of 2 files, where one file contains far
more records than the other). Assuming there are only 2 fragments, the
hash-agg operator in one fragment handles far more records; it keeps
aggregating until the memory available to it is exhausted, at which point it
sends the record batches downstream to the hash-partitioner.
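
To make the mechanism concrete, here is a rough sketch (in Java, with
made-up class and method names, not Drill's actual operator code) of what
the first phase conceptually does: build partial aggregates in a hash table
until a memory budget is hit, then flush the partials downstream.

import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of a first-phase (partial) hash aggregation.
// Names and the memory accounting are hypothetical, not Drill's operator.
class PartialHashAgg {
    private final Map<String, Long> partials = new HashMap<>();
    private final long memoryBudgetBytes;
    private long usedBytes = 0;

    PartialHashAgg(long memoryBudgetBytes) {
        this.memoryBudgetBytes = memoryBudgetBytes;
    }

    /** Aggregate one record; flush partial results when the budget is exhausted. */
    void consume(String groupKey, long value) {
        if (!partials.containsKey(groupKey)) {
            usedBytes += estimateEntrySize(groupKey);   // a new group costs memory
        }
        partials.merge(groupKey, value, Long::sum);      // e.g. a SUM aggregate
        if (usedBytes >= memoryBudgetBytes) {
            flushDownstream();                           // hand off to the hash-partitioner
        }
    }

    private void flushDownstream() {
        // In the real operator this emits record batches to the sender/partitioner.
        System.out.println("flushing " + partials.size() + " partial groups");
        partials.clear();
        usedBytes = 0;
    }

    private long estimateEntrySize(String key) {
        return 64 + 2L * key.length();                   // crude per-entry estimate
    }
}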

Because the hash-partitioner normalizes the skew in the data, the work is
evenly divided and the 2 minor fragments running the second-phase
hash-aggregate take a similar amount of processing time.
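
The balancing comes from routing each key by its hash rather than by its
source file, roughly like this (illustrative only):

// Illustrative only: route each group key to a second-phase fragment by its hash.
// A skewed input (most rows coming from one file) still spreads across fragments,
// because the destination depends on the key, not on the source file.
class HashPartitionerSketch {
    static int destinationFragment(String groupKey, int numFragments) {
        return Math.floorMod(groupKey.hashCode(), numFragments);
    }
}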

So what is the problem here? During the first phase one minor fragment
takes a long time, which affects the runtime of the query. Instead, if the
first phase did not do any aggregation, or only used a small amount of
memory (thereby limiting the aggregation performed), then the query would
have completed faster. The advantage of doing 2-phase aggregation is reduced
traffic on the network, but if the keys used in the group by are mostly
unique then we lose this advantage as well.
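
A back-of-the-envelope way to state it: the network saving from the first
phase is roughly the reduction factor rows/NDV, which approaches 1 when the
keys are mostly unique. Something like the following check (the threshold is
an arbitrary illustration) captures when the first phase pays off:

// Back-of-the-envelope: how much does the first phase shrink the stream?
// When rows ~= distinct keys, the first phase emits nearly as many groups
// as it read rows, so the network saving is negligible while the CPU and
// memory cost is still paid.
class FirstPhaseBenefit {
    static boolean firstPhaseLikelyPaysOff(long inputRows, long estimatedNdv) {
        double reductionFactor = (double) inputRows / Math.max(1, estimatedNdv);
        return reductionFactor > 2.0;   // arbitrary illustrative threshold
    }
}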

I was playing with the new spillable hash-agg code and observed that
increasing memory did not improve the runtime.  This behavior can be
explained by the above reasoning.

Aggregating on mostly unique keys may not be a common use case, but any
thoughts in general about this?

Re: Performance issue with 2 phase hash-agg design

Posted by rahul challapalli <ch...@gmail.com>.
Thanks for sharing the link, Aman.

Re: Performance issue with 2 phase hash-agg design

Posted by Aman Sinha <am...@apache.org>.
See [1], which discusses this behavior for unique keys and suggests
manually forcing the single-phase agg.
We would need NDV (number of distinct values) statistics on the group-by
keys for the optimizer to pick the more efficient scheme.

[1] https://drill.apache.org/docs/guidelines-for-optimizing-aggregation/
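
For reference, a sketch of applying that guideline over JDBC; the option
name is the one the linked doc describes (worth double-checking there), and
the connection URL, table path and column names are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch of the workaround from [1] applied over JDBC. The connection URL,
// table path and column are placeholders for illustration only.
public class SinglePhaseAggExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:drill:zk=localhost:2181");
             Statement stmt = conn.createStatement()) {
            // Force single-phase aggregation for this session when the
            // group-by keys are known to be mostly unique.
            stmt.execute("ALTER SESSION SET `planner.enable_multiphase_agg` = false");
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT key_col, COUNT(*) FROM dfs.`/tmp/data` GROUP BY key_col")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + " -> " + rs.getLong(2));
                }
            }
        }
    }
}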

Re: Performance issue with 2 phase hash-agg design

Posted by Chun Chang <cc...@mapr.com>.
I also noticed that if the keys are mostly unique, the first-phase aggregation effort is mostly wasted. This can and should be improved.


One idea is to detect unique keys while processing: when the percentage of unique keys exceeds a certain threshold after processing a certain percentage of the data, skip aggregating the rest and send the records directly downstream to the second-phase aggregation.
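
A rough sketch of that idea (thresholds, sample size and structure are all
illustrative, not a concrete proposal for the operator): sample the first
slice of the input, and if the fraction of distinct keys crosses the
threshold, stop aggregating and pass records straight through to the sender.

import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of the adaptive idea above: sample the first part of
// the input, and if the fraction of distinct keys is high, stop aggregating
// and pass records straight through to the second phase.
class AdaptiveFirstPhase {
    private static final double UNIQUE_RATIO_THRESHOLD = 0.9;   // illustrative
    private static final long SAMPLE_ROWS = 100_000;            // illustrative

    private final Set<String> seenKeys = new HashSet<>();
    private long rowsProcessed = 0;
    private boolean passThrough = false;

    /** Returns true if this record should skip first-phase aggregation. */
    boolean shouldPassThrough(String groupKey) {
        if (passThrough) {
            return true;
        }
        rowsProcessed++;
        seenKeys.add(groupKey);
        if (rowsProcessed >= SAMPLE_ROWS) {
            double uniqueRatio = (double) seenKeys.size() / rowsProcessed;
            if (uniqueRatio > UNIQUE_RATIO_THRESHOLD) {
                passThrough = true;      // mostly unique: aggregation is wasted work
                seenKeys.clear();        // release sampling state
            }
        }
        return passThrough;
    }
}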
