
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2019/06/27 05:08:52 UTC

[GitHub] [incubator-pinot] siddharthteotia commented on issue #4372: Support for single phase and two-phase distributed hash aggregation

siddharthteotia commented on issue #4372: Support for single phase and two-phase distributed hash aggregation
URL: https://github.com/apache/incubator-pinot/issues/4372#issuecomment-506191736

@kishoreg, thanks for your response. My understanding was that no reduction happened until the data from each processing node had been sent to the broker. I mistakenly thought that the Combine operator doesn't come into the picture until the broker. Thanks for clarifying that.

So the broker (a single reducer) will do the final aggregation by combining the data across nodes, right?
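
To make the single-phase flow concrete, here is a minimal sketch of how I picture it, assuming a SUM group-by. The class and method names are hypothetical and are not Pinot's actual operators: each server combines the partial results of its segments, and the broker then merges the per-server results as the single reducer.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SinglePhaseAggregation {
    // Merge several partial (group key -> SUM) maps into one map.
    // Hypothetical helper: it stands in for both the server-level combine
    // and the broker-level reduce, which do the same kind of merge here.
    static Map<String, Long> merge(List<Map<String, Long>> partials) {
        Map<String, Long> out = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((key, value) -> out.merge(key, value, Long::sum));
        }
        return out;
    }

    public static void main(String[] args) {
        // Server-level combine: one merged map per server, built from its segments.
        Map<String, Long> server1 = merge(List.of(Map.of("US", 3L, "IN", 1L), Map.of("US", 2L)));
        Map<String, Long> server2 = merge(List.of(Map.of("IN", 4L), Map.of("UK", 5L)));

        // Broker-level reduce: a single reducer merges the per-server maps.
        Map<String, Long> result = merge(List.of(server1, server2));
        System.out.println(result);
    }
}
```

In this picture the broker still has to hold every distinct group key in memory, which is what the two-phase idea tries to avoid.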

Another reason for considering two-phase aggregation was that, for a low-cardinality but high-volume data set, if we can reduce everything at the segment-server level (with a shuffle), less data will be sent to the broker. All of the execution-specific operator logic can be kept in the segment servers, and the broker need not worry about running out of memory and crashing during the final processing. It could still run out of memory while doing the union, but there we can just use a limit. Similarly, the segment servers can support spilling, etc.
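
A rough sketch of the two-phase idea, again assuming a SUM group-by and with hypothetical names throughout (this is not an existing Pinot API): after the local combine, servers shuffle partial results by hash of the group key so that each key has exactly one owning server, which produces its final value. The broker then only unions disjoint results and can stop at a limit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TwoPhaseAggregation {
    // Input: one combined partial map per server (phase 1 output).
    // Output: one map per server holding the FINAL sums for the keys it owns.
    // The shuffle is simulated in-process; ownership is hash(key) mod server count.
    static List<Map<String, Long>> shuffleAndAggregate(List<Map<String, Long>> serverPartials) {
        int n = serverPartials.size();
        List<Map<String, Long>> owned = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            owned.add(new HashMap<>());
        }
        for (Map<String, Long> partial : serverPartials) {
            partial.forEach((key, value) -> {
                int owner = Math.floorMod(key.hashCode(), n);
                owned.get(owner).merge(key, value, Long::sum);
            });
        }
        return owned;
    }

    // Broker side: key sets are disjoint, so no re-aggregation is needed.
    // The union copies entries and stops once `limit` groups are collected.
    static Map<String, Long> unionWithLimit(List<Map<String, Long>> owned, int limit) {
        Map<String, Long> out = new HashMap<>();
        for (Map<String, Long> serverResult : owned) {
            for (Map.Entry<String, Long> e : serverResult.entrySet()) {
                if (out.size() >= limit) {
                    return out;
                }
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Long>> partials = List.of(
            Map.of("US", 5L, "IN", 1L),
            Map.of("IN", 4L, "UK", 5L));
        List<Map<String, Long>> owned = shuffleAndAggregate(partials);
        System.out.println(unionWithLimit(owned, 10));
    }
}
```

The trade-off is an extra network hop between servers in exchange for a broker that never merges overlapping keys. Spilling, if supported, would live in the server-side aggregation step rather than in the broker.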

