Posted to commits@pinot.apache.org by "61yao (via GitHub)" <gi...@apache.org> on 2023/03/29 00:48:53 UTC
[GitHub] [pinot] 61yao opened a new issue, #10498: [Performance] Group By Optimization
61yao opened a new issue, #10498:
URL: https://github.com/apache/pinot/issues/10498
We have noticed high contention and high latency for queries with a large group-by.
We will take this opportunity to:
1) Reduce contention
2) Improve CPU usage
3) Reduce latency
4) Improve parallelism
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
Re: [I] [Performance] Large Group By Optimization [pinot]
Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on issue #10498:
URL: https://github.com/apache/pinot/issues/10498#issuecomment-1789340981
I think there are two issues:
1. When multiple threads merge all group-by results into a relatively low-cardinality group set (e.g. num-of-group ~= num-of-thread), the concurrent lock/unlock operations on the index map block each other.
2. When the group-by results are merged into a relatively high-cardinality group set (e.g. num-of-group ~= num-of-row), the overhead of managing each group in the concurrent index map becomes significant.
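To make the two cases above concrete, here is a minimal sketch of several threads merging into one shared map. It is an illustration only, not Pinot's code: `java.util.concurrent.ConcurrentHashMap` stands in for the concurrent index map, and the class name, method name, and the COUNT-style aggregation are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedMapMergeSketch {

    // Hypothetical stand-in for the shared group-by table: group key -> running count.
    static Map<Integer, Long> mergeShared(int numThreads, int rowsPerThread, int numGroups) {
        Map<Integer, Long> shared = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        for (int t = 0; t < numThreads; t++) {
            final int offset = t * rowsPerThread;
            pool.submit(() -> {
                for (int i = 0; i < rowsPerThread; i++) {
                    int groupKey = (offset + i) % numGroups;
                    // merge() is atomic per key, so threads that update the same key
                    // (or the same hash bin) at the same time block one another.
                    shared.merge(groupKey, 1L, Long::sum);
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared;
    }

    public static void main(String[] args) {
        // Case 1: num-of-group ~= num-of-thread, every thread hits the same few keys.
        Map<Integer, Long> low = mergeShared(4, 100_000, 4);
        // Case 2: num-of-group ~= num-of-row, every row creates its own map entry.
        Map<Integer, Long> high = mergeShared(4, 100_000, 400_000);
        System.out.println("low-cardinality groups: " + low.size());
        System.out.println("high-cardinality groups: " + high.size());
    }
}
```

In the first call all threads compete for the same four entries; in the second, nearly every `merge()` allocates a new entry, so the per-group bookkeeping dominates.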
Re: [I] [Performance] Large Group By Optimization [pinot]
Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on issue #10498:
URL: https://github.com/apache/pinot/issues/10498#issuecomment-1789328831
The major hotspot (bottleneck) is the step in which multiple threads merge all group-by results into a single indexed table.
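For comparison, one common way to take pressure off a single shared table is to let each thread aggregate into its own private map and combine the partial results afterwards. The sketch below only illustrates that general pattern; it is not necessarily the approach taken for this issue, and the class name, method name, and COUNT-style aggregation are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LocalThenCombineSketch {

    // Hypothetical helper: each worker counts rows per group in a private map,
    // then the caller combines the partial maps on a single thread.
    static Map<Integer, Long> aggregate(int numThreads, int rowsPerThread, int numGroups) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<Map<Integer, Long>>> partials = new ArrayList<>();
            for (int t = 0; t < numThreads; t++) {
                final int offset = t * rowsPerThread;
                partials.add(pool.submit(() -> {
                    // No other thread touches this map, so no locking is needed here.
                    Map<Integer, Long> local = new HashMap<>();
                    for (int i = 0; i < rowsPerThread; i++) {
                        local.merge((offset + i) % numGroups, 1L, Long::sum);
                    }
                    return local;
                }));
            }
            // Single-threaded combine step: one merge per (thread, group) pair
            // instead of one contended merge per row.
            Map<Integer, Long> result = new HashMap<>();
            for (Future<Map<Integer, Long>> partial : partials) {
                partial.get().forEach((key, count) -> result.merge(key, count, Long::sum));
            }
            return result;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<Integer, Long> counts = aggregate(4, 100_000, 4);
        System.out.println("groups: " + counts.size());
    }
}
```

The trade-off is that the combine step runs serially and each thread holds its own copy of the groups it has seen, so this helps most when the number of groups is small relative to the number of rows.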
[GitHub] [pinot] walterddr commented on issue #10498: [Performance] Group By Optimization
Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on issue #10498:
URL: https://github.com/apache/pinot/issues/10498#issuecomment-1492648937
Do we have a detailed SQL and data example for the problem statement of this optimization? Specifically, for "large group-by": are we targeting a high-cardinality group set, or a low-cardinality group set over a large-volume dataset?