Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/01/03 19:47:54 UTC

[GitHub] gianm commented on issue #6795: Different groupBy strategies for ThetaSketch and HyperUnique?

gianm commented on issue #6795: Different groupBy strategies for ThetaSketch and HyperUnique? 
URL: https://github.com/apache/incubator-druid/issues/6795#issuecomment-451256186
 
 
   Hi @csimplestring,
   
   This is possibly related to #6743 and may come from the same underlying problem: the theta sketch has a relatively large maximum size, even though it typically won't use all of it. This causes IncrementalIndex to overestimate its size, and it also causes groupBy to reserve a lot more space for it than is truly necessary. In turn, this can cause inefficiency and potentially slow down the query.
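
   To make the space argument concrete, here is a rough sketch of how a fixed-size buffer fills up when every row slot must be sized for the aggregator's maximum footprint. The function name and all byte counts are illustrative assumptions, not measured Druid values or Druid APIs:

```python
def slots_in_buffer(buffer_bytes, key_bytes, max_agg_bytes):
    """Number of grouping slots that fit when each slot reserves the
    aggregator's *maximum* size, regardless of what it actually uses."""
    return buffer_bytes // (key_bytes + max_agg_bytes)


# All numbers below are hypothetical, chosen only to show the effect.
BUFFER_BYTES = 1024 * 1024 * 1024   # assumed 1 GiB processing buffer
KEY_BYTES = 16                      # assumed grouping key size

small_max = slots_in_buffer(BUFFER_BYTES, KEY_BYTES, 1024)        # ~1 KiB max per aggregator
large_max = slots_in_buffer(BUFFER_BYTES, KEY_BYTES, 256 * 1024)  # ~256 KiB max per aggregator

print("slots with small maximum size:", small_max)
print("slots with large maximum size:", large_max)
```

   Under these assumptions, the aggregator with the larger maximum size leaves room for far fewer groups in the same buffer, even if each sketch actually stays small.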
   
   Could you try running the query with `"groupByStrategy": "v1"`? That one uses on-heap aggregators and so it doesn't suffer from the same problem. If it's faster, it probably means my guess above about what's happening is correct.
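
   As a minimal sketch of where that setting goes, the snippet below builds a groupBy query with the strategy set in the query context and prints it as JSON. The datasource, dimension, column, and interval names are hypothetical placeholders, so substitute the ones from your own query:

```python
import json


def build_groupby_query(strategy="v1"):
    """Build a groupBy query dict with the strategy set in the query context.

    Every name below (datasource, dimension, columns, interval) is a
    hypothetical placeholder, not taken from the original issue.
    """
    return {
        "queryType": "groupBy",
        "dataSource": "example_datasource",
        "granularity": "all",
        "intervals": ["2019-01-01/2019-01-02"],
        "dimensions": ["example_dimension"],
        "aggregations": [
            {
                "type": "thetaSketch",
                "name": "unique_users",
                "fieldName": "user_id_sketch",
            }
        ],
        # The setting suggested above goes in the query context.
        "context": {"groupByStrategy": strategy},
    }


print(json.dumps(build_groupby_query(), indent=2))
```

   Running the same query once with `"v1"` and once with `"v2"` and comparing the timings should show whether the strategy makes the difference.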
   
   This is one of the only situations where groupBy v1 is better than v2 (when using an aggregation type that has a large maximum size but a small typical size). Possibly the only situation at all.
