Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/04/09 16:34:36 UTC

[GitHub] [incubator-druid] Dylan1312 commented on issue #6814: [Discuss] Replacing hyperUnique as 'default' distinct count sketch

URL: https://github.com/apache/incubator-druid/issues/6814#issuecomment-481326881
 
 
   We've been evaluating the DataSketches implementation of HLL against Druid's own implementation.
   
   In early testing we've noticed that performance of the DataSketches version seems significantly worse across various configurations.
   
   Curious whether anyone else has seen similar results? When aggregating over a column of HLL values, proportionally a lot more time appears to be spent in:
   
   `org.apache.druid.query.aggregation.datasketches.hll.HllSketchObjectStrategy.fromByteBuffer`
   
   Maybe the benchmarks don't account for deserialisation time, which would explain why we see contradictory results? Happy to put together more detailed information.
