Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/04/24 22:24:58 UTC

[GitHub] [incubator-druid] himanshug commented on issue #6743: IncrementalIndex generally overestimates theta sketch size

URL: https://github.com/apache/incubator-druid/issues/6743#issuecomment-486450735
 
 
   @leerho It appears that Postgres does have a memory allocator that provides the "palloc" and "pfree" functions. @gianm was suggesting something similar. In that case, the DS library would offer some way of passing in those functions. Druid (or other users of DS) would implement the memory allocator in whatever way makes the most sense for them (e.g. allocating a big chunk of memory at startup and then handing out pieces of it in "palloc", or delegating each "palloc" to the underlying JVM heap or OS ...).
   I looked into this a long time ago, and one way of hacking it was to use "MemoryRegion" and "MemoryRequest" as in https://github.com/himanshug/druid/blob/growable_aggregator_final/extensions/datasketches/src/main/java/io/druid/query/aggregation/datasketches/theta/SketchResizableBufferAggregator.java#L120 (as you might guess, this is based on a pretty old version of the DS library :) ).
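
   To make the pluggable-allocator idea concrete, here is a rough sketch of what I have in mind. Everything in it is hypothetical: `SketchAllocator`, `HeapAllocator`, `ArenaAllocator` and `grow` are names made up for illustration, not existing DS library or Druid APIs, and plain `ByteBuffer` stands in for whatever memory abstraction the library would really use.

```java
import java.nio.ByteBuffer;

/** Hypothetical "palloc"/"pfree" pair that a library user would supply. */
interface SketchAllocator {
    ByteBuffer allocate(int bytes);
    void free(ByteBuffer buffer);
}

/** Strategy 1: delegate every request to the JVM heap. */
final class HeapAllocator implements SketchAllocator {
    public ByteBuffer allocate(int bytes) {
        return ByteBuffer.allocate(bytes);
    }

    public void free(ByteBuffer buffer) {
        // Nothing to do: the garbage collector reclaims heap buffers.
    }
}

/** Strategy 2: reserve one big chunk at startup and hand out slices of it. */
final class ArenaAllocator implements SketchAllocator {
    private final ByteBuffer arena;
    private int offset = 0;

    ArenaAllocator(int capacityBytes) {
        this.arena = ByteBuffer.allocateDirect(capacityBytes);
    }

    public ByteBuffer allocate(int bytes) {
        if (bytes < 0 || bytes > arena.capacity() - offset) {
            throw new IllegalStateException("arena exhausted");
        }
        // Carve [offset, offset + bytes) out of the arena as an independent view.
        ByteBuffer view = arena.duplicate();
        view.position(offset);
        view.limit(offset + bytes);
        offset += bytes;
        return view.slice();
    }

    public void free(ByteBuffer buffer) {
        // Simplification: a bump allocator does not reuse individual slices.
    }

    int usedBytes() {
        return offset;
    }
}

final class AllocatorDemo {
    /** What a growable sketch could do when it outgrows its current buffer. */
    static ByteBuffer grow(SketchAllocator allocator, ByteBuffer old, int newBytes) {
        ByteBuffer bigger = allocator.allocate(newBytes);
        ByteBuffer source = old.duplicate();
        source.clear();      // read the old buffer from 0 to its capacity
        bigger.put(source);  // copy the existing contents
        bigger.clear();      // reset position for the caller
        allocator.free(old);
        return bigger;
    }

    public static void main(String[] args) {
        ArenaAllocator arena = new ArenaAllocator(1024);
        ByteBuffer small = arena.allocate(100);
        ByteBuffer bigger = grow(arena, small, 400);
        System.out.println("capacity=" + bigger.capacity() + " used=" + arena.usedBytes());
    }
}
```

   The point is only that the library would see the interface while the caller decides where the bytes come from; a real arena would also need to reuse freed slices, which this sketch skips.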
   
   @gianm for IncrementalIndex, if the above is done, the simplest approach would be to use BufferAggregator, which would also be more accurate than trying to do sizeOf(aggregator). The current implementation, which spills based on `getMaxIntermediateSize()`, is puzzling to me, as the number returned there is totally unrelated to what the smallest/current/largest heap utilization of an on-heap Aggregator would be. That number is only relevant when BufferAggregator is used.
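
   A toy illustration of why the number only makes sense for the buffer-based case, assuming each row reserves a fixed slot of `getMaxIntermediateSize()` bytes in a shared buffer. `FixedSlotStore` and its methods are hypothetical names, not Druid classes, and a running long sum stands in for a real sketch.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-in for buffer-based aggregation with one fixed slot per row. */
final class FixedSlotStore {
    private final ByteBuffer buffer;
    private final int slotBytes; // plays the role of getMaxIntermediateSize()
    private int rows = 0;

    FixedSlotStore(int capacityBytes, int slotBytes) {
        if (slotBytes < Long.BYTES) {
            throw new IllegalArgumentException("slot must hold at least one long");
        }
        this.buffer = ByteBuffer.allocate(capacityBytes);
        this.slotBytes = slotBytes;
    }

    /** Returns the offset of the new row's slot, or -1 when it is time to spill. */
    int addRow() {
        int position = rows * slotBytes;
        if (position + slotBytes > buffer.capacity()) {
            return -1;
        }
        rows++;
        return position;
    }

    /** Aggregates in place: here just a running sum stored at the slot start. */
    void aggregate(int position, long value) {
        buffer.putLong(position, buffer.getLong(position) + value);
    }

    long get(int position) {
        return buffer.getLong(position);
    }

    /** Exact by construction: no estimation of object sizes is needed. */
    long bytesUsed() {
        return (long) rows * slotBytes;
    }
}
```

   Under that assumption the spill decision is plain arithmetic on rows times slot size. An on-heap aggregator object has no such fixed footprint, so multiplying the same number by the row count does not tell you its actual heap use.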
