Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/04/25 00:23:24 UTC

[GitHub] [incubator-druid] gianm commented on issue #6743: IncrementalIndex generally overestimates theta sketch size

gianm commented on issue #6743: IncrementalIndex generally overestimates theta sketch size
URL: https://github.com/apache/incubator-druid/issues/6743#issuecomment-486474348
 
 
   I like the idea of using a chunk of the processing buffer (since it already exists, and because it doesn't incur any JVM heap operations or put any pressure on the GC) but I don't like the idea of trying to rebuild a fancy memory allocator. So I'm starting to wonder how simple of an allocator we can get away with.
   
   I wonder how bad it would be if aggregators were never allowed to free memory. Basically, let them allocate whatever chunks they want, and release everything automatically at the end of the query, but don't let the aggregators free anything themselves. And if the total allocated memory exceeds the memory limit for a query, then fail the query.
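   
   To make that concrete, a minimal sketch of such an allocator might look like the following. The class name `BumpAllocator` and all of its methods are hypothetical, not existing Druid code, and the backing `ByteBuffer` just stands in for a chunk of the processing buffer.

```java
import java.nio.ByteBuffer;

/** Hypothetical "allocate-only" allocator over a chunk of a buffer (illustration only). */
public class BumpAllocator {
    private final ByteBuffer buffer; // backing memory, standing in for part of the processing buffer
    private final int limit;         // assumed per-query memory limit, in bytes
    private int position = 0;        // offset of the next unallocated byte

    public BumpAllocator(ByteBuffer buffer, int limit) {
        if (limit < 0 || limit > buffer.capacity()) {
            throw new IllegalArgumentException("limit must be between 0 and the buffer capacity");
        }
        this.buffer = buffer;
        this.limit = limit;
    }

    /** Hands out the next `size` bytes. There is deliberately no free() method. */
    public ByteBuffer allocate(int size) {
        if (size < 0) {
            throw new IllegalArgumentException("size must be non-negative");
        }
        if (size > limit - position) {
            // Exceeding the limit fails the query rather than trying to reclaim memory.
            throw new IllegalStateException("Query exceeded memory limit of " + limit + " bytes");
        }
        ByteBuffer view = buffer.duplicate();
        view.position(position);
        view.limit(position + size);
        position += size;
        return view.slice(); // independent view over exactly the allocated region
    }

    /** Total bytes handed out so far. */
    public int allocatedBytes() {
        return position;
    }

    /** Called once when the query finishes; releases every allocation at once. */
    public void reset() {
        position = 0;
    }
}
```

   The appeal is that allocation is a bounds check plus a pointer bump, and there is no free list or fragmentation to manage. The cost is that an aggregator that outgrows its chunk has to allocate a new one and abandon the old one until the query ends.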
   
   > @gianm for IncrementalIndex , if above is done, simplest would be to use BufferAggregator and it would be more accurate as well than trying to do sizeOf(aggregator) . Current implementation to spill based on getMaxIntermediateSize() is puzzling to me as the number returned there is totally unrelated to what smallest/current/largest heap utilization of on-heap Aggregator would be. That number is only relevant when BufferAggregator is used.
   
   IIRC we talked a bit about this on the PR - yeah, it is a non-sequitur, but:
   
   1. It's likely that the `getMaxIntermediateSize` for buffer aggregators is also a rough but reasonable cap on how much on-heap memory might be used by an on-heap aggregator.
   2. We considered moving IncrementalIndex's aggregators off-heap out of scope at the time.
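   
   As a rough illustration of why a cap-based estimate tends to run high, here is a sketch of a spill decision that charges every row the full cap of every aggregator. `MaxSizeEstimate`, `SizedFactory`, `estimateBytes`, and `shouldSpill` are hypothetical names made up for this example; only the method name `getMaxIntermediateSize` is borrowed from the discussion, and this is not the actual IncrementalIndex logic.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical cap-based size estimate (illustration only, not Druid code). */
public class MaxSizeEstimate {
    /** Stand-in for an aggregator factory; only the size cap matters here. */
    public interface SizedFactory {
        int getMaxIntermediateSize();
    }

    /** Upper-bound estimate: each row is charged the full cap of every aggregator. */
    public static long estimateBytes(List<SizedFactory> factories, long numRows) {
        long perRow = 0;
        for (SizedFactory factory : factories) {
            perRow += factory.getMaxIntermediateSize();
        }
        return perRow * numRows;
    }

    /** Spill once the upper-bound estimate exceeds the allowed number of bytes. */
    public static boolean shouldSpill(List<SizedFactory> factories, long numRows, long maxBytes) {
        return estimateBytes(factories, numRows) > maxBytes;
    }

    public static void main(String[] args) {
        // Assumed caps: a small fixed-size aggregator and a much larger sketch-like one.
        List<SizedFactory> factories = Arrays.asList(() -> 16, () -> 1024);
        System.out.println(estimateBytes(factories, 1000));
        System.out.println(shouldSpill(factories, 1000, 1_000_000L));
    }
}
```

   If most aggregators hold far less than their cap, as a sparsely filled sketch would, the estimate overshoots real usage and triggers a spill earlier than necessary.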
