Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2018/11/08 08:39:51 UTC

[GitHub] clintropolis opened a new pull request #6588: autosize processing buffers based on direct memory sizing by default

URL: https://github.com/apache/incubator-druid/pull/6588
 
 
   This PR changes the `DruidProcessingConfig` property `druid.processing.buffer.sizeBytes` to compute its default from the amount of direct memory, the number of processing threads, and the number of merge buffers, instead of using a fixed 1GiB default buffer size. This should be much friendlier behavior out of the box: the direct memory given to the process is used reasonably efficiently, and operators who still wish to fine-tune these settings can do so.
   
   On process startup, `DruidProcessingModule` does a check:
   
   ```
   memoryNeeded = druid.processing.buffer.sizeBytes * (druid.processing.numMergeBuffers + druid.processing.numThreads + 1)
   ```
   
   to validate that the process has been given enough direct memory. The configs for `numThreads` and `numMergeBuffers` produce reasonable defaults if not manually set, but `sizeBytes` has a fixed default of 1GiB, which may or may not work depending on the direct memory settings and core count. This PR rearranges the formula to solve for `sizeBytes` and uses the result as its default value. I'm not certain this is the optimal formula: a large number of merge buffers eats into the space reserved for things like decompressing blocks of segments, while also increasing the processing throughput of simultaneous group-by queries, so the two goals conflict. I think adjustments can be made in a future PR.
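   
   As a rough illustration of the rearrangement described above, here is a minimal sketch, not the PR's actual code. The class `BufferSizing` and its method names are hypothetical, and the real default may apply bounds that are not shown here.
   
   ```java
   // Hypothetical sketch: names are illustrative and not taken from Druid.
   final class BufferSizing {
       private BufferSizing() {}

       /** Number of buffers the startup check accounts for. */
       static int totalBuffers(int numMergeBuffers, int numThreads) {
           return numMergeBuffers + numThreads + 1;
       }

       /** Direct memory required for a given buffer size (the startup check). */
       static long memoryNeeded(long sizeBytes, int numMergeBuffers, int numThreads) {
           return sizeBytes * totalBuffers(numMergeBuffers, numThreads);
       }

       /**
        * The same formula solved for sizeBytes. Integer division rounds down,
        * so memoryNeeded(result, ...) never exceeds maxDirectMemoryBytes.
        */
       static long defaultSizeBytes(long maxDirectMemoryBytes, int numMergeBuffers, int numThreads) {
           return maxDirectMemoryBytes / totalBuffers(numMergeBuffers, numThreads);
       }

       public static void main(String[] args) {
           long maxDirect = 4L * 1024 * 1024 * 1024; // assume 4GiB of direct memory
           int numMergeBuffers = 2;
           int numThreads = 7;
           long sizeBytes = defaultSizeBytes(maxDirect, numMergeBuffers, numThreads);
           System.out.println("sizeBytes = " + sizeBytes);
           System.out.println("memoryNeeded = " + memoryNeeded(sizeBytes, numMergeBuffers, numThreads));
       }
   }
   ```
   
   With 4GiB of direct memory, 2 merge buffers, and 7 threads, the sketch divides the memory across 10 buffers, so the derived size always passes the startup check by construction.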
   
   changes:
   * `DruidProcessingConfig.intermediateComputeSizeBytes()` now computes a default value based on `-XX:MaxDirectMemorySize`
   * Introduces a `RuntimeInfo` class that wraps `Runtime.getRuntime()` to expose available processors and memory sizing information, mostly to allow control over these values in unit tests without setting flags on the JVM process; it also consolidates this logic in one place.
   * `org.apache.druid.common.utils.VMUtils` has been moved and renamed to `org.apache.druid.utils.JvmUtils`, which holds a statically injected `RuntimeInfo` that all Druid sources now use instead of calling `Runtime.getRuntime()` methods directly.
   * Adds `RuntimeInfoModule` to the default injector to inject `RuntimeInfo` into `JvmUtils`
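   
   To make the testing motivation behind `RuntimeInfo` concrete, here is a minimal sketch of the pattern. The classes `RuntimeInfoSketch`, `FixedRuntimeInfo`, and `JvmUtilsSketch` and their method names are hypothetical stand-ins, not the classes in this PR, and a plain static setter takes the place of injection.
   
   ```java
   // Hypothetical sketch of wrapping Runtime.getRuntime() so tests can override it.
   class RuntimeInfoSketch {
       int getAvailableProcessors() {
           return Runtime.getRuntime().availableProcessors();
       }

       long getMaxHeapSizeBytes() {
           return Runtime.getRuntime().maxMemory();
       }
   }

   // Test double returning fixed values, so no JVM flags are needed in unit tests.
   final class FixedRuntimeInfo extends RuntimeInfoSketch {
       private final int processors;
       private final long maxHeapBytes;

       FixedRuntimeInfo(int processors, long maxHeapBytes) {
           this.processors = processors;
           this.maxHeapBytes = maxHeapBytes;
       }

       @Override
       int getAvailableProcessors() {
           return processors;
       }

       @Override
       long getMaxHeapSizeBytes() {
           return maxHeapBytes;
       }
   }

   // Static holder that callers use instead of Runtime.getRuntime() directly.
   final class JvmUtilsSketch {
       private static RuntimeInfoSketch runtimeInfo = new RuntimeInfoSketch();

       private JvmUtilsSketch() {}

       static void setRuntimeInfo(RuntimeInfoSketch info) {
           runtimeInfo = info;
       }

       static RuntimeInfoSketch getRuntimeInfo() {
           return runtimeInfo;
       }

       public static void main(String[] args) {
           System.out.println("real processors: " + getRuntimeInfo().getAvailableProcessors());
           setRuntimeInfo(new FixedRuntimeInfo(4, 1024L));
           System.out.println("fixed processors: " + getRuntimeInfo().getAvailableProcessors());
       }
   }
   ```
   
   A unit test can install `FixedRuntimeInfo` before exercising code that derives defaults from core count or memory, which is the kind of control the `RuntimeInfo` bullet above describes.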
