You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/12/03 08:31:30 UTC

[GitHub] [incubator-druid] clintropolis opened a new pull request #8981: add query metrics for broker parallel merges, off by default

clintropolis opened a new pull request #8981: add query metrics for broker parallel merges, off by default
URL: https://github.com/apache/incubator-druid/pull/8981
 
 
   ### Description
   
   This PR is a follow-up to #8578, adding a handful of query metrics that I believe are interesting, but taking the conservative approach in that all of them are off by default, meaning a custom extension implementing `QueryMetrics` is necessary to actually emit them. `ParallelMergeCombiningSequence` is in `druid-core` where as `QueryMetrics` and friends are in `druid-processing`, so this is done mechanically via a `Consumer<ParallelMergeCombiningSequence.MergeCombineMetrics>` that is supplied to to the sequence, where `ParallelMergeCombiningSequence.MergeCombineMetrics` is the type that all of the metrics from the fork join tasks are accumulated. This allows the consumer, `CachingClusteredClient` in our case, to define how to report the metrics.
   
   I did not document these metrics because they are off by default, and don't want to give operators false hope before they get in deep and realize they need to make a custom extension or whatever. This omission is in favor of someday refining this process so that maybe we bundle several different 'profiles' to enabling emitting a set of metrics based on the profile, or some other system that allows operators to customize without a custom extension.
   
   Also absent are any sort of aggregate metrics, such as pool utilization over some periodic collection interval or whatever. I would like to look into this as a follow-up, since it seems like there are potentially many metrics that would maybe make sense to collect like this, so I'd like to think a bit harder about how to do this so it's not a one-off solution.
   
   This PR also modifies the parallel merge config to disable using the fork join pool if the computed level of pool parallelism isn't more than 2, since you need at least 3 tasks to do the 2 layer merge in parallel.
   
   <hr>
   
   This PR has:
   - [x] been self-reviewed.
   - [ ] added documentation for new or modified features or behaviors.
   - [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths.
   
   ##### Key changed/added classes in this PR
    * `ParallelMergeCombiningSequence`
    * `QueryMetrics`
   * `DruidProcessingConfig`
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org