Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/07/16 10:32:09 UTC

[GitHub] [incubator-druid] clintropolis opened a new pull request #8089: add CachingClusteredClient benchmark, refactor some stuff

URL: https://github.com/apache/incubator-druid/pull/8089
 
 
   ### Description
   
   This PR adds a benchmark for `CachingClusteredClient` and some refactoring of the query processing pipeline to provide the foundation for testing approaches to parallel broker merges.
   
   Benchmarks can be run with a command like the following:
   
   ```
   java -Ddruid.benchmark.cacheDir=./tmp/benches/ -jar benchmarks/target/benchmarks.jar org.apache.druid.benchmark.query.CachingClusteredClientBenchmark
   ```
   
   Substitute the benchmark cache directory as appropriate.
   
   #### Background
   I'm having a go at parallel broker merges, making another attempt at the goals of #5913 and #6629, and eventually plan to try the `ForkJoinPool` in `asyncMode` approach suggested by @leventov in [this thread](https://github.com/apache/incubator-druid/pull/6629#discussion_r241089247). Before that, to untangle things a bit, I've taken the benchmarks from #6629 (credit to @jihoonson) and updated and simplified them to take advantage of some of the changes to `SegmentGenerator` from #6794, which allow a persistent cache of the generated benchmark segments and therefore much faster benchmarking. I've also extracted some of the useful refactorings and got a bit more adventurous. This should isolate these supporting changes from any future PR that adds parallel merging, reducing review overhead.
   
   #### Refactoring
   
   ##### `CombiningFunction<T>`
   Added `CombiningFunction<T>`, a new `@FunctionalInterface` that replaces `BinaryFn<Type1, Type2, OutType>`, since all actual usages were of the form `BinaryFn<T, T, T>` and were used strictly for merging sequences, iterators, iterables, etc.
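   
   As a rough illustration only (not the code in this PR), a single-type combining interface could look like the sketch below. The method name `apply`, the helper `combineAll`, and the wrapper class `CombiningFunctionSketch` are assumptions made for the example.
   
   ```java
   import java.util.Arrays;
   import java.util.List;

   class CombiningFunctionSketch
   {
     // Assumed shape: one type parameter instead of BinaryFn's three.
     // The method name "apply" is a guess, not taken from the PR.
     @FunctionalInterface
     interface CombiningFunction<T>
     {
       T apply(T left, T right);
     }

     // Hypothetical helper: folds values left to right with the combining function.
     // Returns null for an empty list.
     static <T> T combineAll(List<T> values, CombiningFunction<T> fn)
     {
       T result = null;
       for (T value : values) {
         result = (result == null) ? value : fn.apply(result, value);
       }
       return result;
     }

     public static void main(String[] args)
     {
       CombiningFunction<Long> sum = (a, b) -> a + b;
       System.out.println(combineAll(Arrays.asList(1L, 2L, 3L), sum));
     }
   }
   ```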
   
   ##### `QueryToolChest` and `ResultMergeQueryRunner`
   In order to separate the mechanisms that are useful during a merge from the merge implementation itself, `QueryToolChest` now has two additional methods:
   
   ```
   CombiningFunction<ResultType> createMergeFn(Query<ResultType> query)
   ```
   and
   ```
   Ordering<ResultType> createOrderingFn(Query<ResultType> query)
   ```
   
   For group-by queries, `GroupByStrategy` also has these method signatures, since `GroupByQueryQueryToolChest` delegates them to the strategy.
   
   These methods are passed into a refactored, non-abstract `ResultMergeQueryRunner` as function generators: given a `Query`, they produce a `CombiningFunction` and an `Ordering`, respectively.
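   
   To make the "function generator" idea concrete, here is a simplified sketch, not the actual `ResultMergeQueryRunner`. It uses `java.util.function.BinaryOperator` and `java.util.Comparator` as stand-ins for `CombiningFunction` and `Ordering`, and `SketchQuery`, `MergeRunner`, and `exampleRunner` are hypothetical names invented for the example.
   
   ```java
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.Comparator;
   import java.util.List;
   import java.util.function.BinaryOperator;
   import java.util.function.Function;

   class ResultMergeSketch
   {
     // Hypothetical stand-in for a query; carries only what the generators need.
     static final class SketchQuery
     {
       final boolean descending;

       SketchQuery(boolean descending)
       {
         this.descending = descending;
       }
     }

     // Non-abstract runner: merge behaviour comes from two generators keyed on the query.
     static final class MergeRunner<T>
     {
       private final Function<SketchQuery, Comparator<T>> orderingGenerator;
       private final Function<SketchQuery, BinaryOperator<T>> mergeFnGenerator;

       MergeRunner(
           Function<SketchQuery, Comparator<T>> orderingGenerator,
           Function<SketchQuery, BinaryOperator<T>> mergeFnGenerator
       )
       {
         this.orderingGenerator = orderingGenerator;
         this.mergeFnGenerator = mergeFnGenerator;
       }

       // Combines adjacent results that the ordering treats as equal.
       // The input is assumed to be sorted by that same ordering already.
       List<T> run(SketchQuery query, List<T> sortedResults)
       {
         Comparator<T> ordering = orderingGenerator.apply(query);
         BinaryOperator<T> mergeFn = mergeFnGenerator.apply(query);
         List<T> out = new ArrayList<>();
         T current = null;
         for (T next : sortedResults) {
           if (current == null) {
             current = next;
           } else if (ordering.compare(current, next) == 0) {
             current = mergeFn.apply(current, next);
           } else {
             out.add(current);
             current = next;
           }
         }
         if (current != null) {
           out.add(current);
         }
         return out;
       }
     }

     // Example wiring: rows are {timestamp, count}; equal timestamps have their counts summed.
     static MergeRunner<long[]> exampleRunner()
     {
       Comparator<long[]> byTime = Comparator.comparingLong((long[] r) -> r[0]);
       return new MergeRunner<long[]>(
           q -> q.descending ? byTime.reversed() : byTime,
           q -> (a, b) -> new long[]{a[0], a[1] + b[1]}
       );
     }

     public static void main(String[] args)
     {
       List<long[]> rows = Arrays.asList(new long[]{1, 10}, new long[]{1, 5}, new long[]{2, 7});
       for (long[] row : exampleRunner().run(new SketchQuery(false), rows)) {
         System.out.println(row[0] + "=" + row[1]);
       }
     }
   }
   ```
   
   Passing generators rather than subclassing means the same two functions could in principle be handed to a different merge implementation later.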
   
   ##### `ConnectionCountServerSelectorStrategy` is now `WeightedServerSelectorStrategy`
   I did not refactor `QueryableDruidServer` in quite the same manner as #6629, but I did modify `QueryableDruidServer` and `QueryRunner` to add a `getWeight` method, as suggested by @drcrallen in [this comment thread](https://github.com/apache/incubator-druid/pull/6629#discussion_r240789022). This makes the selector strategy a bit more generic, instead of hard-casting `QueryRunner` to a `DirectDruidClient` to get the number of connections.
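   
   The sketch below shows the general idea of selecting by a generic weight rather than by casting to a concrete client type. It is not the strategy implementation from this PR: the `Weighted` interface, the `int` return type of `getWeight`, the `SketchServer` class, and the "lowest weight wins" rule are all assumptions made for the example.
   
   ```java
   import java.util.Arrays;
   import java.util.Collection;
   import java.util.Comparator;
   import java.util.Optional;

   class WeightedSelectionSketch
   {
     // Hypothetical: anything that can report a load weight.
     interface Weighted
     {
       int getWeight();
     }

     // Hypothetical server whose weight is its number of open connections.
     static final class SketchServer implements Weighted
     {
       final String name;
       final int openConnections;

       SketchServer(String name, int openConnections)
       {
         this.name = name;
         this.openConnections = openConnections;
       }

       @Override
       public int getWeight()
       {
         return openConnections;
       }
     }

     // Picks the candidate with the lowest weight; no cast to a concrete type is needed.
     static <S extends Weighted> Optional<S> pick(Collection<S> servers)
     {
       return servers.stream().min(Comparator.comparingInt((S s) -> s.getWeight()));
     }

     public static void main(String[] args)
     {
       Optional<SketchServer> picked = pick(Arrays.asList(
           new SketchServer("a", 5),
           new SketchServer("b", 2),
           new SketchServer("c", 9)
       ));
       System.out.println(picked.map(s -> s.name).orElse("none"));
     }
   }
   ```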
   
   #### Removed
   `OrderedMergingIterator`, `OrderedMergingSequence`, and `SortingMergeIterator` have been removed, since they were used only by their own tests.
   
   <hr>
   
   This PR has:
   - [ ] been self-reviewed.
   - [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   