Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/03/12 19:15:02 UTC

[GitHub] [incubator-druid] peferron commented on issue #7236: Further improve caching documentation.

peferron commented on issue #7236: Further improve caching documentation.
URL: https://github.com/apache/incubator-druid/pull/7236#issuecomment-472143133
 
 
   @gianm I'm glad to see that the original PR led to a comprehensive rewrite of the caching doc :)
   
   What I'm curious about is the performance of broker merging in a fully cached scenario. Imagine that all segment results are available in memcached. At what point does a single broker that fetches all results from the cache in bulk and merges them itself start falling behind farming the work out to historicals? Historical caching has its own issues, such as the inability to fetch results in bulk and a lower local cache hit ratio when replication > 1, so I have a feeling that this threshold could actually be quite high when merging is cheap (simple sums, no HLLs, etc.). It's probably really hard to write a good rule of thumb for this in the doc though, since it depends on many factors such as the number of historicals, the number of segments, result size, result merging cost, etc.
   
   What could be useful, though, is a curated list of holistic caching setups for the entire cluster. It's easy to get lost among all the individual settings for using/populating local/remote/hybrid-L1/hybrid-L2 segment-level caches in the brokers, historicals, and MiddleManagers (MMs). But probably only a handful of distinct combinations of these settings make sense.
   
   For example, a simple setup that would be a good starting point for most users would be:
   
   - Broker: `useResultLevelCache = true`, `populateResultLevelCache = true`
   - Historical: `useCache = true`, `populateCache = true`
   - Everything else left at its default.
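   
   Spelled out as `runtime.properties` entries, that starting point would look roughly like the sketch below. I'm assuming the `druid.broker.cache.*` and `druid.historical.cache.*` property prefixes from the configuration reference, so double-check the exact names against the docs for your Druid version:
   
   ```properties
   # Broker runtime.properties (sketch): result-level cache only.
   # Prefix assumed from the configuration reference.
   druid.broker.cache.useResultLevelCache=true
   druid.broker.cache.populateResultLevelCache=true
   
   # Historical runtime.properties (sketch): segment-level cache only.
   druid.historical.cache.useCache=true
   druid.historical.cache.populateCache=true
   
   # Everything else (cache type, sizes, broker segment-level cache, MMs) stays at its default.
   ```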
   
   A handful of other setups may make sense in more specific scenarios. The discussion in https://github.com/apache/incubator-druid/issues/4947 contains quite elaborate schemes.
   
   I'm not familiar enough with caching perf at the moment to write such a list of recommended setups, but I intend to benchmark a few of these setups later so perhaps I'll give it a shot after that.

