Posted to issues@lucene.apache.org by "Yannick Welsch (Jira)" <ji...@apache.org> on 2021/11/15 17:04:00 UTC

[jira] [Created] (LUCENE-10235) LRUQueryCache should not count never-cacheable queries as a miss

Yannick Welsch created LUCENE-10235:
---------------------------------------

             Summary: LRUQueryCache should not count never-cacheable queries as a miss
                 Key: LUCENE-10235
                 URL: https://issues.apache.org/jira/browse/LUCENE-10235
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Yannick Welsch


Hit and miss counts of a cache are typically used to check how effective a caching layer is. While looking at a system that exhibited a very high miss-to-hit ratio, I took a closer look at Lucene's LRUQueryCache and noticed that it counts as a miss any lookup for a query that it would never consider caching in the first place (e.g. TermQuery and others mentioned in UsageTrackingQueryCachingPolicy.shouldNeverCache).

The reason these are counted as a miss is that LRUQueryCache (in its scorerSupplier and bulkScorer methods) first does a lookup on the cache, incrementing the hit or miss counter, and only then, upon a miss, checks QueryCachingPolicy.shouldCache to decide whether that query should be put into the cache.
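
To make the ordering concrete, here is a minimal sketch of the accounting described above. It is a simplified model and not Lucene's actual LRUQueryCache code: the class name CountingCacheSketch, the use of strings as queries, and the Predicate standing in for QueryCachingPolicy.shouldCache are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical, simplified model of the accounting order described above.
// This is NOT Lucene's LRUQueryCache; all names here are illustrative.
class CountingCacheSketch {
    private final Set<String> cached = new HashSet<>();
    // Stands in for QueryCachingPolicy.shouldCache (assumption).
    private final Predicate<String> shouldCache;
    long hits;
    long misses;

    CountingCacheSketch(Predicate<String> shouldCache) {
        this.shouldCache = shouldCache;
    }

    void search(String query) {
        // Step 1: the lookup comes first and always updates a counter.
        if (cached.contains(query)) {
            hits++;
            return;
        }
        misses++;
        // Step 2: the policy is consulted only after the miss was recorded,
        // so a query the policy always rejects is a miss on every lookup.
        if (shouldCache.test(query)) {
            cached.add(query);
        }
    }
}
```

With a policy that rejects every query starting with "term:", each such lookup increments the miss counter even though the query could never have been a hit.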

This issue is made more complex by the fact that QueryCachingPolicy.shouldCache is a stateful method, and cacheability of a query can change over time (e.g. after appearing N times).

I'm opening this issue to discuss whether others also feel that the current way of accounting misses is unintuitive / confusing. I would also like to put forward a proposal to:
 * generalize the boolean QueryCachingPolicy.shouldCache method to return an enum instead (one of YES, NOT_RIGHT_NOW, NEVER), and only account queries that are (eventually) cacheable and not in the cache as a miss,
 * optionally introduce another metric for queries that are never cacheable, e.g. "ignored", and
 * optionally refine miss count into a count for items that are cacheable right away, and those that will eventually be cacheable.
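
The proposal could look roughly like the following sketch. Only the enum values YES, NOT_RIGHT_NOW and NEVER and the "ignored" metric name come from the proposal itself; the class TriStateCacheSketch, the nested Cacheability enum, the string queries and the Function standing in for the generalized policy method are assumptions for illustration, not an existing Lucene API.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch of the proposed accounting; not an existing Lucene API.
class TriStateCacheSketch {
    // Enum values taken from the proposal; the type name is an assumption.
    enum Cacheability { YES, NOT_RIGHT_NOW, NEVER }

    private final Set<String> cached = new HashSet<>();
    // Stands in for a generalized QueryCachingPolicy.shouldCache (assumption).
    private final Function<String, Cacheability> policy;
    long hits;
    long misses;
    long ignored; // lookups for queries that are never cacheable

    TriStateCacheSketch(Function<String, Cacheability> policy) {
        this.policy = policy;
    }

    void search(String query) {
        if (cached.contains(query)) {
            hits++;
            return;
        }
        // The policy is consulted before deciding how to account the lookup.
        switch (policy.apply(query)) {
            case NEVER:
                ignored++; // never cacheable: neither a hit nor a miss
                break;
            case NOT_RIGHT_NOW:
                misses++; // eventually cacheable, but not cached yet
                break;
            case YES:
                misses++; // cacheable right away: count the miss and cache it
                cached.add(query);
                break;
        }
    }
}
```

In this sketch the policy may still be stateful (for example, returning NOT_RIGHT_NOW until a query has been seen N times). Splitting the miss counter in two, as in the last bullet, would only require separate counters in the NOT_RIGHT_NOW and YES branches.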



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
