Posted to issues@lucene.apache.org by "Ben Manes (Jira)" <ji...@apache.org> on 2019/11/03 19:41:00 UTC

[jira] [Commented] (LUCENE-8213) Cache costly subqueries asynchronously

    [ https://issues.apache.org/jira/browse/LUCENE-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16965724#comment-16965724 ] 

Ben Manes commented on LUCENE-8213:
-----------------------------------

Sorry to distract. I am trying to understand {{LRUQueryCache}}, as it has a lot of excellent engineering invested in it. For asynchronous caching without duplicate loads, wouldn't a {{CompletableFuture}} as the cached value be suitable? The cache doesn't handle cache stampedes, so the same entry may be computed more than once: the first thread to insert wins and the other threads discard their work. Ideally, with memoization, the other threads would defer to the first and avoid the costly operation. Similarly, if the global lock is held, the cache lookup is skipped and the value is computed but not stored, which means a busy lock reduces performance across all usages.
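
A minimal sketch of the memoization idea, assuming a plain {{ConcurrentHashMap}} keyed by the query. The class {{MemoizingAsyncCache}} and its {{get}} method are hypothetical names for illustration only; this is not how {{LRUQueryCache}} is implemented, and it omits eviction and size accounting entirely.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executor;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical illustration of future-based memoization; not part of Lucene. */
final class MemoizingAsyncCache<K, V> {
  private final ConcurrentMap<K, CompletableFuture<V>> futures = new ConcurrentHashMap<>();

  /** Returns the shared future for the key, scheduling the load at most once per entry. */
  CompletableFuture<V> get(K key, Function<? super K, ? extends V> loader, Executor executor) {
    CompletableFuture<V> existing = futures.get(key);
    if (existing != null) {
      return existing; // hit, or a load already in flight
    }
    CompletableFuture<V> created = new CompletableFuture<>();
    existing = futures.putIfAbsent(key, created);
    if (existing != null) {
      return existing; // lost the race: defer to the winner instead of recomputing
    }
    try {
      // Won the race: run the costly load on the executor, off the calling thread.
      executor.execute(() -> {
        try {
          created.complete(loader.apply(key));
        } catch (Throwable t) {
          futures.remove(key, created); // do not keep failed loads cached
          created.completeExceptionally(t);
        }
      });
    } catch (RuntimeException rejected) {
      // The executor refused the task; fail the future rather than leave it pending.
      futures.remove(key, created);
      created.completeExceptionally(rejected);
    }
    return created;
  }

  public static void main(String[] args) {
    MemoizingAsyncCache<String, Integer> cache = new MemoizingAsyncCache<>();
    AtomicInteger loads = new AtomicInteger();
    Function<String, Integer> costly = k -> {
      loads.incrementAndGet();
      return k.length();
    };
    CompletableFuture<Integer> first = cache.get("range", costly, ForkJoinPool.commonPool());
    CompletableFuture<Integer> second = cache.get("range", costly, ForkJoinPool.commonPool());
    System.out.println(first.join() + " " + second.join() + " loads=" + loads.get());
  }
}
```

Because the future is published before the load starts, concurrent callers for the same key receive the same future and wait on it, so the costly computation runs once per entry instead of once per racing thread.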

Would there be a benefit or liability in using a dedicated caching library like Caffeine? It handles concurrency more smoothly, which could be beneficial here. I am also curious whether the hit ratio of LRU could be improved upon, as search problems tend to have a strong frequency bias.

> Cache costly subqueries asynchronously
> --------------------------------------
>
>                 Key: LUCENE-8213
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8213
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/query/scoring
>    Affects Versions: 7.2.1
>            Reporter: Amir Hadadi
>            Priority: Minor
>              Labels: performance
>         Attachments: 0001-Reproduce-across-segment-caching-of-same-query.patch, thetaphi_Lucene-Solr-master-Linux_24839.log.txt
>
>          Time Spent: 20h 20m
>  Remaining Estimate: 0h
>
> IndexOrDocValuesQuery allows combining costly range queries with a selective lead iterator in an optimized way. However, at some point the range query gets cached by a querying thread in LRUQueryCache, which negates the optimization of IndexOrDocValuesQuery for that specific query.
> It would be nice to have an asynchronous caching implementation for such cases, so that queries involving IndexOrDocValuesQuery would have consistent performance characteristics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
