You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by "Chris M. Hostetter (Jira)" <ji...@apache.org> on 2022/04/05 23:27:00 UTC

[jira] [Updated] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()

     [ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris M. Hostetter updated LUCENE-10292:
----------------------------------------
    Attachment: LUCENE-10292.patch
      Assignee: Chris M. Hostetter
        Status: Open  (was: Open)

{quote}seems like it should be possible/better to change {{AnalyzingInfixSuggester.build()}} so that the {{searcherMgr}} is only repalced *after* we build the new index (and/or stop using a new {{IndexWriter}} on every {{AnalyzingInfixSuggester.build()}} call and just do a {{writer.deleteAll()}} instead?)
{quote}

Digging into the code a bit more i realized:
* Using {{writer.deleteAll()}} isn't really a viable "fix" because (by default) the writer doesn't exist (or is closed after {{build()}} 
* Another aspect of the existing impl is that if thread T1 enters the {{lookup()}} method and makes it past the {{if (searcherMgr == null)}} check, but then another thread T2 calls {{build()}} before the T1 reaches the synchronized block, then T1 will be blocked for the entire durration of the {{build()}} ... which is certainly less then ideal.

I worked up a patch with a test case to try and  demonstrate the problem, as well as a fix -- note that as written the test case won't always  "fail" cleaning with the existing impl, it can also deadlock because of how I'm using a Semaphore to "slow" down the {{build()}} -- triggering the blocked lookup thread situation described above.

The basic idea of the fix s to decouple the synchronization/locking needed for changing the writer from the synchronization/locking needed to change the SearchManager.  and replace the SearchManager only once the {{build()}} is complete. 

 I originally tried to replace the "R/W" locking of SearcherManager with an AtomicReference so we wouldn't need to have any synchronization blocks in {{lookup()}} at all; but I couldn't figure out a "safe" way to do that w/o ref counting the SearcherManager itself (to ensure that no build/add calls close the SearcherManager that {{lookup()}} gets a ref to before it has a change to {{acquire()}} on it (thank you {{testRandomNRT}} for catching that for me)

Feedback on this approach (or suggestions for better ways to solve this) welcome

> AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-10292
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10292
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Chris M. Hostetter
>            Assignee: Chris M. Hostetter
>            Priority: Major
>         Attachments: LUCENE-10292.patch
>
>
> I'm filing this based on anecdotal information from a Solr user w/o experiencing it first hand (and I don't have a test case to demonstrate it) but based on a reading of the code the underlying problem seems self evident...
> With all other Lookup implementations I've examined, it is possible to call {{lookup()}} regardless of whether another thread is concurrently calling {{build()}} – in all cases I've seen, it is even possible to call {{lookup()}} even if {{build()}} has never been called: the result is just an "empty" {{List<LookupResult>}} 
> Typically this is works because the {{build()}} method uses temporary datastructures until it's "build logic" is complete, at which point it atomically replaces the datastructures used by the {{lookup()}} method.   In the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method starts by closing & null'ing out the {{protected SearcherManager searcherMgr}} (which it only populates again once it's completed building up it's index) and then the lookup method starts with...
> {code:java}
>     if (searcherMgr == null) {
>       throw new IllegalStateException("suggester was not built");
>     }
> {code}
> ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any situation where another thread may be calling {{AnalyzingInfixSuggester.build()}}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org