Posted to issues@lucene.apache.org by "Paweł Bugalski (Jira)" <ji...@apache.org> on 2021/02/19 12:46:00 UTC

[jira] [Commented] (LUCENE-9791) Monitor (aka Luwak) has concurrency issues related to BytesRefHash#find

    [ https://issues.apache.org/jira/browse/LUCENE-9791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287052#comment-17287052 ] 

Paweł Bugalski commented on LUCENE-9791:
----------------------------------------

I'm attaching a proposed patch that makes _org.apache.lucene.util.BytesRefHash#find_, and as a result _org.apache.lucene.monitor.Monitor_, thread safe. The patch replaces the _scratch1_ field with a local variable.

Some considerations: This patch does not change the single-threaded behaviour of the _BytesRefHash_ class. I think the _scratch1_ field was an attempt to optimize performance by reducing the number of allocations ([~simonw], maybe you could confirm that). I've checked that once the C2 compiler inlines the _find_ method, it removes the new _BytesRef_ allocation introduced by the patch.
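To illustrate the idea without reading the patch, here is a minimal sketch of the pattern. It is not the patch itself and not Lucene code: _ScratchSketch_, _Slice_, and both _find..._ methods are hypothetical stand-ins, and the lookup is a plain linear scan rather than a hash probe. It only shows why a scratch object held in an instance field is unsafe under concurrent reads, and how moving it to a local variable avoids the sharing.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical, simplified stand-in for the pattern discussed here; not Lucene code. */
final class ScratchSketch {

    /** Mutable view over a byte[] range, playing the role that BytesRef plays in Lucene. */
    static final class Slice {
        byte[] bytes;
        int offset;
        int length;
    }

    // All stored terms live back to back in one pool; starts[i]/lengths[i] locate term i.
    private final byte[] pool;
    private final int[] starts;
    private final int[] lengths;

    // Shared mutable scratch: fine single-threaded, racy once lookups run concurrently.
    private final Slice sharedScratch = new Slice();

    ScratchSketch(String... terms) {
        starts = new int[terms.length];
        lengths = new int[terms.length];
        byte[][] encoded = new byte[terms.length][];
        int total = 0;
        for (int i = 0; i < terms.length; i++) {
            encoded[i] = terms[i].getBytes(StandardCharsets.UTF_8);
            starts[i] = total;
            lengths[i] = encoded[i].length;
            total += encoded[i].length;
        }
        pool = new byte[total];
        for (int i = 0; i < terms.length; i++) {
            System.arraycopy(encoded[i], 0, pool, starts[i], lengths[i]);
        }
    }

    /**
     * Unsafe variant: another thread may overwrite offset/length between the
     * assignments and the comparison, giving wrong answers or an out-of-bounds read.
     */
    int findWithSharedScratch(byte[] key) {
        for (int id = 0; id < starts.length; id++) {
            sharedScratch.bytes = pool;
            sharedScratch.offset = starts[id];
            sharedScratch.length = lengths[id];
            if (sliceEquals(sharedScratch, key)) {
                return id;
            }
        }
        return -1;
    }

    /** Safe variant: the scratch object is local, so no state is shared between callers. */
    int findWithLocalScratch(byte[] key) {
        Slice scratch = new Slice();
        for (int id = 0; id < starts.length; id++) {
            scratch.bytes = pool;
            scratch.offset = starts[id];
            scratch.length = lengths[id];
            if (sliceEquals(scratch, key)) {
                return id;
            }
        }
        return -1;
    }

    /** Byte-wise comparison of the slice against the key. */
    private static boolean sliceEquals(Slice s, byte[] key) {
        if (s.length != key.length) {
            return false;
        }
        for (int i = 0; i < key.length; i++) {
            if (s.bytes[s.offset + i] != key[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Called from a single thread, both variants return the same term id (or -1 for an unknown term). Only the local-scratch variant stays correct when several threads query the same instance at once, assuming the instance itself was safely published and is not modified afterwards.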

[^LUCENE-9791.patch]

> Monitor (aka Luwak) has concurrency issues related to BytesRefHash#find
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-9791
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9791
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/other
>    Affects Versions: master (9.0), 8.7, 8.8
>            Reporter: Paweł Bugalski
>            Priority: Major
>         Attachments: LUCENE-9791.patch
>
>
> _org.apache.lucene.monitor.Monitor_ can sometimes *NOT* match a document that should be matched by one of the registered queries if match operations are run concurrently from multiple threads.
> This is because, in a concurrent environment, _TermFilteredPresearcher_ sometimes fails to select a query that could later match one of the documents being matched.
> Internally, _TermFilteredPresearcher_ uses a term acceptor: an instance of _org.apache.lucene.monitor.QueryIndex.QueryTermFilter_. _QueryTermFilter_ is correctly initialized under lock and its internal state (a map of _org.apache.lucene.util.BytesRefHash_ instances) is correctly published. Later on, when those instances are used concurrently, a problem with _org.apache.lucene.util.BytesRefHash#find_ is triggered, since it is not thread safe.
> _org.apache.lucene.util.BytesRefHash#find_ internally uses a private _org.apache.lucene.util.BytesRefHash#equals_ method, which uses an instance field _scratch1_ as a temporary buffer to compare its _BytesRef_ parameter with the contents of _ByteBlockPool_. This is not thread safe and can cause incorrect answers as well as _ArrayIndexOutOfBoundsException_.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org