Posted to issues@lucene.apache.org by "Dawid Weiss (Jira)" <ji...@apache.org> on 2020/03/18 09:05:00 UTC

[jira] [Created] (LUCENE-9282) Surround query parser's Query instances accumulate clause count on rewrite() causing TooManyBasicQueries and hashCode/equals changes

Dawid Weiss created LUCENE-9282:
-----------------------------------

             Summary: Surround query parser's Query instances accumulate clause count on rewrite() causing TooManyBasicQueries and hashCode/equals changes
                 Key: LUCENE-9282
                 URL: https://issues.apache.org/jira/browse/LUCENE-9282
             Project: Lucene - Core
          Issue Type: Bug
            Reporter: Dawid Weiss
            Assignee: Dawid Weiss


I was surprised to discover that queries produced by the surround query parser (span queries) behave in a non-deterministic way over multiple calls to IndexSearcher.search() methods.

The problem is that the surround query parser (SQP) produces Query objects that reference an internal BasicQueryFactory, which accumulates the primitive clause count on every rewrite and throws TooManyBasicQueries once a given threshold is exceeded. The limit itself is fine, but the shared counter leads to an odd situation in which a loop like this:

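{code:java}
// Shorthand: parse a surround query and obtain the Lucene Query for it.
Query q = QueryParser.parse("...");
// Every search rewrites q, which bumps the shared clause counter.
for (int i = 0; i < 10000; i++) {
  indexSearcher.search(q, 10);
}
{code}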

would execute the query successfully until the threshold is reached and only then throw an exception. Even weirder, the hashCode/equals of q change over time, violating the Query class contract:

https://github.com/apache/lucene-solr/blob/fbd05167f455e3ce2b2ead50336e2b9c2521cd6c/lucene/queryparser/src/java/org/apache/lucene/queryparser/surround/query/RewriteQuery.java#L67

and:

https://github.com/apache/lucene-solr/blob/fbd05167f455e3ce2b2ead50336e2b9c2521cd6c/lucene/queryparser/src/java/org/apache/lucene/queryparser/surround/query/BasicQueryFactory.java#L76-L91

This seems like a bug to me, but I wanted to check whether this behavior is relied on anywhere.

My take on fixing this would be to pass an independent query counter inside rewrite so that the original Query itself remains immutable (including hash code and equals) and rewrite can be called any number of times (always with the same result).
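
A minimal sketch of that idea, using toy stand-in classes rather than the actual Lucene types. ClauseCounter, SketchQuery and TooManyClausesSketch are hypothetical names made up for illustration, and the real patch may look quite different:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Hypothetical stand-in for TooManyBasicQueries. */
class TooManyClausesSketch extends RuntimeException {
  TooManyClausesSketch(int limit) {
    super("Exceeded limit of " + limit + " basic queries");
  }
}

/** Hypothetical counter that lives only for the duration of one rewrite call. */
final class ClauseCounter {
  private final int max;
  private int count;

  ClauseCounter(int max) {
    this.max = max;
  }

  void increment() {
    if (++count > max) {
      throw new TooManyClausesSketch(max);
    }
  }
}

/** Toy query: all fields are final, so hashCode/equals cannot drift over time. */
final class SketchQuery {
  private final List<String> terms;
  private final int maxBasicQueries;

  SketchQuery(List<String> terms, int maxBasicQueries) {
    this.terms = Collections.unmodifiableList(new ArrayList<>(terms));
    this.maxBasicQueries = maxBasicQueries;
  }

  /** Expands into primitive clauses, counting them with a fresh counter each time. */
  List<String> rewrite() {
    ClauseCounter counter = new ClauseCounter(maxBasicQueries);
    List<String> primitives = new ArrayList<>();
    for (String term : terms) {
      counter.increment(); // the limit applies per rewrite, not cumulatively
      primitives.add(term);
    }
    return primitives;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof SketchQuery)) return false;
    SketchQuery other = (SketchQuery) o;
    return maxBasicQueries == other.maxBasicQueries && terms.equals(other.terms);
  }

  @Override
  public int hashCode() {
    return Objects.hash(terms, maxBasicQueries);
  }
}
```

Because the counter is a local variable of rewrite(), calling rewrite() 10000 times on the same SketchQuery behaves exactly like calling it once, and the limit still rejects a single query that expands to too many primitive clauses.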


