Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2010/11/12 14:01:17 UTC

[jira] Created: (LUCENE-2756) MultiSearcher.rewrite() incorrectly rewrites queries

MultiSearcher.rewrite() incorrectly rewrites queries
----------------------------------------------------

                 Key: LUCENE-2756
                 URL: https://issues.apache.org/jira/browse/LUCENE-2756
             Project: Lucene - Java
          Issue Type: Bug
          Components: Search
            Reporter: Robert Muir


This was reported on the user list, in the context of range queries.

It's also easy to make our existing tests fail with my patch on LUCENE-2751:
{noformat}
ant test-core -Dtestcase=TestBoolean2 -Dtestmethod=testRandomQueries -Dtests.seed=7679849347282878725:-903778383189134045
{noformat}

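The fundamental problem is that MultiSearcher first rewrites the query against each individual sub-searcher,
then uses Query.combine(), which simply ORs the rewritten sub-queries together.

This is incorrect for expanded MUST_NOT clauses (e.g. from a wildcard), as it violates De Morgan's law:
a document that one sub-searcher's expansion would exclude can still match through another sub-searcher's rewritten query.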
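
To spell out the De Morgan argument, here is a small worked case in assumed notation (not from the original report): q stands for the required part of the query, and the MUST_NOT wildcard is assumed to expand to a single term t_1 in the first sub-index and a single term t_2 in the second.

```latex
\begin{align*}
\text{intended:}\quad & q \land \lnot (t_1 \lor t_2) \;=\; q \land \lnot t_1 \land \lnot t_2 \\
\text{combined by OR:}\quad & (q \land \lnot t_1) \lor (q \land \lnot t_2)
  \;=\; q \land (\lnot t_1 \lor \lnot t_2)
  \;=\; q \land \lnot (t_1 \land t_2)
\end{align*}
```

Under these assumptions, the OR-combined form excludes a document only if it contains both t_1 and t_2, whereas the intended query excludes it if it contains either one.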


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Resolved: (LUCENE-2756) MultiSearcher.rewrite() incorrectly rewrites queries

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-2756.
----------------------------------------

       Resolution: Fixed
    Fix Version/s: 4.0
                   3.1

MultiSearcher is now deprecated/removed.



[jira] Updated: (LUCENE-2756) MultiSearcher.rewrite() incorrectly rewrites queries

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2756:
--------------------------------

    Attachment: LUCENE-2756_testcase.patch

Attached is a simple test. It adds a single document "foo bar" to one index
and another document "foo baz" to another.

If you run the query "+foo -ba*", the MultiSearcher rewrites it to:
(+field:foo -field:baz) (+field:foo -field:bar)

This causes both documents to match the query, when really neither should.
Instead, the query should be (+field:foo -field:baz -field:bar).

If you run the test with -Dtests.verbose=true, you can see the rewritten form.

The reason this only appeared at a certain document count in the issue on the
user list is that they were using CONSTANT_SCORE_AUTO, and at that
document count it was deciding to use a constant-score boolean rewrite method.
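
The effect described above can be reproduced without Lucene. The following is only a sketch under stated assumptions: documents are modeled as plain term sets, and the class and method names (MultiSearcherRewriteSketch, expandPrefix, perSubThenOr, globalExpansion) are hypothetical and not part of Lucene or of the attached patch. It compares expanding "ba*" per sub-index and ORing the results against expanding it once over all terms.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class MultiSearcherRewriteSketch {
    // Each "document" is modeled as its set of terms (an assumption of this sketch).
    static final Set<String> DOC_A = Set.of("foo", "bar"); // lives in index 1
    static final Set<String> DOC_B = Set.of("foo", "baz"); // lives in index 2

    /** True if the doc contains the required term and none of the prohibited terms. */
    public static boolean matches(Set<String> doc, String must, Set<String> mustNot) {
        return doc.contains(must) && Collections.disjoint(doc, mustNot);
    }

    /** Expands a prefix against the terms that occur in the given documents. */
    public static Set<String> expandPrefix(String prefix, List<Set<String>> docs) {
        Set<String> expanded = new TreeSet<>();
        for (Set<String> doc : docs) {
            for (String term : doc) {
                if (term.startsWith(prefix)) {
                    expanded.add(term);
                }
            }
        }
        return expanded;
    }

    /** Rewrites "+foo -ba*" per sub-index, then ORs the rewritten queries. */
    public static List<Boolean> perSubThenOr() {
        Set<String> notInSub1 = expandPrefix("ba", List.of(DOC_A)); // only "bar"
        Set<String> notInSub2 = expandPrefix("ba", List.of(DOC_B)); // only "baz"
        List<Boolean> hits = new ArrayList<>();
        for (Set<String> doc : List.of(DOC_A, DOC_B)) {
            hits.add(matches(doc, "foo", notInSub1) || matches(doc, "foo", notInSub2));
        }
        return hits;
    }

    /** Expands "ba*" once over all terms, giving a single "+foo -bar -baz" query. */
    public static List<Boolean> globalExpansion() {
        Set<String> notAll = expandPrefix("ba", List.of(DOC_A, DOC_B)); // "bar" and "baz"
        List<Boolean> hits = new ArrayList<>();
        for (Set<String> doc : List.of(DOC_A, DOC_B)) {
            hits.add(matches(doc, "foo", notAll));
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("per-sub then OR:  " + perSubThenOr());
        System.out.println("global expansion: " + globalExpansion());
    }
}
```

In this model the per-sub variant reports both documents as hits, because each document escapes the exclusion that was derived from the other index, while the global expansion reports neither.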

