Posted to dev@lucene.apache.org by "Shay Banon (Created) (JIRA)" <ji...@apache.org> on 2011/11/29 20:21:40 UTC

[jira] [Created] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1
-------------------------------------------------------------------------------------------

                 Key: LUCENE-3609
                 URL: https://issues.apache.org/jira/browse/LUCENE-3609
             Project: Lucene - Java
          Issue Type: Bug
          Components: core/search
    Affects Versions: 3.5
            Reporter: Shay Banon


The change in LUCENE-3446 altered the behavior of BooleanFilter. It used to behave as if "minimum should match" were set to 1 (in BooleanQuery terms), but now, if no should clause matches, the should clauses are ignored. For example, if there is a must clause, only that clause is used and its matches are returned.

For example, a filter with a single must clause and a single should clause, where the should clause matches nothing, should not match anything; instead, it matches whatever the must clause matches.

The fix is simple: after iterating over the should clauses, if the aggregated bitset is null, return null.
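
To make the report concrete, here is a minimal sketch of the two behaviors. It is only a model under stated assumptions: java.util.BitSet stands in for Lucene's bitsets, a null BitSet stands in for a clause that yields no iterator, and the class and method names (ShouldClauseModel, reported, proposed) are hypothetical and not part of Lucene.

```java
import java.util.BitSet;

// Simplified model only: java.util.BitSet replaces Lucene's bitsets, and a
// null BitSet models a should clause whose filter produced no iterator.
final class ShouldClauseModel {

    // Behavior as described in this report: a null should result is skipped,
    // so the must clause alone decides the outcome.
    static BitSet reported(BitSet should, BitSet must, int maxDoc) {
        BitSet res = null;
        if (should != null) {
            res = new BitSet(maxDoc);
            res.or(should);
        }
        if (res == null) {
            // nothing was aggregated, so the must clause seeds the result
            res = (BitSet) must.clone();
        } else {
            res.and(must);
        }
        return res;
    }

    // Same combination with the check proposed above: if the aggregate is
    // still null after the should clause, nothing can match.
    static BitSet proposed(BitSet should, BitSet must, int maxDoc) {
        BitSet res = null;
        if (should != null) {
            res = new BitSet(maxDoc);
            res.or(should);
        }
        if (res == null) {
            return null; // null models "no documents match"
        }
        res.and(must);
        return res;
    }
}
```

With a must clause matching docs 1 and 3 and a should clause that yields null, reported returns docs 1 and 3, while proposed returns null. Note that this model always assumes one should clause is present.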

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Uwe Schindler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3609:
----------------------------------

    Attachment: LUCENE-3609.patch

Here is the fix with short-circuit.
                


[jira] [Commented] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Uwe Schindler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159970#comment-13159970 ] 

Uwe Schindler commented on LUCENE-3609:
---------------------------------------

Committed 3.x revision: 1208375

Now forward-porting
                


[jira] [Updated] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Uwe Schindler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3609:
----------------------------------

    Attachment: LUCENE-3609.patch

This is the easy patch. We still need a test, but it fixes the behaviour change.
                


[jira] [Assigned] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Uwe Schindler (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler reassigned LUCENE-3609:
-------------------------------------

    Assignee: Uwe Schindler

I will look into this.
                


[jira] [Commented] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159480#comment-13159480 ] 

Robert Muir commented on LUCENE-3609:
-------------------------------------

{quote}
For example, a single must clause and should clause, with the should clause not matching anything, should not match anything, but, it will match whatever the must clause matches.
{quote}

Wait, this sounds correct.

If you have a MUST clause and a SHOULD clause, then the SHOULD clause is totally irrelevant (from boolean logic).

                


[jira] [Updated] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Uwe Schindler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3609:
----------------------------------

    Attachment: LUCENE-3609.patch
    


[jira] [Commented] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Shay Banon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159526#comment-13159526 ] 

Shay Banon commented on LUCENE-3609:
------------------------------------

I don't think this is the best fix, since returning null for empty results allows for early exit and less processing (I am not sure why the bool filter does not return null if it matches nothing). Why not just implement the fix I suggested?
                


[jira] [Commented] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Uwe Schindler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159524#comment-13159524 ] 

Uwe Schindler commented on LUCENE-3609:
---------------------------------------

I agree there is something wrong. The filter logic should change to return DocIdSet.EMPTY_DOCIDSET.iterator() in getDISI. The null check can then be removed and the behaviour is correct again.
The problem here only occurs if a filter returns the empty instance or null. If it returns an empty BitSet, it behaves as before.
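
As a rough illustration of this idea (normalize a null result to an empty one so that no null check is needed), the sketch below uses java.util.BitSet as a stand-in for Lucene's bitsets and iterators. The class and method names (EmptyInsteadOfNull, nonNull, combine) are hypothetical and are not the BooleanFilter code.

```java
import java.util.BitSet;
import java.util.List;

// Simplified model only: a null BitSet models a filter that returns null or
// the empty instance; nonNull() plays the role of a getDISI-like helper that
// hands back an empty result instead.
final class EmptyInsteadOfNull {

    static BitSet nonNull(BitSet clauseResult) {
        return clauseResult == null ? new BitSet() : clauseResult;
    }

    static BitSet combine(List<BitSet> shoulds, List<BitSet> musts, int maxDoc) {
        BitSet res = null;
        for (BitSet s : shoulds) {
            // res is always allocated once a should clause is seen,
            // even if that clause matches nothing
            if (res == null) {
                res = new BitSet(maxDoc);
            }
            res.or(nonNull(s));
        }
        for (BitSet m : musts) {
            if (res == null) {
                res = new BitSet(maxDoc);
                res.or(nonNull(m));
            } else {
                res.and(nonNull(m));
            }
        }
        return res == null ? new BitSet(maxDoc) : res;
    }
}
```

In this model a non-matching should clause leaves an allocated but empty res, so the later AND with the must clause yields no matches.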
                


[jira] [Issue Comment Edited] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Shay Banon (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159512#comment-13159512 ] 

Shay Banon edited comment on LUCENE-3609 at 11/29/11 8:22 PM:
--------------------------------------------------------------

What I am saying is that BooleanFilter used to require at least one should clause to match, and that is no longer the case. Here is the previous logic:

{code}
if (shouldFilters != null) {
  for (int i = 0; i < shouldFilters.size(); i++) {
    if (res == null) {
      res = new OpenBitSetDISI(getDISI(shouldFilters, i, reader), reader.maxDoc());
    } else {
      DocIdSet dis = shouldFilters.get(i).getDocIdSet(reader);
      if (dis instanceof OpenBitSet) {
        // optimized case for OpenBitSets
        res.or((OpenBitSet) dis);
      } else {
        res.inPlaceOr(getDISI(shouldFilters, i, reader));
      }
    }
  }
}
{code}
    
Assuming getDISI returns an EMPTY iterator for a filter that does not match (and not null, as that would fail), then for a single should clause the result is a "res" that is all zeroed out (the first branch, res == null). If a must clause is then executed, it is ANDed with the zeroed-out bitset and the result is no matches.

Now, with the change, we have this code:

{code}
for (final FilterClause fc : clauses) {
  if (fc.getOccur() == Occur.SHOULD) {
    final DocIdSetIterator disi = getDISI(fc.getFilter(), reader);
    if (disi == null) continue;
    if (res == null) {
      res = new FixedBitSet(reader.maxDoc());
    }
    res.or(disi);
  }
}
{code}

With a single should clause that does not match anything, res is still null after this loop. When the must clause is then processed, res is initialized by ORing in the result of the must clause, so the docs that match the must clause are returned. You can see this differs from the previous behavior and, actually, from the expected behavior.

[Update]: The fix should be to return null only if res is null and the should clause count is higher than 0, after the check for the should clause count.
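
A minimal sketch of that guard, under the same modeling assumptions as before: java.util.BitSet replaces Lucene's bitsets, a null docs field models a filter with no iterator, and the names (ShouldCountGuard, Clause, docIdSet) are hypothetical. It is not the committed patch.

```java
import java.util.BitSet;
import java.util.List;

// Simplified model only, not the Lucene BooleanFilter implementation.
final class ShouldCountGuard {
    enum Occur { SHOULD, MUST }

    static final class Clause {
        final Occur occur;
        final BitSet docs; // null models "no iterator for this filter"

        Clause(Occur occur, BitSet docs) {
            this.occur = occur;
            this.docs = docs;
        }
    }

    static BitSet docIdSet(List<Clause> clauses, int maxDoc) {
        BitSet res = null;
        boolean hasShouldClauses = false;
        for (Clause c : clauses) {
            if (c.occur != Occur.SHOULD) continue;
            hasShouldClauses = true;
            if (c.docs == null) continue;
            if (res == null) res = new BitSet(maxDoc);
            res.or(c.docs);
        }
        // The guard: should clauses exist but none contributed anything.
        if (hasShouldClauses && res == null) {
            return null; // null models "no documents match"
        }
        for (Clause c : clauses) {
            if (c.occur != Occur.MUST) continue;
            if (c.docs == null) return null;
            if (res == null) {
                res = new BitSet(maxDoc);
                res.or(c.docs);
            } else {
                res.and(c.docs);
            }
        }
        return res;
    }
}
```

The count check matters because a filter with only must clauses also leaves res null after the should loop, and that case must still return the must matches.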
                
                  


[jira] [Commented] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Uwe Schindler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159485#comment-13159485 ] 

Uwe Schindler commented on LUCENE-3609:
---------------------------------------

Shay,

there was no change caused by LUCENE-3446 or LUCENE-3458, the logic is identical before and after. To be sure I will write a test but if you look at the patch it will not change behaviour. The minShouldMatch logic was never implemented in BooleanFilter.
                


[jira] [Resolved] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Uwe Schindler (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler resolved LUCENE-3609.
-----------------------------------

       Resolution: Fixed
    Fix Version/s: 4.0
                   3.6

Committed trunk revision: 1208381
                


[jira] [Updated] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Uwe Schindler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3609:
----------------------------------

    Attachment:     (was: LUCENE-3609.patch)
    


[jira] [Updated] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Uwe Schindler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3609:
----------------------------------

    Attachment: LUCENE-3609.patch

Patch with a test case covering all special cases (all patches are for 3.x)
                



[jira] [Commented] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Shay Banon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159512#comment-13159512 ] 

Shay Banon commented on LUCENE-3609:
------------------------------------

What I am saying is that BooleanFilter used to require that at least one should clause match, and that is no longer the case. Here is the logic as it was before:

{code}
if (shouldFilters != null) {
  for (int i = 0; i < shouldFilters.size(); i++) {
    if (res == null) {
      res = new OpenBitSetDISI(getDISI(shouldFilters, i, reader), reader.maxDoc());
    } else {
      DocIdSet dis = shouldFilters.get(i).getDocIdSet(reader);
      if (dis instanceof OpenBitSet) {
        // optimized case for OpenBitSets
        res.or((OpenBitSet) dis);
      } else {
        res.inPlaceOr(getDISI(shouldFilters, i, reader));
      }
    }
  }
}
{code}
    
Assuming getDISI returns an EMPTY iterator (and not null, as that would fail) for a filter that does not match, then for a single should clause the result is a "res" that is all "zeroed" out (created by the first check on res == null). Then, when it goes on to execute a must clause, it ANDs against the "zeroed" out bitset, and the result is no matches.
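
The pre-3.5 behavior described above can be modeled with a small sketch. It is only an illustration, not the Lucene code: java.util.BitSet stands in for OpenBitSetDISI, the class and method names are hypothetical, and only should and must clauses are modeled.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Simplified model of the pre-3.5 clause handling (hypothetical names, not Lucene API). */
final class LegacyBooleanFilterSketch {

  /** Evaluates SHOULD clauses first, then MUST clauses, mirroring the old loop order. */
  static BitSet evaluate(List<BitSet> shouldClauses, List<BitSet> mustClauses) {
    BitSet res = null;
    for (BitSet clause : shouldClauses) {
      if (res == null) {
        // The first SHOULD clause always allocates res, even if it matches nothing.
        res = (BitSet) clause.clone();
      } else {
        res.or(clause);
      }
    }
    for (BitSet clause : mustClauses) {
      if (res == null) {
        res = (BitSet) clause.clone();
      } else {
        // ANDing into an all-zero bitset leaves it all zero.
        res.and(clause);
      }
    }
    return res;
  }

  public static void main(String[] args) {
    BitSet should = new BitSet(); // SHOULD clause that matches no documents
    BitSet must = new BitSet();
    must.set(1);
    must.set(2);
    BitSet result = evaluate(Arrays.asList(should), Arrays.asList(must));
    // The MUST matches are filtered out by the zeroed SHOULD result.
    System.out.println("no matches: " + result.isEmpty());
  }
}
```

In this model the non-matching should clause still allocates the result, which is what makes the filter look as if "minimum should match" were 1.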

Now, with the change, we have this code:

{code}
for (final FilterClause fc : clauses) {
  if (fc.getOccur() == Occur.SHOULD) {
    final DocIdSetIterator disi = getDISI(fc.getFilter(), reader);
    if (disi == null) continue;
    if (res == null) {
      res = new FixedBitSet(reader.maxDoc());
    }
    res.or(disi);
  }
}
{code}

With a single should clause that does not match anything, res is still null after this loop. Then, when it gets to the must clause, it will OR in the result of the must clause and return the docs that match the must clause. You can see this differs from the previous behavior and, actually, from the expected behavior.
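
A minimal sketch of the 3.5 loop, under stated assumptions, makes the difference visible. java.util.BitSet stands in for FixedBitSet, a null list entry stands in for getDISI(...) returning null, and the names are hypothetical. The must-clause handling is an assumption added to complete the example; it is not quoted from the Lucene source.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Simplified model of the 3.5 clause handling (hypothetical names, not Lucene API). */
final class Lucene35BooleanFilterSketch {

  static BitSet evaluate(List<BitSet> shouldClauses, List<BitSet> mustClauses) {
    BitSet res = null;
    for (BitSet clause : shouldClauses) {
      if (clause == null) continue; // non-matching clause is skipped, res stays null
      if (res == null) {
        res = new BitSet();
      }
      res.or(clause);
    }
    for (BitSet clause : mustClauses) {
      if (clause == null) {
        return new BitSet(); // assumption: a MUST clause without a result means no matches
      }
      if (res == null) {
        // No SHOULD clause contributed anything, so the MUST clause seeds the result.
        res = new BitSet();
        res.or(clause);
      } else {
        res.and(clause);
      }
    }
    return res;
  }

  public static void main(String[] args) {
    BitSet must = new BitSet();
    must.set(1);
    must.set(2);
    BitSet noResult = null; // SHOULD clause whose iterator is null
    BitSet result = evaluate(Arrays.asList(noResult), Arrays.asList(must));
    System.out.println("same as must clause: " + result.equals(must));
  }
}
```

In the same model, a should clause that yields an empty (non-null) bitset still zeroes out the result, so null and empty clause results are treated differently.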
                



[jira] [Commented] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Uwe Schindler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159543#comment-13159543 ] 

Uwe Schindler commented on LUCENE-3609:
---------------------------------------

bq. not sure why the bool filter does not return null if it match nothing

This does not matter; processing is the same. DocIdSet.EMPTY_DOCIDSET has the same effect as null in Lucene's internals (there are checks handling those special values). In my opinion we should disallow null as a return value in filters completely.

The attached patch is the easy fix that behaves exactly as before, but it is indeed less efficient, as it would return an empty FixedBitSet. So a shortcut would be nice.

It can of course still happen that a clause returns an empty BitSet, but then the code would still work correctly (just without the short circuit).
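
The two options mentioned above can be sketched as follows. This is not the attached patch: java.util.BitSet stands in for FixedBitSet, a null list entry stands in for a clause without a result, a null return stands in for "matches nothing", and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Two ways to restore the old behavior in a simplified model (hypothetical names). */
final class ShouldClauseFixSketch {

  /** Easy fix: any SHOULD clause allocates res, so MUST clauses AND against zeros. */
  static BitSet easyFix(List<BitSet> shouldClauses, List<BitSet> mustClauses) {
    BitSet res = null;
    for (BitSet clause : shouldClauses) {
      if (res == null) {
        res = new BitSet(); // allocated even when the clause has no result
      }
      if (clause != null) {
        res.or(clause);
      }
    }
    return andMust(res, mustClauses);
  }

  /** Shortcut: stop early when SHOULD clauses exist but none produced a result. */
  static BitSet shortcut(List<BitSet> shouldClauses, List<BitSet> mustClauses) {
    BitSet res = null;
    for (BitSet clause : shouldClauses) {
      if (clause == null) continue;
      if (res == null) {
        res = new BitSet();
      }
      res.or(clause);
    }
    if (!shouldClauses.isEmpty() && res == null) {
      return null; // MUST clauses are never evaluated
    }
    return andMust(res, mustClauses);
  }

  /** Assumed MUST handling: seed res from the first clause, AND the rest. */
  static BitSet andMust(BitSet res, List<BitSet> mustClauses) {
    for (BitSet clause : mustClauses) {
      if (clause == null) return null;
      if (res == null) {
        res = new BitSet();
        res.or(clause);
      } else {
        res.and(clause);
      }
    }
    return res;
  }

  public static void main(String[] args) {
    BitSet must = new BitSet();
    must.set(1);
    BitSet noResult = null;
    System.out.println(easyFix(Arrays.asList(noResult), Arrays.asList(must)));
    System.out.println(shortcut(Arrays.asList(noResult), Arrays.asList(must)));
  }
}
```

Both variants report no matches for a should clause without a result. The shortcut avoids allocating a bitset in that case, but a clause that returns an empty, non-null bitset still goes through the full AND path.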
                



[jira] [Issue Comment Edited] (LUCENE-3609) BooleanFilter changed behavior in 3.5, no longer acts as if "minimum should match" set to 1

Posted by "Uwe Schindler (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159485#comment-13159485 ] 

Uwe Schindler edited comment on LUCENE-3609 at 11/29/11 8:00 PM:
-----------------------------------------------------------------

Shay,

there was no real change caused by LUCENE-3446 or LUCENE-3458; the logic is identical before and after. To be sure I will write a test, but if you look at the patch, it does not change behaviour. The minShouldMatch logic was never implemented in BooleanFilter.

There was one small "bug" in the filter before: it handled a filter clause returning null differently from a clause returning an empty bitset/DocIdSet.EMPTY_DOCIDSET. So the whole thing was broken before, as it was not consistent. Now it behaves exactly as Robert described.

The apparent minShouldMatch behaviour was caused by the different handling of clauses returning null instead of DocIdSet.EMPTY_DOCIDSET.
                
      was (Author: thetaphi):
    Shay,

there was no change caused by LUCENE-3446 or LUCENE-3458, the logic is identical before and after. To be sure I will write a test but if you look at the patch it will not change behaviour. The minShouldMatch logic was never implemented in BooleanFilter.
                  
