You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Hardy Ferentschik (JIRA)" <ji...@apache.org> on 2010/11/03 11:49:23 UTC

[jira] Created: (LUCENE-2736) Wrong implementation of DocIdSetIterator.advance

Wrong implementation of DocIdSetIterator.advance 
-------------------------------------------------

                 Key: LUCENE-2736
                 URL: https://issues.apache.org/jira/browse/LUCENE-2736
             Project: Lucene - Java
          Issue Type: Bug
          Components: Other
    Affects Versions: 3.0.2, 4.0
            Reporter: Hardy Ferentschik


Implementations of {{DocIdSetIterator}} behave differently when advanced is called. Taking the following test for {{OpenBitSet}}, {{DocIdBitSet}} and {{SortedVIntList}} only {{SortedVIntList}} passes the test:
{code:title=org.apache.lucene.search.TestDocIdSet.java|borderStyle=solid}
...
	public void testAdvanceWithOpenBitSet() throws IOException {
		DocIdSet idSet = new OpenBitSet( new long[] { 1121 }, 1 );  // bits 0, 5, 6, 10
		assertAdvance( idSet );
	}

	public void testAdvanceDocIdBitSet() throws IOException {
		BitSet bitSet = new BitSet();
		bitSet.set( 0 );
		bitSet.set( 5 );
		bitSet.set( 6 );
		bitSet.set( 10 );
		DocIdSet idSet = new DocIdBitSet(bitSet);
		assertAdvance( idSet );
	}

	public void testAdvanceWithSortedVIntList() throws IOException {
		DocIdSet idSet = new SortedVIntList( 0, 5, 6, 10 );
		assertAdvance( idSet );
	}	

	private void assertAdvance(DocIdSet idSet) throws IOException {
		DocIdSetIterator iter = idSet.iterator();
		int docId = iter.nextDoc();
		assertEquals( "First doc id should be 0", 0, docId );

		docId = iter.nextDoc();
		assertEquals( "Second doc id should be 5", 5, docId );

		docId = iter.advance( 5 );
		assertEquals( "Advancing iterator should return the next doc id", 6, docId );
	}
{code}

The javadoc for {{advance}} says:
{quote}
Advances to the first *beyond* the current whose document number is greater than or equal to _target_.
{quote}
This seems to indicate that {{SortedVIntList}} behaves correctly, whereas the other two don't. 
Just looking at the {{DocIdBitSet}} implementation advance is implemented as:
{code}
bitSet.nextSetBit(target);
{code}
where the docs of {{nextSetBit}} say:
{quote}
Returns the index of the first bit that is set to true that occurs *on or after* the specified starting index
{quote}


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2736) Wrong implementation of DocIdSetIterator.advance

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034527#comment-13034527 ] 

Shai Erera commented on LUCENE-2736:
------------------------------------

Thanks Hardy for reporting that.

But I think this works exactly as documented? Note that the javadocs of advance() state "*beyond* the current whose document number is *greater than or equal* to target". Also, there's a note in the javadocs:

{noformat}
   * <b>NOTE:</b> when <code> target &le; current</code> implementations may opt 
   * not to advance beyond their current {@link #docID()}.
{noformat}

I think that the word 'beyond' is confusing here. Perhaps we can modify the javadocs to:

"Advances to the first document whose number is greater than or equal to target"

If there are no objections, or better wording, I'll commit this later today, but only to 3.2/4.0 and not 3.0.2

> Wrong implementation of DocIdSetIterator.advance 
> -------------------------------------------------
>
>                 Key: LUCENE-2736
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2736
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/other
>    Affects Versions: 3.0.2, 4.0
>            Reporter: Hardy Ferentschik
>
> Implementations of {{DocIdSetIterator}} behave differently when advanced is called. Taking the following test for {{OpenBitSet}}, {{DocIdBitSet}} and {{SortedVIntList}} only {{SortedVIntList}} passes the test:
> {code:title=org.apache.lucene.search.TestDocIdSet.java|borderStyle=solid}
> ...
> 	public void testAdvanceWithOpenBitSet() throws IOException {
> 		DocIdSet idSet = new OpenBitSet( new long[] { 1121 }, 1 );  // bits 0, 5, 6, 10
> 		assertAdvance( idSet );
> 	}
> 	public void testAdvanceDocIdBitSet() throws IOException {
> 		BitSet bitSet = new BitSet();
> 		bitSet.set( 0 );
> 		bitSet.set( 5 );
> 		bitSet.set( 6 );
> 		bitSet.set( 10 );
> 		DocIdSet idSet = new DocIdBitSet(bitSet);
> 		assertAdvance( idSet );
> 	}
> 	public void testAdvanceWithSortedVIntList() throws IOException {
> 		DocIdSet idSet = new SortedVIntList( 0, 5, 6, 10 );
> 		assertAdvance( idSet );
> 	}	
> 	private void assertAdvance(DocIdSet idSet) throws IOException {
> 		DocIdSetIterator iter = idSet.iterator();
> 		int docId = iter.nextDoc();
> 		assertEquals( "First doc id should be 0", 0, docId );
> 		docId = iter.nextDoc();
> 		assertEquals( "Second doc id should be 5", 5, docId );
> 		docId = iter.advance( 5 );
> 		assertEquals( "Advancing iterator should return the next doc id", 6, docId );
> 	}
> {code}
> The javadoc for {{advance}} says:
> {quote}
> Advances to the first *beyond* the current whose document number is greater than or equal to _target_.
> {quote}
> This seems to indicate that {{SortedVIntList}} behaves correctly, whereas the other two don't. 
> Just looking at the {{DocIdBitSet}} implementation advance is implemented as:
> {code}
> bitSet.nextSetBit(target);
> {code}
> where the docs of {{nextSetBit}} say:
> {quote}
> Returns the index of the first bit that is set to true that occurs *on or after* the specified starting index
> {quote}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2736) Wrong implementation of DocIdSetIterator.advance

Posted by "Doron Cohen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034618#comment-13034618 ] 

Doron Cohen commented on LUCENE-2736:
-------------------------------------

Shai, with the modified text the NOTE on "implementations freedom to not advance beyond in some situations" becomes strange... I think that the original text stress the fact the "real intended" behavior is to do advance beyond current, just that for performance reasons the decision whether to advance beyond in some situations is left for implementation decision, and so, if caller provides a target which is not greater than current, it should be aware of this possibility. 

So I think it is perhaps better to either not modify this at all, or at most, to add "(see NOTE below)" just after "beyond":

{noformat}
-   * Advances to the first beyond the current whose document number is greater
+   * Advances to the first beyond (see NOTE below) the current whose document number is greater
{noformat}

This would prevent the confusion I think?

> Wrong implementation of DocIdSetIterator.advance 
> -------------------------------------------------
>
>                 Key: LUCENE-2736
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2736
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 3.2, 4.0
>            Reporter: Hardy Ferentschik
>            Assignee: Shai Erera
>         Attachments: LUCENE-2736.patch
>
>
> Implementations of {{DocIdSetIterator}} behave differently when advanced is called. Taking the following test for {{OpenBitSet}}, {{DocIdBitSet}} and {{SortedVIntList}} only {{SortedVIntList}} passes the test:
> {code:title=org.apache.lucene.search.TestDocIdSet.java|borderStyle=solid}
> ...
> 	public void testAdvanceWithOpenBitSet() throws IOException {
> 		DocIdSet idSet = new OpenBitSet( new long[] { 1121 }, 1 );  // bits 0, 5, 6, 10
> 		assertAdvance( idSet );
> 	}
> 	public void testAdvanceDocIdBitSet() throws IOException {
> 		BitSet bitSet = new BitSet();
> 		bitSet.set( 0 );
> 		bitSet.set( 5 );
> 		bitSet.set( 6 );
> 		bitSet.set( 10 );
> 		DocIdSet idSet = new DocIdBitSet(bitSet);
> 		assertAdvance( idSet );
> 	}
> 	public void testAdvanceWithSortedVIntList() throws IOException {
> 		DocIdSet idSet = new SortedVIntList( 0, 5, 6, 10 );
> 		assertAdvance( idSet );
> 	}	
> 	private void assertAdvance(DocIdSet idSet) throws IOException {
> 		DocIdSetIterator iter = idSet.iterator();
> 		int docId = iter.nextDoc();
> 		assertEquals( "First doc id should be 0", 0, docId );
> 		docId = iter.nextDoc();
> 		assertEquals( "Second doc id should be 5", 5, docId );
> 		docId = iter.advance( 5 );
> 		assertEquals( "Advancing iterator should return the next doc id", 6, docId );
> 	}
> {code}
> The javadoc for {{advance}} says:
> {quote}
> Advances to the first *beyond* the current whose document number is greater than or equal to _target_.
> {quote}
> This seems to indicate that {{SortedVIntList}} behaves correctly, whereas the other two don't. 
> Just looking at the {{DocIdBitSet}} implementation advance is implemented as:
> {code}
> bitSet.nextSetBit(target);
> {code}
> where the docs of {{nextSetBit}} say:
> {quote}
> Returns the index of the first bit that is set to true that occurs *on or after* the specified starting index
> {quote}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-2736) Wrong implementation of DocIdSetIterator.advance

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2736:
-------------------------------

    Attachment: LUCENE-2736.patch

Patch with Javadocs fixes. I will commit it later today.

> Wrong implementation of DocIdSetIterator.advance 
> -------------------------------------------------
>
>                 Key: LUCENE-2736
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2736
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 3.2, 4.0
>            Reporter: Hardy Ferentschik
>            Assignee: Shai Erera
>         Attachments: LUCENE-2736.patch
>
>
> Implementations of {{DocIdSetIterator}} behave differently when advanced is called. Taking the following test for {{OpenBitSet}}, {{DocIdBitSet}} and {{SortedVIntList}} only {{SortedVIntList}} passes the test:
> {code:title=org.apache.lucene.search.TestDocIdSet.java|borderStyle=solid}
> ...
> 	public void testAdvanceWithOpenBitSet() throws IOException {
> 		DocIdSet idSet = new OpenBitSet( new long[] { 1121 }, 1 );  // bits 0, 5, 6, 10
> 		assertAdvance( idSet );
> 	}
> 	public void testAdvanceDocIdBitSet() throws IOException {
> 		BitSet bitSet = new BitSet();
> 		bitSet.set( 0 );
> 		bitSet.set( 5 );
> 		bitSet.set( 6 );
> 		bitSet.set( 10 );
> 		DocIdSet idSet = new DocIdBitSet(bitSet);
> 		assertAdvance( idSet );
> 	}
> 	public void testAdvanceWithSortedVIntList() throws IOException {
> 		DocIdSet idSet = new SortedVIntList( 0, 5, 6, 10 );
> 		assertAdvance( idSet );
> 	}	
> 	private void assertAdvance(DocIdSet idSet) throws IOException {
> 		DocIdSetIterator iter = idSet.iterator();
> 		int docId = iter.nextDoc();
> 		assertEquals( "First doc id should be 0", 0, docId );
> 		docId = iter.nextDoc();
> 		assertEquals( "Second doc id should be 5", 5, docId );
> 		docId = iter.advance( 5 );
> 		assertEquals( "Advancing iterator should return the next doc id", 6, docId );
> 	}
> {code}
> The javadoc for {{advance}} says:
> {quote}
> Advances to the first *beyond* the current whose document number is greater than or equal to _target_.
> {quote}
> This seems to indicate that {{SortedVIntList}} behaves correctly, whereas the other two don't. 
> Just looking at the {{DocIdBitSet}} implementation advance is implemented as:
> {code}
> bitSet.nextSetBit(target);
> {code}
> where the docs of {{nextSetBit}} say:
> {quote}
> Returns the index of the first bit that is set to true that occurs *on or after* the specified starting index
> {quote}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-2736) Wrong implementation of DocIdSetIterator.advance

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2736:
-------------------------------

          Component/s:     (was: core/other)
                       core/search
    Affects Version/s:     (was: 3.0.2)
                       3.2
             Assignee: Shai Erera

> Wrong implementation of DocIdSetIterator.advance 
> -------------------------------------------------
>
>                 Key: LUCENE-2736
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2736
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 3.2, 4.0
>            Reporter: Hardy Ferentschik
>            Assignee: Shai Erera
>
> Implementations of {{DocIdSetIterator}} behave differently when advanced is called. Taking the following test for {{OpenBitSet}}, {{DocIdBitSet}} and {{SortedVIntList}} only {{SortedVIntList}} passes the test:
> {code:title=org.apache.lucene.search.TestDocIdSet.java|borderStyle=solid}
> ...
> 	public void testAdvanceWithOpenBitSet() throws IOException {
> 		DocIdSet idSet = new OpenBitSet( new long[] { 1121 }, 1 );  // bits 0, 5, 6, 10
> 		assertAdvance( idSet );
> 	}
> 	public void testAdvanceDocIdBitSet() throws IOException {
> 		BitSet bitSet = new BitSet();
> 		bitSet.set( 0 );
> 		bitSet.set( 5 );
> 		bitSet.set( 6 );
> 		bitSet.set( 10 );
> 		DocIdSet idSet = new DocIdBitSet(bitSet);
> 		assertAdvance( idSet );
> 	}
> 	public void testAdvanceWithSortedVIntList() throws IOException {
> 		DocIdSet idSet = new SortedVIntList( 0, 5, 6, 10 );
> 		assertAdvance( idSet );
> 	}	
> 	private void assertAdvance(DocIdSet idSet) throws IOException {
> 		DocIdSetIterator iter = idSet.iterator();
> 		int docId = iter.nextDoc();
> 		assertEquals( "First doc id should be 0", 0, docId );
> 		docId = iter.nextDoc();
> 		assertEquals( "Second doc id should be 5", 5, docId );
> 		docId = iter.advance( 5 );
> 		assertEquals( "Advancing iterator should return the next doc id", 6, docId );
> 	}
> {code}
> The javadoc for {{advance}} says:
> {quote}
> Advances to the first *beyond* the current whose document number is greater than or equal to _target_.
> {quote}
> This seems to indicate that {{SortedVIntList}} behaves correctly, whereas the other two don't. 
> Just looking at the {{DocIdBitSet}} implementation advance is implemented as:
> {code}
> bitSet.nextSetBit(target);
> {code}
> where the docs of {{nextSetBit}} say:
> {quote}
> Returns the index of the first bit that is set to true that occurs *on or after* the specified starting index
> {quote}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Resolved] (LUCENE-2736) Wrong implementation of DocIdSetIterator.advance

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera resolved LUCENE-2736.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 4.0
                   3.2
    Lucene Fields: [New, Patch Available]  (was: [New])

Thanks Doron - I changed the javadocs as you suggest.

Committed revision 1104159 (3x).
Committed revision 1104167 (trunk).

Thanks Hardy for reporting that !

> Wrong implementation of DocIdSetIterator.advance 
> -------------------------------------------------
>
>                 Key: LUCENE-2736
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2736
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 3.2, 4.0
>            Reporter: Hardy Ferentschik
>            Assignee: Shai Erera
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-2736.patch
>
>
> Implementations of {{DocIdSetIterator}} behave differently when advanced is called. Taking the following test for {{OpenBitSet}}, {{DocIdBitSet}} and {{SortedVIntList}} only {{SortedVIntList}} passes the test:
> {code:title=org.apache.lucene.search.TestDocIdSet.java|borderStyle=solid}
> ...
> 	public void testAdvanceWithOpenBitSet() throws IOException {
> 		DocIdSet idSet = new OpenBitSet( new long[] { 1121 }, 1 );  // bits 0, 5, 6, 10
> 		assertAdvance( idSet );
> 	}
> 	public void testAdvanceDocIdBitSet() throws IOException {
> 		BitSet bitSet = new BitSet();
> 		bitSet.set( 0 );
> 		bitSet.set( 5 );
> 		bitSet.set( 6 );
> 		bitSet.set( 10 );
> 		DocIdSet idSet = new DocIdBitSet(bitSet);
> 		assertAdvance( idSet );
> 	}
> 	public void testAdvanceWithSortedVIntList() throws IOException {
> 		DocIdSet idSet = new SortedVIntList( 0, 5, 6, 10 );
> 		assertAdvance( idSet );
> 	}	
> 	private void assertAdvance(DocIdSet idSet) throws IOException {
> 		DocIdSetIterator iter = idSet.iterator();
> 		int docId = iter.nextDoc();
> 		assertEquals( "First doc id should be 0", 0, docId );
> 		docId = iter.nextDoc();
> 		assertEquals( "Second doc id should be 5", 5, docId );
> 		docId = iter.advance( 5 );
> 		assertEquals( "Advancing iterator should return the next doc id", 6, docId );
> 	}
> {code}
> The javadoc for {{advance}} says:
> {quote}
> Advances to the first *beyond* the current whose document number is greater than or equal to _target_.
> {quote}
> This seems to indicate that {{SortedVIntList}} behaves correctly, whereas the other two don't. 
> Just looking at the {{DocIdBitSet}} implementation advance is implemented as:
> {code}
> bitSet.nextSetBit(target);
> {code}
> where the docs of {{nextSetBit}} say:
> {quote}
> Returns the index of the first bit that is set to true that occurs *on or after* the specified starting index
> {quote}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org