Posted to dev@lucene.apache.org by "Mark Miller (JIRA)" <ji...@apache.org> on 2009/02/19 15:20:02 UTC

[jira] Created: (LUCENE-1542) NearSpansUnordered.getPayload does not always return the correct payloads when terms are located at the same position

NearSpansUnordered.getPayload does not always return the correct payloads when terms are located at the same position
---------------------------------------------------------------------------------------------------------------------

                 Key: LUCENE-1542
                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
             Project: Lucene - Java
          Issue Type: Bug
    Affects Versions: 2.4
            Reporter: Mark Miller
            Priority: Minor


More info in LUCENE-1465

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1542:
---------------------------------------

    Attachment: LUCENE-1542.patch

New patch attached, that merges in Mark's test & fix, and the original
spans test (converted to a unit test) from LUCENE-1465, and adds
a method to IndexWriter to emulate the buggy behavior.

Previously, in LUCENE-1255, this problem was just an "oddity": Lucene
would record position -1 for the first token(s) if those tokens
all have position increment 0.  We started to fix it, realized it
breaks back-compat, and reverted it (accepting the "oddity").

Now, for this issue we are realizing the problem is much worse if a
payload happens to be attached to such tokens: instead of -1, the
position now comes back as Integer.MAX_VALUE (a side effect of how
payloads are stored in the index, which requires that the position
delta be non-zero), which then messes up *SpanQuery and I'm sure other
things.  Subsequent tokens (once posIncr is > 0) then overflow int, and
switch to Integer.MIN_VALUE.

I think this is a real and nasty bug, and we should fix it, despite
back-compat.

So in the patch, I've added the deprecated
IndexWriter.setAllowMinus1Position() to get back to the buggy
behaviour, if for some reason an application needs this, and then
fixed the bug by default.
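
As a rough illustration of the behaviour described above, the following
sketch models how a per-field position counter could turn position
increments into recorded positions, with a flag that restores the old
-1 result. It is a simplified model under stated assumptions, not
Lucene's source: the class name PositionModel, the method assign, and
the parameter allowMinus1Position are hypothetical names chosen for
this example.

```java
import java.util.Arrays;

// Simplified model of per-field position assignment; NOT Lucene's actual source.
// PositionModel, assign and allowMinus1Position are illustrative names only.
public class PositionModel {

    /**
     * Returns the position recorded for each token, given its position increment.
     * Positions are 0-based, so the counter is decremented once per token after
     * the increment has been added.
     */
    static int[] assign(int[] posIncrs, boolean allowMinus1Position) {
        int[] positions = new int[posIncrs.length];
        int position = 0;                        // counter at the start of the field
        for (int i = 0; i < posIncrs.length; i++) {
            position += posIncrs[i];
            // Legacy mode: always decrement, so a leading posIncr of 0 yields -1.
            // Fixed mode: skip the decrement while the counter is still 0.
            if (allowMinus1Position || position > 0) {
                position--;
            }
            positions[i] = position;
            position++;                          // advance past the token just recorded
        }
        return positions;
    }

    public static void main(String[] args) {
        int[] posIncrs = {0, 0, 1};              // two stacked tokens at the field start
        System.out.println(Arrays.toString(assign(posIncrs, true)));   // [-1, -1, 0]
        System.out.println(Arrays.toString(assign(posIncrs, false)));  // [0, 0, 1]
    }
}
```

In this model, ordinary input such as increments of 1, 1, 1 gives
positions 0, 1, 2 in both modes; only fields whose first token(s) have
increment 0 differ.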


> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch, LUCENE-1542.patch, LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Updated: (LUCENE-1542) NearSpansUnordered.getPayload does not always return the correct payloads when terms are located at the same position

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1542:
---------------------------------------

    Fix Version/s: 2.9

> NearSpansUnordered.getPayload does not always return the correct payloads when terms are located at the same position
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>
> More info in LUCENE-1465



[jira] Assigned: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reassigned LUCENE-1542:
------------------------------------------

    Assignee: Michael McCandless

> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch, LUCENE-1542.patch, LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Commented: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714459#action_12714459 ] 

Shai Erera commented on LUCENE-1542:
------------------------------------

I absolutely agree this is a bug that should be fixed. I was just worried about keeping the bug there and forcing the app to reindex. I thought that could be avoided if we fixed it under the covers, whenever merges occur. But I may be wrong.

Maybe when this new "segment-level metadata" comes in, we could write some code which reads a Segment and, based on its version, fixes the positions.

Oh well .. it's just one more case where the app would need to reindex due to a bug fix (the other case I'm aware of is the invalid acronyms). I suppose that's acceptable, since it's a bug fix.

> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch, LUCENE-1542.patch, LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Commented: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714065#action_12714065 ] 

Shai Erera commented on LUCENE-1542:
------------------------------------

Just wanted to say we've had an internal discussion at work about it, when we wanted to utilize the positions to encode integers, and found out that we always need to increment the position returned by 1 (i.e., if you set posIncr to 5, the position you get when iterating on the positions is 4), and we specifically did not understand why when you set the posIncr to 0 for the first position, Lucene writes a -1. (Well, we understood how it happens, but not the rationale for it.)

So whatever you do here, I'm glad this issue was opened.

We figured that the right solution on our side, w/o changing the Lucene code, is to not set the posIncr for the first position, but do so from the 2nd forward. Maybe that's what we need to do in Lucene? I.e. if posIncr is 0 for the first position, we don't decrement by 1?

> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch, LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Updated: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1542:
--------------------------------

    Attachment: LUCENE-1542.patch

something like this to fix

> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Commented: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713999#action_12713999 ] 

Mark Miller commented on LUCENE-1542:
-------------------------------------

So that appears to fix it, but I'm not sure that's the *right* fix. Have to look closer at why we do the -1, and then sometimes do 0 - 1 for a position. Odd.

> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Commented: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714031#action_12714031 ] 

Mark Miller commented on LUCENE-1542:
-------------------------------------

And Spans. If it's included in a span, it will think the span starts/ends at -1/+1 without payloads, it looks like, and with payloads at +/- 2147483647, or something to that effect.

Really, anything that counts on the position of the term is going to be screwed I think.

> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch, LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Resolved: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-1542.
----------------------------------------

    Resolution: Fixed

> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch, LUCENE-1542.patch, LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Commented: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714033#action_12714033 ] 

Mark Miller commented on LUCENE-1542:
-------------------------------------

I don't think the fix here needs to disallow -1, but I think it must put the tokens at the right positions, and that is not -1.

> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch, LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Updated: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1542:
--------------------------------

    Attachment: LUCENE-1542.patch

with unit test

> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch, LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Commented: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714454#action_12714454 ] 

Michael McCandless commented on LUCENE-1542:
--------------------------------------------

bq. My question is - when will those -1 positions be fixed?

I think the app must decide that?  I don't think we should correct it
during merging, since that'd sneakily change your index whenever
merges complete?

We could leave this deprecated "keep the bug" method around until 4.0?
This way you'd have until 4.0 to reindex.

bq. I think this breaks back-compat

Right, my patch breaks back compat, but I think this bug warrants an
exception.

This is a bad bad bug.  Not only does it corrupt your positions
(storing Integer.MAX_VALUE instead of -1, and then storing the next
position as Integer.MIN_VALUE), it also can allow that corruption to
spread as segments are merged (if those other segments didn't have
docs w/ payloads).  And, it causes Span*Query to return the wrong
results in some cases.

I think new users shouldn't have to wait until 4.0 to see this bug
fixed?

I suppose an alternate approach would be to leave the -1 bug in place,
and only fix the case when there are payloads.  It'd be messy.  I
think we'd have to fix SegmentTermPositions to add an "if (firstTime
&& pos==Integer.MAX_VALUE)" to rewire it back to -1.  If we did this
we'd be back to Lucene's "oddity".  It's not great because it's a perf
cost on the search side...
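
For illustration only, the alternate read-side approach in the last
paragraph might look roughly like the sketch below. The class
FirstPositionPatch and its members are hypothetical and are not taken
from SegmentTermPositions; the sketch only shows the shape of the
"firstTime" check and why it adds a branch to every position read.

```java
// Hypothetical sketch of a read-side workaround; names are illustrative and
// do not come from Lucene's SegmentTermPositions.
public class FirstPositionPatch {
    private int position = 0;          // running position within the document
    private boolean firstTime = true;  // true until the first position is read

    /** Adds an already-decoded position delta and returns the resulting position. */
    int nextPosition(int decodedDelta) {
        position += decodedDelta;
        // Extra check on every call: this is the search-side cost mentioned above.
        if (firstTime && position == Integer.MAX_VALUE) {
            position = -1;             // rewire the mangled first position to the legacy -1
        }
        firstTime = false;
        return position;
    }

    public static void main(String[] args) {
        FirstPositionPatch reader = new FirstPositionPatch();
        System.out.println(reader.nextPosition(Integer.MAX_VALUE)); // -1
        System.out.println(reader.nextPosition(1));                 // 0
    }
}
```

A side effect of patching the first position is that later deltas are
added to -1 rather than to Integer.MAX_VALUE, so the following
positions no longer overflow in this model.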


> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch, LUCENE-1542.patch, LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Commented: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714577#action_12714577 ] 

Shai Erera commented on LUCENE-1542:
------------------------------------

I don't think there's a user-visible impact. This happens only if you set posIncr=0 for the very first token. I guess it's not a common thing (if at all), which is why we haven't heard of it?
I was worried though that if I have an index with -1s already encoded, and I don't call setAllowMinus1Position, then some of the positions will be -1 and some 0. And since it is deprecated, and will be removed in 3.0, I'll need to reindex in 3.0.

But I agree that a bug fix should not be carried into the internal Lucene processes. You should reindex.

> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch, LUCENE-1542.patch, LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Commented: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714400#action_12714400 ] 

Shai Erera commented on LUCENE-1542:
------------------------------------

So Mike, just for clarification - let's say I have an index with -1 encoded for positions, already. I then upgrade to 2.9 and realize I should set this on IndexWriter, so in fact more -1 positions will be encoded.

My question is - when will those -1 positions be fixed? I think this breaks back-compat since it changes the indexed data, and should be handled just like any other indexed data/format changes - i.e., last until 4.0. In the meantime, we can make sure that when segments are merged, or the index is optimized, or whatever else we do to support those back-compat issues, we fix those encodings, so that hopefully by 4.0 my indexes don't contain the -1s anymore (if they do, then I'm screwed and can choose between not upgrading to 4.0 and rebuilding them).

If I'm right, then you don't need this deprecated method, and can make the changes under the covers?

If I'm wrong, and our back-compat policy only covers index format changes, then I will already need to rebuild my indexes, so why wait until 3.0? Basically this is one of the cases that were discussed recently on the back-compat policy thread - a change to indexed data. One that we did not agree on (I vaguely remember we said it should be handled like index format changes, but I may be wrong).

> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch, LUCENE-1542.patch, LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Commented: (LUCENE-1542) NearSpansUnordered.getPayload does not always return the correct payloads when terms are located at the same position

Posted by "Jonathan Mamou (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706365#action_12706365 ] 

Jonathan Mamou commented on LUCENE-1542:
----------------------------------------

I think that the bug is not related to payloads or to the fact that terms are located at the same position.
It seems to occur only for the first term of the document, if its positionIncrement is equal to 0. In this case, the position of the first term will be wrong: -1 if there is no payload, and 2147483647 if there is a payload.

> NearSpansUnordered.getPayload does not always return the correct payloads when terms are located at the same position
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>
> More info in LUCENE-1465



[jira] Commented: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714684#action_12714684 ] 

Michael McCandless commented on LUCENE-1542:
--------------------------------------------

OK I plan to commit the current patch shortly.  I'll add an entry under "Changes in runtime behavior" explaining the change...

> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch, LUCENE-1542.patch, LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Commented: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714020#action_12714020 ] 

Michael McCandless commented on LUCENE-1542:
--------------------------------------------

Alas, this looks like a dup of LUCENE-1255, where we at first did a fix (like this one) but then decided it was not back-compatible and so reverted it.

However, if that first token (with posIncr=0) also has a payload, it appears to be particularly disastrous, since the way we encode a payload (by left-shifting the position delta by 1 bit) does not preserve the -1, right?
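
To make the arithmetic concrete, here is a small self-contained model
of a position delta that shares its integer with a "has payload" flag
in the low bit. It assumes the reader strips the flag with an unsigned
right shift, as it would for a value read back as an unsigned
variable-length int; that detail, and the names PayloadDeltaModel,
encode and decodeDelta, are assumptions for this sketch rather than
Lucene's actual code.

```java
// Illustrative model of a position delta sharing its int with a payload flag.
// Not Lucene's source; class and method names are assumptions for this sketch.
public class PayloadDeltaModel {

    /** Writer side: shift the delta left by one bit, use bit 0 as the payload flag. */
    static int encode(int delta, boolean hasPayload) {
        return (delta << 1) | (hasPayload ? 1 : 0);
    }

    /** Reader side: drop the flag bit with an unsigned shift (assumed). */
    static int decodeDelta(int code) {
        return code >>> 1;
    }

    public static void main(String[] args) {
        // First token at position -1: delta from the initial 0 is -1.
        int first = decodeDelta(encode(-1, true));
        System.out.println(first == Integer.MAX_VALUE);  // true: the sign bit was lost

        // Next token was written with delta 1 (from -1 to 0).
        int next = first + decodeDelta(encode(1, true));
        System.out.println(next == Integer.MIN_VALUE);   // true: int overflow
    }
}
```

Non-negative deltas survive the round trip unchanged in this model,
which is why only the leading -1 is affected.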

> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Commented: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714481#action_12714481 ] 

Yonik Seeley commented on LUCENE-1542:
--------------------------------------

Most bug fixes aren't back compatible :-)
What's the real user-visible impact of this fix to someone who hasn't re-indexed?


> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1542.patch, LUCENE-1542.patch, LUCENE-1542.patch
>
>
> More info in LUCENE-1465



[jira] Updated: (LUCENE-1542) Lucene can incorrectly set the position of tokens that start a field with positonInc 0.

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1542:
--------------------------------

    Summary: Lucene can incorrectly set the position of tokens that start a field with positonInc 0.  (was: NearSpansUnordered.getPayload does not always return the correct payloads when terms are located at the same position)

> Lucene can incorrectly set the position of tokens that start a field with positonInc 0.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1542
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>
> More info in LUCENE-1465
