Posted to dev@lucene.apache.org by "Mark Miller (JIRA)" <ji...@apache.org> on 2008/11/21 22:45:44 UTC
[jira] Created: (LUCENE-1465) NearSpansOrdered.getPayload does not
return the payload from the minimum match span
NearSpansOrdered.getPayload does not return the payload from the minimum match span
-----------------------------------------------------------------------------------
Key: LUCENE-1465
URL: https://issues.apache.org/jira/browse/LUCENE-1465
Project: Lucene - Java
Issue Type: Bug
Components: Search
Affects Versions: 2.4
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
Attachments: LUCENE-1465.patch
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
[jira] Updated: (LUCENE-1465) NearSpansOrdered.getPayload does not
return the payload from the minimum match span
Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Miller updated LUCENE-1465:
--------------------------------
Attachment: LUCENE-1465.patch
That still wasn't quite right, so here is a third test and a third fix. I am pretty sure this solves it, but my earlier concerns still stand.
[jira] Commented: (LUCENE-1465) NearSpansOrdered.getPayload does
not return the payload from the minimum match span
Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658531#action_12658531 ]
Mark Miller commented on LUCENE-1465:
-------------------------------------
Hmmm... I think that's true, but that's for finding 'a hit' on a document, not for finding every possible sequence of spans that could cause a hit. Spans work by finding a minimum match, not by greedily finding every match (which is a different algorithm).
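To make the distinction concrete, here is a minimal sketch for two terms, A followed by B. It is not Lucene's NearSpansOrdered code; the class and method names are made up for illustration, positions are assumed sorted and non-negative, and slop is assumed to mean the number of positions strictly between the two terms.

```java
import java.util.ArrayList;
import java.util.List;

class MinimalMatchSketch {
    /**
     * Minimal ordered matches of term A followed by term B.
     * Returns {start, endExclusive} pairs; a match is minimal when it
     * contains no smaller match.
     */
    static List<int[]> minimalMatches(int[] a, int[] b, int slop) {
        List<int[]> out = new ArrayList<>();
        int i = 0;
        for (int bPos : b) {
            int lastA = -1;
            // Consume every A before this B, remembering only the closest one.
            while (i < a.length && a[i] < bPos) {
                lastA = a[i++];
            }
            // bPos - lastA - 1 is the number of positions between A and B.
            if (lastA >= 0 && bPos - lastA - 1 <= slop) {
                out.add(new int[] {lastA, bPos + 1});
            }
        }
        return out;
    }

    /** Greedy alternative: every (A, B) pair with A before B and within slop. */
    static List<int[]> allMatches(int[] a, int[] b, int slop) {
        List<int[]> out = new ArrayList<>();
        for (int aPos : a) {
            for (int bPos : b) {
                if (aPos < bPos && bPos - aPos - 1 <= slop) {
                    out.add(new int[] {aPos, bPos + 1});
                }
            }
        }
        return out;
    }
}
```

With A at positions 0 and 3, B at positions 5 and 7, and a large slop, the minimal variant reports only the span from 3 to 5, while the greedy variant reports all four pairings.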
[jira] Updated: (LUCENE-1465) NearSpansOrdered.getPayload does not
return the payload from the minimum match span
Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Miller updated LUCENE-1465:
--------------------------------
Attachment: LUCENE-1465.patch
Bah. It's even worse than that. Even after you get down to a minimum match, it might not meet the slop requirement! You have to load the payloads and then dump them if the slop is not met.
I don't like all this extra payload loading. Come to think of it, even if you never call getPayload, you're still paying for it! I don't have a way around it, but I don't like it. In this case you pay not only for loading the payloads of real matches, but also for loading the payloads of a bunch of candidate matches that turn out not to match!
Over a large index with lots of hits, that's a lot of payloads to load...
I haven't thought about any of it at a high level, but I think this has to be addressed somehow... maybe you have to turn on payload collecting first, or else it doesn't happen? We need something...
But until then, I think this still has to be fixed, and we are loading them one way or another now, so we might as well add a few more "possible" wrong loads (this last patch added a couple as well) to make the behavior correct. It's somewhat useless otherwise :)
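The cost described above, and the opt-in idea floated as a possible way out, can be sketched roughly as follows. This is an illustration only, not the patch and not an existing Lucene API: the class, the collectPayloads flag, and the accept method are hypothetical, and the slop check is simplified to the span width minus the number of terms.

```java
import java.util.ArrayList;
import java.util.List;

class PayloadCollectSketch {
    private final boolean collectPayloads; // hypothetical opt-in switch
    private final List<byte[]> current = new ArrayList<>();
    int loads = 0; // how many payloads were (notionally) read from the index

    PayloadCollectSketch(boolean collectPayloads) {
        this.collectPayloads = collectPayloads;
    }

    /**
     * Evaluates one candidate span [start, end) covering termCount terms.
     * Returns true when the candidate is within slop.
     */
    boolean accept(int start, int end, int termCount, int slop, byte[][] payloads) {
        current.clear();
        if (collectPayloads) {
            // Payloads must be read before the slop check can reject the candidate.
            for (byte[] p : payloads) {
                current.add(p);
                loads++;
            }
        }
        boolean withinSlop = (end - start) - termCount <= slop;
        if (!withinSlop) {
            current.clear(); // loaded for nothing: the candidate was not a match
        }
        return withinSlop;
    }

    /** Payloads of the most recently accepted match; empty when collection is off. */
    List<byte[]> getPayload() {
        return current;
    }
}
```

With the flag on, a rejected candidate still increments loads, which is the wasted work complained about above. With the flag off, nothing is loaded, and getPayload has nothing to return.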
[jira] Commented: (LUCENE-1465) NearSpansOrdered.getPayload does
not return the payload from the minimum match span
Posted by "Jonathan Mamou (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658530#action_12658530 ]
Jonathan Mamou commented on LUCENE-1465:
----------------------------------------
Mark, I would expect to get 0,0,3,6,7,7 and not only 6,7,7.
As you wrote, "a SpanAndQuery could easily be a SpanNearQuery if a huge distance was allowed." at http://www.gossamer-threads.com/lists/lucene/java-user/51983
[jira] Updated: (LUCENE-1465) NearSpansOrdered.getPayload does not
return the payload from the minimum match span
Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Miller updated LUCENE-1465:
--------------------------------
Attachment: LUCENE-1465.patch
Fix + test
[jira] Updated: (LUCENE-1465) NearSpansOrdered.getPayload does not
return the payload from the minimum match span
Posted by "Jonathan Mamou (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Mamou updated LUCENE-1465:
-----------------------------------
Attachment: Test.java
Hi
It seems that the fix does not cover the case where two terms are indexed at the same position.
I attach a sample program illustrating the issue. In it, each pair of terms is indexed at the same position.
Best regards,
Jonathan
[jira] Reopened: (LUCENE-1465) NearSpansOrdered.getPayload does not
return the payload from the minimum match span
Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless reopened LUCENE-1465:
----------------------------------------
Lucene Fields: [New, Patch Available] (was: [Patch Available, New])
Let's backport the fix to the 2.4 branch (for the eventual 2.4.1).
[jira] Updated: (LUCENE-1465) NearSpansOrdered.getPayload does not
return the payload from the minimum match span
Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-1465:
---------------------------------------
Lucene Fields: [New, Patch Available] (was: [Patch Available, New])
Fix Version/s: 2.4.1
[jira] Updated: (LUCENE-1465) NearSpansOrdered.getPayload does not
return the payload from the minimum match span
Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Miller updated LUCENE-1465:
--------------------------------
Fix Version/s: 2.9
[jira] Commented: (LUCENE-1465) NearSpansOrdered.getPayload does
not return the payload from the minimum match span
Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650290#action_12650290 ]
Mark Miller commented on LUCENE-1465:
-------------------------------------
I plan on committing this soon. This is a real deal breaker if you are trying to use the new getPayload API with ordered near spans.
The attached patch has Java 1.5 code in the test, which I'll remove.
[jira] Resolved: (LUCENE-1465) NearSpansOrdered.getPayload does not
return the payload from the minimum match span
Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Miller resolved LUCENE-1465.
---------------------------------
Resolution: Fixed
Lucene Fields: [New, Patch Available] (was: [Patch Available, New])
This has been backported to 2.4 and is resolved. The unresolved dangling issue is a separate issue involving a different class, and is being tracked with LUCENE-1542.
[jira] Commented: (LUCENE-1465) NearSpansOrdered.getPayload does
not return the payload from the minimum match span
Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653292#action_12653292 ]
Michael McCandless commented on LUCENE-1465:
--------------------------------------------
> What's involved in a backport - just commit it to the 2.4 branch and that's all?
Yup. "svn merge" works well as long as the code hasn't diverged much, e.g. running this in a 2.4 branch checkout:
{code}
svn merge -r(N-1):N https://svn.apache.org/repos/asf/lucene/java/trunk
{code}
where N was the revision committed to trunk.
[jira] Commented: (LUCENE-1465) NearSpansOrdered.getPayload does
not return the payload from the minimum match span
Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657662#action_12657662 ]
Mark Miller commented on LUCENE-1465:
-------------------------------------
This is an odd one, Jonathan. It's actually for the unordered case (the others were for the ordered). I am not exactly clear on what's going on yet.
When I look at the payloads coming back, it would seem we are getting 0,7,7 when we should get 6,7,7. When I look at the offsets for the spans that I get the payloads from, though, they appear correct. It's returning the payloads from the right offsets, it seems, but somehow one of those payloads is from the term at position 0? Very odd. So when I debug in, it does indeed look like the first match happens at index 6... but the term offsets are start: 2147483647, end: -2147483648. What the heck? This is going to take some more time...
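The two offsets in that debug output are exactly Java's Integer.MAX_VALUE and Integer.MIN_VALUE. One possible reading, which this thread does not confirm, is that they are the untouched initial values of a running minimum and maximum. The sketch below shows that pattern with a hypothetical class; it is not taken from the Lucene source.

```java
class SpanBoundsSketch {
    // Sentinels: any real position replaces them on the first update.
    int start = Integer.MAX_VALUE; // smallest start seen so far
    int end = Integer.MIN_VALUE;   // largest end seen so far

    /** Widens the tracked bounds to include one sub-span. */
    void add(int subStart, int subEnd) {
        start = Math.min(start, subStart);
        end = Math.max(end, subEnd);
    }
}
```

Reading start and end before any call to add gives 2147483647 and -2147483648, the same pair seen in the debugger.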
[jira] Commented: (LUCENE-1465) NearSpansOrdered.getPayload does
not return the payload from the minimum match span
Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653277#action_12653277 ]
Mark Miller commented on LUCENE-1465:
-------------------------------------
What's involved in a backport - just commit it to the 2.4 branch and that's all?
Looks like I have to look into terms indexed at the same position first - I'll try to get to that soon.
- Mark
[jira] Resolved: (LUCENE-1465) NearSpansOrdered.getPayload does not
return the payload from the minimum match span
Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Miller resolved LUCENE-1465.
---------------------------------
Resolution: Fixed
Lucene Fields: [New, Patch Available] (was: [New])
Thanks Jonathan and Greg!
[jira] Updated: (LUCENE-1465) NearSpansOrdered.getPayload does not
return the payload from the minimum match span
Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Miller updated LUCENE-1465:
--------------------------------
Attachment: LUCENE-1465.patch