Posted to dev@lucene.apache.org by "Paul Elschot (JIRA)" <ji...@apache.org> on 2016/08/02 20:17:20 UTC

[jira] [Comment Edited] (LUCENE-7398) Nested Span Queries are buggy

    [ https://issues.apache.org/jira/browse/LUCENE-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404699#comment-15404699 ] 

Paul Elschot edited comment on LUCENE-7398 at 8/2/16 8:17 PM:
--------------------------------------------------------------

The patch changes the ordering in SpanPositionQueue, which is used by SpanOr.
This makes the test case pass, and it does not increase complexity.

I think the problem is in NearSpansOrdered.stretchToOrder() which only does this:
{code}matchEnd = subSpans[subSpans.length - 1].endPosition();{code}
What it should also do is look ahead to see whether there is an ordered match with a smaller slop.
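
To illustrate the difference, here is a minimal, self-contained sketch in plain Java. It is not Lucene's code: the class OrderedNearSketch, the Span type, and the methods greedyMatch and lookaheadMatch are all hypothetical, and positions are counted relative to "coordinate" in the phrase "coordinate gene mapping research" from the issue description. It assumes the inner spanOr offers "gene" as [1,2) and "gene mapping" as [1,3). A greedy check that commits to the first acceptable candidate per clause depends on the order in which the alternatives arrive, while a check that looks ahead over all candidates does not.

```java
import java.util.Arrays;
import java.util.List;

public class OrderedNearSketch {
    /** A candidate match covering positions [start, end). Hypothetical type. */
    static final class Span {
        final int start;
        final int end;
        Span(int start, int end) { this.start = start; this.end = end; }
    }

    /**
     * Greedy check: for each clause, commit to the first candidate that starts
     * at or after the previous clause's end, then compare the summed gaps
     * against the allowed slop. No backtracking.
     */
    static boolean greedyMatch(List<List<Span>> clauses, int maxSlop) {
        boolean first = true;
        int prevEnd = 0;
        int gaps = 0;
        for (List<Span> candidates : clauses) {
            Span chosen = null;
            for (Span s : candidates) {
                if (first || s.start >= prevEnd) { chosen = s; break; }
            }
            if (chosen == null) return false;
            if (!first) gaps += chosen.start - prevEnd;
            prevEnd = chosen.end;
            first = false;
        }
        return gaps <= maxSlop;
    }

    /** Look-ahead check: try every ordered combination that fits the slop. */
    static boolean lookaheadMatch(List<List<Span>> clauses, int idx, int prevEnd, int slopLeft) {
        if (idx == clauses.size()) return true;
        for (Span s : clauses.get(idx)) {
            int gap = (idx == 0) ? 0 : s.start - prevEnd;
            if (gap < 0 || gap > slopLeft) continue; // overlapping or too far apart
            if (lookaheadMatch(clauses, idx + 1, s.end, slopLeft - gap)) return true;
        }
        return false;
    }

    /** Runs the three scenarios and returns their results in order. */
    static boolean[] run() {
        List<Span> coordinate = Arrays.asList(new Span(0, 1));
        List<Span> research = Arrays.asList(new Span(3, 4));
        // Assumed alternatives of the inner spanOr, in two different orders.
        List<Span> shortFirst = Arrays.asList(new Span(1, 2), new Span(1, 3));
        List<Span> longFirst = Arrays.asList(new Span(1, 3), new Span(1, 2));
        return new boolean[] {
            greedyMatch(Arrays.asList(coordinate, shortFirst, research), 0),
            greedyMatch(Arrays.asList(coordinate, longFirst, research), 0),
            lookaheadMatch(Arrays.asList(coordinate, shortFirst, research), 0, 0, 0)
        };
    }

    public static void main(String[] args) {
        for (boolean result : run()) {
            System.out.println(result); // prints false, true, true
        }
    }
}
```

In this toy model, reordering the alternatives is enough to make the greedy check succeed, which is similar in spirit to changing the queue order. The look-ahead variant explores more combinations, which is where the extra runtime would come from.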

There may still be a failing case with a nested SpanOr, possibly containing another nested SpanNear, but I'm not sure; this is tricky.

Since looking ahead increases the complexity (normal-case runtime), I'd prefer to have the patch applied now and see what the future brings.





> Nested Span Queries are buggy
> -----------------------------
>
>                 Key: LUCENE-7398
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7398
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 5.5, 6.x
>            Reporter: Christoph Goller
>            Assignee: Alan Woodward
>            Priority: Critical
>         Attachments: LUCENE-7398.patch, TestSpanCollection.java
>
>
> Example for a nested SpanQuery that is not working:
> Document: Human Genome Organization , HUGO , is trying to coordinate gene mapping research worldwide.
> Query: spanNear([body:coordinate, spanOr([spanNear([body:gene, body:mapping], 0, true), body:gene]), body:research], 0, true)
> The query should match "coordinate gene mapping research" as well as "coordinate gene research". It does not match "coordinate gene mapping research" with Lucene 5.5 or 6.1, although it did match with Lucene 4.10.4. It probably stopped working with the changes to SpanQueries in 5.3. I will attach a unit test that shows the problem.


