Posted to dev@lucene.apache.org by "Michael Gibney (JIRA)" <ji...@apache.org> on 2018/10/24 14:58:00 UTC
[jira] [Comment Edited] (LUCENE-8544) In SpanNearQuery, add support
for inOrder semantics equivalent to that of (Multi)PhraseQuery
[ https://issues.apache.org/jira/browse/LUCENE-8544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662390#comment-16662390 ]
Michael Gibney edited comment on LUCENE-8544 at 10/24/18 2:57 PM:
------------------------------------------------------------------
I can't immediately speak to the possibility of adding this functionality to the existing implementation of {{SpanNearQuery}}, but building on an outstanding patch for [LUCENE-7398|https://issues.apache.org/jira/browse/LUCENE-7398?focusedCommentId=16630529#comment-16630529], I think it might actually be pretty straightforward.
The above-referenced patch makes {{NearSpansOrdered}} aware of indexed {{PositionLengthAttribute}}. In a positionLength-aware context, it wasn't clear to me how to port the {{NearSpansOrdered}} changes to {{NearSpansUnordered}}; there were a number of ways to interpret the task, all of which looked complicated, messy, or close to impossible to implement in a performant way (and, at a higher level, they all seemed semantically odd).
But a positionLength-aware implementation of {{(Multi)PhraseQuery}} semantics in the context of the above-referenced patch should be much simpler: given that you have a fixed clause ordering, it just requires supporting negative offsets in the calculation of slop/edit distance.
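To make "negative offsets" concrete, here is a rough sketch of the two ordered slop calculations. It is a toy model, not Lucene's code: it assumes single-term clauses with no repeated terms, exactly one candidate document position per clause, and query positions 0, 1, 2, and so on. The class and method names ({{SlopSketch}}, {{phraseSlop}}, {{orderedSpanSlop}}) are made up for illustration.

```java
import java.util.OptionalInt;

// Toy model only: one document position per clause, clause i has query position i.
class SlopSketch {

    // PhraseQuery-style slop: shift each document position by its query position
    // and measure how far apart the shifted values are. Nothing here requires the
    // document positions to be increasing, so a clause may sit before its predecessor.
    static int phraseSlop(int[] docPos) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < docPos.length; i++) {
            int shifted = docPos[i] - i;
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        return max - min;
    }

    // Ordered-span-style slop: each clause must start at or after the end of the
    // previous one (a term at position p ends at p + 1). Slop is the sum of the gaps;
    // a negative gap means "no match" rather than a cost.
    static OptionalInt orderedSpanSlop(int[] docPos) {
        int gaps = 0;
        for (int i = 1; i < docPos.length; i++) {
            int gap = docPos[i] - (docPos[i - 1] + 1);
            if (gap < 0) {
                return OptionalInt.empty();
            }
            gaps += gap;
        }
        return OptionalInt.of(gaps);
    }

    public static void main(String[] args) {
        int[] inOrder = {3, 5}; // query "a b", document has "a x b"
        int[] swapped = {4, 3}; // query "a b", document has "b a"
        System.out.println(phraseSlop(inOrder) + " " + orderedSpanSlop(inOrder));
        System.out.println(phraseSlop(swapped) + " " + orderedSpanSlop(swapped));
    }
}
```

In this model the two calculations agree whenever the clauses appear in query order; they differ only in that the phrase-style calculation assigns a finite cost to the swapped case where the ordered-span one rejects it outright.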
> In SpanNearQuery, add support for inOrder semantics equivalent to that of (Multi)PhraseQuery
> --------------------------------------------------------------------------------------------
>
> Key: LUCENE-8544
> URL: https://issues.apache.org/jira/browse/LUCENE-8544
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Reporter: Michael Gibney
> Priority: Minor
>
> As discussed in LUCENE-8531, the semantics of phrase search differ among {{(Multi)PhraseQuery}}, {{SpanNearQuery (inOrder=true)}}, and {{SpanNearQuery (inOrder=false)}}:
> * {{(Multi)PhraseQuery}}: incorporates the concept of order, and allows negative offsets in calculating slop/edit distance
> * {{SpanNearQuery (inOrder=true)}}: incorporates the concept of order, and does _not_ allow negative offsets in calculating slop/edit distance
> * {{SpanNearQuery (inOrder=false)}}: does not incorporate the concept of order at all
> This issue concerns the possibility of making {{SpanNearQuery}} configurable to support semantics equivalent to those of {{(Multi)PhraseQuery}}.