Posted to issues@lucene.apache.org by "Alan Woodward (Jira)" <ji...@apache.org> on 2019/12/18 10:22:00 UTC
[jira] [Created] (LUCENE-9099) Correctly handle repeats in ordered and unordered intervals
Alan Woodward created LUCENE-9099:
-------------------------------------
Summary: Correctly handle repeats in ordered and unordered intervals
Key: LUCENE-9099
URL: https://issues.apache.org/jira/browse/LUCENE-9099
Project: Lucene - Core
Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
If you have repeating intervals in an ordered or unordered interval source, you currently get somewhat confusing behaviour:
* ORDERED(a, a, b) will return an extra interval over just `a b` if it first matches `a a b`, so it can produce incorrect results when used in a CONTAINING filter: CONTAINING(ORDERED(x, y), ORDERED(a, a, b)) will match the document `a x a b y`.
* UNORDERED(a, a) will match documents that contain only a single `a`.
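To make the expected behaviour for the two cases above concrete, here is a minimal brute-force sketch. It is a toy reference model over token positions, not Lucene's interval iterators; the class and method names (`RepeatIntervalsSketch`, `ordered`, `containing`, `unorderedMatches`) are hypothetical, and it only handles plain terms, not nested sources.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy reference model, NOT Lucene code: brute-force matching over token positions.
class RepeatIntervalsSketch {

    // Every [start, end] span where `terms` (assumed non-empty) occur in order
    // at strictly increasing positions, so a repeated term needs two positions.
    static List<int[]> ordered(List<String> doc, List<String> terms) {
        List<int[]> out = new ArrayList<>();
        collect(doc, terms, 0, 0, -1, out);
        return out;
    }

    private static void collect(List<String> doc, List<String> terms, int termIdx,
                                int from, int start, List<int[]> out) {
        for (int pos = from; pos < doc.size(); pos++) {
            if (!doc.get(pos).equals(terms.get(termIdx))) continue;
            int s = (termIdx == 0) ? pos : start;
            if (termIdx == terms.size() - 1) {
                out.add(new int[] {s, pos});
            } else {
                collect(doc, terms, termIdx + 1, pos + 1, s, out);
            }
        }
    }

    // True if some interval in `big` fully encloses some interval in `small`.
    static boolean containing(List<int[]> big, List<int[]> small) {
        for (int[] b : big)
            for (int[] s : small)
                if (b[0] <= s[0] && s[1] <= b[1]) return true;
        return false;
    }

    // Multiplicity-aware unordered check: each repeated term consumes its own position.
    static boolean unorderedMatches(List<String> doc, List<String> terms) {
        List<String> remaining = new ArrayList<>(doc);
        for (String t : terms) {
            if (!remaining.remove(t)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("a", "x", "a", "b", "y");
        List<int[]> xy = ordered(doc, Arrays.asList("x", "y"));       // [1, 4]
        List<int[]> aab = ordered(doc, Arrays.asList("a", "a", "b")); // [0, 3] only
        List<int[]> ab = ordered(doc, Arrays.asList("a", "b"));       // [0, 3] and [2, 3]

        System.out.println(containing(xy, aab)); // false: [0, 3] is not inside [1, 4]
        System.out.println(containing(xy, ab));  // true: a stray `a b` span [2, 3] is inside [1, 4]
        System.out.println(unorderedMatches(Arrays.asList("a"), Arrays.asList("a", "a"))); // false
    }
}
```

In this model the only interval for ORDERED(a, a, b) on `a x a b y` is [0, 3], so the CONTAINING query should not match; it is the extra `a b` interval at [2, 3] that falls inside ORDERED(x, y) and produces the bad hit.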
It is possible to deal with the unordered case when building sources by rewriting duplicates to nested ORDERED clauses, so that UNORDERED(a, b, c, a, b) becomes UNORDERED(ORDERED(a, a), ORDERED(b, b), c), but this then breaks MAXGAPS filtering.
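The build-time rewrite described above could be sketched roughly as follows. This is an illustration only: `UnorderedRewriteSketch` and `rewriteUnordered` are hypothetical names, and the method renders the rewritten query as a string rather than building real interval sources.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: shows the shape of the rewrite as text.
class UnorderedRewriteSketch {

    static String rewriteUnordered(List<String> terms) {
        // Count occurrences, keeping the first-seen order of the distinct terms.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String t : terms) {
            counts.merge(t, 1, Integer::sum);
        }

        List<String> clauses = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() == 1) {
                // Unique terms stay as they are.
                clauses.add(e.getKey());
            } else {
                // A term repeated n times becomes ORDERED(t, t, ..., t) with n copies.
                List<String> copies = Collections.nCopies(e.getValue(), e.getKey());
                clauses.add("ORDERED(" + String.join(", ", copies) + ")");
            }
        }
        return "UNORDERED(" + String.join(", ", clauses) + ")";
    }

    public static void main(String[] args) {
        // prints UNORDERED(ORDERED(a, a), ORDERED(b, b), c)
        System.out.println(rewriteUnordered(List.of("a", "b", "c", "a", "b")));
    }
}
```

The nested ORDERED clauses in the output are the part that, per the description above, interferes with MAXGAPS filtering.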
We should try to fix this within the intervals themselves.