Posted to dev@lucene.apache.org by Dawid Weiss <da...@gmail.com> on 2020/09/17 13:20:02 UTC
Fuzzy-phrase query with "holes" using intervals?
Hmm... Is there any way to express a query for a phrase-like sequence of tokens:
a b c d
but with potential "holes" (one or more terms missing):
- b c d
a - c d
a b - d
...
I've experimented with ordered(term("a"), term("b"), ...), gaps and
atLeast but I can't get it to work. I could expand the terms into several
queries manually, but the number of potential subsets is quite large,
hence the question. Thanks for any tips.
Dawid
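
To give a feel for the manual expansion mentioned above, here is a minimal plain-Java sketch that enumerates every pattern with at most a given number of holes. It does not use Lucene; the class and method names (HoleyPhrases, expand, render) are hypothetical, and it assumes a hole keeps its position in the sequence, as in the "a - c d" examples. Each emitted pattern would still have to be turned into its own ordered query by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: enumerates "phrase with holes" patterns.
final class HoleyPhrases {

  /** Returns every pattern in which at most maxHoles terms are replaced by null (a hole). */
  static List<List<String>> expand(List<String> terms, int maxHoles) {
    List<List<String>> out = new ArrayList<>();
    int n = terms.size();
    // Each bit of the mask marks one position as a hole.
    for (int mask = 0; mask < (1 << n); mask++) {
      if (Integer.bitCount(mask) > maxHoles) {
        continue; // too many holes in this pattern
      }
      List<String> pattern = new ArrayList<>(n);
      for (int i = 0; i < n; i++) {
        pattern.add((mask & (1 << i)) != 0 ? null : terms.get(i));
      }
      out.add(pattern);
    }
    return out;
  }

  /** Renders a pattern the way the message does, with "-" standing for a hole. */
  static String render(List<String> pattern) {
    StringBuilder sb = new StringBuilder();
    for (String t : pattern) {
      if (sb.length() > 0) {
        sb.append(' ');
      }
      sb.append(t == null ? "-" : t);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    for (List<String> p : expand(List.of("a", "b", "c", "d"), 1)) {
      System.out.println(render(p));
    }
  }
}
```

For n terms and up to k holes this yields C(n,0) + C(n,1) + ... + C(n,k) patterns: 5 for four terms with one hole, 11 with two. That growth is why the expansion gets unwieldy for longer phrases.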
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: Fuzzy-phrase query with "holes" using intervals?
Posted by Dawid Weiss <da...@gmail.com>.
Thanks Alan. I don't think my foo is strong enough to dive deep into
implementing intervals... yet. :) I'll try to clean up what's active
on my plate and maybe later I'll return to this.
Dawid
On Thu, Sep 17, 2020 at 3:53 PM Alan Woodward <ro...@gmail.com> wrote:
>
> I think you need a sort of ‘ordered atLeast’ here. Currently atLeast() is a mixture of a disjunction and an unordered interval, it should be possible to add something that adds additional constraints to the sets that it finds. I think you’d need to write some code though, I can’t see a way of doing it with the current group of interval operators.
Re: Fuzzy-phrase query with "holes" using intervals?
Posted by Alan Woodward <ro...@gmail.com>.
I think you need a sort of ‘ordered atLeast’ here. Currently atLeast() is a mixture of a disjunction and an unordered interval; it should be possible to add something that places additional constraints on the sets that it finds. I think you’d need to write some code though; I can’t see a way of doing it with the current group of interval operators.
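
As a rough illustration of what an ‘ordered atLeast’ would have to accept, here is a plain-Java sketch that checks the semantics over an in-memory token array. It is not Lucene code and not an interval operator; the names (OrderedAtLeast, matches, minMatch) are hypothetical, and it assumes each phrase term must sit at its own offset within a phrase-length window, with holes being offsets where the token differs or falls outside the document.

```java
// Hypothetical illustration of "ordered atLeast" semantics; not a Lucene API.
final class OrderedAtLeast {

  /**
   * Returns true if some alignment of the phrase against the document has at least
   * minMatch phrase terms sitting at their expected offsets.
   */
  static boolean matches(String[] doc, String[] phrase, int minMatch) {
    int n = phrase.length;
    // Negative starts let the phrase hang over the document start,
    // so a hole in the first position can match a document beginning with "b c d".
    for (int start = -(n - 1); start < doc.length; start++) {
      int hits = 0;
      for (int i = 0; i < n; i++) {
        int pos = start + i;
        if (pos >= 0 && pos < doc.length && doc[pos].equals(phrase[i])) {
          hits++;
        }
      }
      if (hits >= minMatch) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    String[] phrase = {"a", "b", "c", "d"};
    // One hole in the third slot: "a b - d".
    System.out.println(matches("x a b z d y".split(" "), phrase, 3));
    // All four terms present but reversed: an unordered atLeast would accept this.
    System.out.println(matches("d c b a".split(" "), phrase, 3));
  }
}
```

The reversed document is the case that separates this from an unordered match: every term is present, yet no alignment puts three of them at the right offsets.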