Posted to java-user@lucene.apache.org by Michael Wechner <mi...@wyona.com> on 2022/08/31 15:15:36 UTC

How to filter KnnVectorQuery with multiple terms?

Hi

I am currently filtering a KnnVectorQuery as follows:

Query filter = new TermQuery(new Term(CLASSIFICATION_FIELD, classification));
query = new KnnVectorQuery(VECTOR_FIELD, queryVector, k, filter);

However, it is not clear to me how I can filter on multiple terms.

Should I subclass MultiTermQuery and use it as the filter, just as I use TermQuery as the filter above?

Thanks

Michael

Re: How to filter KnnVectorQuery with multiple terms?

Posted by Michael Wechner <mi...@wyona.com>.
Great, thank you very much for clarifying!

Michael



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: How to filter KnnVectorQuery with multiple terms?

Posted by Uwe Schindler <uw...@thetaphi.de>.
Simply put,

the last parameter of KnnVectorQuery is a Lucene Query, so you can pass 
any query type there. TermInSetQuery is a good choice for an "IN 
multiple terms" query. But you can also pass a BooleanQuery with 
multiple terms, a combination of other queries, a numeric range, or 
a full-text query produced by one of Lucene's query parsers.
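
To illustrate the point that any Query can serve as the filter, here is a minimal sketch that combines a TermInSetQuery with a numeric range inside a BooleanQuery. It assumes Lucene 9.1 to 9.x on the classpath (newer releases rename the class to KnnFloatVectorQuery). The field names, the "year" field and its IntPoint indexing are hypothetical and only serve the example.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.document.IntPoint;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.KnnVectorQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermInSetQuery;
import org.apache.lucene.util.BytesRef;

public class CombinedKnnFilter {
    // Hypothetical field names, chosen for illustration only.
    static final String VECTOR_FIELD = "vector";
    static final String CLASSIFICATION_FIELD = "classification";
    static final String YEAR_FIELD = "year"; // assumed to be indexed as an IntPoint

    /** Matches documents whose classification is in the given list AND whose year is in range. */
    static Query buildFilter(List<String> classifications, int minYear, int maxYear) {
        List<BytesRef> terms = new ArrayList<>();
        for (String classification : classifications) {
            terms.add(new BytesRef(classification));
        }
        return new BooleanQuery.Builder()
            // Occur.FILTER: the clause must match but does not contribute to scoring.
            .add(new TermInSetQuery(CLASSIFICATION_FIELD, terms), Occur.FILTER)
            .add(IntPoint.newRangeQuery(YEAR_FIELD, minYear, maxYear), Occur.FILTER)
            .build();
    }

    /** Wraps the filter in a vector query that returns up to k nearest matching documents. */
    static Query buildKnnQuery(float[] queryVector, int k, Query filter) {
        return new KnnVectorQuery(VECTOR_FIELD, queryVector, k, filter);
    }
}
```

The resulting query can be passed to IndexSearcher.search like any other query. Both clauses use Occur.FILTER because scores from the filter are irrelevant here; only the set of matching documents matters.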

Uwe

-- 
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: uwe@thetaphi.de




Re: How to filter KnnVectorQuery with multiple terms?

Posted by Michael Wechner <mi...@wyona.com>.
Hi Matt

Thanks very much for your feedback!

Based on your links, I will try:

Collection<BytesRef> terms = new ArrayList<>();
terms.add(new BytesRef(classification1));
terms.add(new BytesRef(classification2));
Query filter = new TermInSetQuery(CLASSIFICATION_FIELD, terms);

query = new KnnVectorQuery(VECTOR_FIELD, queryVector, k, filter);
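
As a rough mental model of what such a filtered query returns, the following plain-Java sketch first restricts the candidates to documents whose classification is in the allowed set and then picks the k most similar among them. This is a brute-force illustration only and not how Lucene implements the search (Lucene uses an approximate graph search). The Doc class, the dot-product similarity and all names are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

public class PreFilteredKnnModel {
    /** A toy document: an id, a classification label and a vector. */
    static final class Doc {
        final int id;
        final String classification;
        final float[] vector;

        Doc(int id, String classification, float[] vector) {
            this.id = id;
            this.classification = classification;
            this.vector = vector;
        }
    }

    /** Dot product, used here as the (assumed) similarity function. */
    static float dot(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Returns the ids of the k most similar documents among those passing the filter. */
    static List<Integer> search(List<Doc> docs, float[] query, int k, Set<String> allowed) {
        // Step 1: apply the filter before any similarity ranking.
        List<Doc> candidates = new ArrayList<>();
        for (Doc doc : docs) {
            if (allowed.contains(doc.classification)) {
                candidates.add(doc);
            }
        }
        // Step 2: rank the remaining candidates by descending similarity.
        candidates.sort(Comparator.comparingDouble((Doc d) -> dot(d.vector, query)).reversed());
        // Step 3: keep at most k of them.
        List<Integer> ids = new ArrayList<>();
        for (Doc doc : candidates.subList(0, Math.min(k, candidates.size()))) {
            ids.add(doc.id);
        }
        return ids;
    }
}
```

Because the filter is applied before the ranking, a document with a perfect similarity score is still excluded if its classification is not in the allowed set, and the result can still contain k hits as long as enough documents match the filter.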

All the best

Michael




Re: How to filter KnnVectorQuery with multiple terms?

Posted by Matt Davis <kr...@gmail.com>.
If I understand correctly, I believe you would want to use a TermInSetQuery.
An example usage can be found here:
https://github.com/zuliaio/zuliasearch/blob/main/zulia-server/src/main/java/io/zulia/server/index/ZuliaIndex.java#L398

You can also check out the usage of KnnVectorQuery here:
https://github.com/zuliaio/zuliasearch/blob/main/zulia-server/src/main/java/io/zulia/server/index/ZuliaIndex.java#L419
Note that in this case the getPreFilter method a few lines below uses a
BooleanQuery.Builder.

As noted in the TermInSetQuery source (
https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/TermInSetQuery.java#L62),
multiple terms could also be represented as a boolean query with Occur.SHOULD clauses.
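
The Occur.SHOULD alternative mentioned above could look roughly like the sketch below. It assumes Lucene 9.x on the classpath; the helper name anyOf and the field name used in the usage note are hypothetical.

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;

public class ShouldTermsFilter {
    /**
     * Builds a disjunction over the given values of one field.
     * With only SHOULD clauses, a document matches if at least one clause matches.
     */
    static BooleanQuery anyOf(String field, String... values) {
        BooleanQuery.Builder builder = new BooleanQuery.Builder();
        for (String value : values) {
            builder.add(new TermQuery(new Term(field, value)), Occur.SHOULD);
        }
        return builder.build();
    }
}
```

A call such as anyOf("classification", "public", "internal") yields a query that can be passed as the last argument of KnnVectorQuery. For a handful of terms the two approaches should match the same documents; for long term lists TermInSetQuery is probably the better fit, since a BooleanQuery is subject to the configured maximum clause count.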

~Matt
