Posted to solr-user@lucene.apache.org by Naresh Yadav <ny...@gmail.com> on 2015/05/11 13:14:39 UTC

Solr query which return only those docs whose all tokens are from given list

Hi all,

Also asked this here : http://stackoverflow.com/questions/30166116

For example, I have Solr docs in which the tags field is indexed:

Doc1 -> tags:T1 T2

Doc2 -> tags:T1 T3

Doc3 -> tags:T1 T4

Doc4 -> tags:T1 T2 T3

Query1 : get all docs matching "tags:T1 AND tags:T3". This works and returns
Doc2 and Doc4.

Query2 : get all docs whose tags all come from the list [T1, T2, T3]. Expected
result: Doc1, Doc2, Doc4 (Doc3 is excluded because T4 is not in the list).

How can I model Query2 in Solr? Please help me with this.
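
To make the intended semantics of Query2 concrete, here is a minimal Python sketch that applies the rule to the four example docs outside of Solr. The function name all_tags_allowed is made up for illustration; it only restates the requirement as a subset check and is not a Solr feature.

```python
# Example documents and tags copied from the question above.
docs = {
    "Doc1": {"T1", "T2"},
    "Doc2": {"T1", "T3"},
    "Doc3": {"T1", "T4"},
    "Doc4": {"T1", "T2", "T3"},
}

# The list of authorized tags given in Query2.
allowed = {"T1", "T2", "T3"}


def all_tags_allowed(doc_tags, allowed_tags):
    """Return True when the doc has tags and every one of them is allowed."""
    return bool(doc_tags) and doc_tags <= allowed_tags


# Keep only the docs whose tag set is a subset of the allowed list.
matches = sorted(name for name, tags in docs.items()
                 if all_tags_allowed(tags, allowed))
print(matches)
```

Doc3 drops out because its tag T4 is outside the list, which is the behaviour a plain OR query cannot express.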

Re: Solr query which return only those docs whose all tokens are from given list

Posted by Alessandro Benedetti <be...@gmail.com>.
A simple OR query should be fine :

tags:(T1 T2 T3)

Cheers

2015-05-11 15:39 GMT+01:00 Sujit Pal <su...@comcast.net>:

> Hi Naresh,
>
> Couldn't you just model this as an OR query, since your requirement is
> at least one (but can be more than one), i.e.:
>
> tags:T1 tags:T2 tags:T3
>
> -sujit



-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England

Re: Solr query which return only those docs whose all tokens are from given list

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Use an update processor to add the number of tags per doc, e.g. check
CountFieldValuesUpdateProcessorFactory:

Doc1 -> tags:T1 T2 ; tagNum: 2

Doc2 -> tags:T1 T3 ; tagNum: 2

Doc3 -> tags:T1 T4 ; tagNum: 2

Doc4 -> tags:T1 T2 T3 ; tagNum: 3

Then, when you search for tags, you need to get the number of tags matched per
document. This can be done with the recently implemented constant-score
operator ^=, e.g.

tags:(T1^=1 T2^=1 T3^=1)

Then we need to subtract the number of tags stored per doc from that count:

q=sub(query($tagsAct),tagNum)&tagsAct=tags:(T1^=1 T2^=1 T3^=1)

and then cut off the docs without enough coverage, i.e. keep only those where
the difference is not below 0:

q={!frange l=0}sub(query($tagsAct),tagNum)&tagsAct=tags:(T1^=1 T2^=1 T3^=1)
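
As a rough illustration of the steps above, the following Python sketch builds the two request parameters from a list of authorized tags and then mimics the arithmetic locally on the example docs. The helper names build_params and coverage are hypothetical, the field names tags and tagNum are taken from this mail, and the local simulation assumes that each matching clause contributes exactly 1 to the score, which is what the ^=1 clauses are meant to achieve. Tag values are assumed to need no query escaping.

```python
from urllib.parse import urlencode


def build_params(allowed_tags, tags_field="tags", count_field="tagNum"):
    """Build the q and tagsAct parameters for the frange approach."""
    clauses = " ".join(f"{tag}^=1" for tag in allowed_tags)
    return {
        # Keep docs where (matched tags - stored tag count) >= 0.
        "q": f"{{!frange l=0}}sub(query($tagsAct),{count_field})",
        "tagsAct": f"{tags_field}:({clauses})",
    }


params = build_params(["T1", "T2", "T3"])
print(params["q"])
print(params["tagsAct"])
print(urlencode(params))  # URL-encoded form for an HTTP request

# Local simulation of the same arithmetic on the example docs.
docs = {
    "Doc1": {"T1", "T2"},
    "Doc2": {"T1", "T3"},
    "Doc3": {"T1", "T4"},
    "Doc4": {"T1", "T2", "T3"},
}
allowed = {"T1", "T2", "T3"}


def coverage(doc_tags):
    """Matched tag count minus the doc's total tag count (tagNum)."""
    return len(doc_tags & allowed) - len(doc_tags)


kept = sorted(name for name, tags in docs.items() if coverage(tags) >= 0)
print(kept)
```

Doc3 gets a coverage of -1 (one matched tag, two stored tags), so the lower bound of 0 filters it out.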


On Wed, May 20, 2015 at 10:10 AM, Naresh Yadav <ny...@gmail.com> wrote:

> I am asking the Solr experts again to suggest some solutions to my problem
> above, as I have not been able to solve it.



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mk...@griddynamics.com>

Re: Solr query which return only those docs whose all tokens are from given list

Posted by Naresh Yadav <ny...@gmail.com>.
I am asking the Solr experts again to suggest some solutions to my problem
above, as I have not been able to solve it.

On Tue, May 12, 2015 at 11:04 AM, Naresh Yadav <ny...@gmail.com> wrote:

> Thanks Andrew. You understood my problem precisely, but the solutions you
> suggested may not work for me.
>
> In my API I get only the list of authorized tags, i.e. [T1, T2, T3], and I
> need to construct my Solr query based on that alone.
> So the first solution with NOT (T4 OR T5) will not work.
>
> In the real case the tag ids T1, T2 are UUIDs, so the range query will not
> work either, as I have no control over the order of these ids.
>
> Looking for more suggestions.
>
> Thanks
> Naresh

Re: Solr query which return only those docs whose all tokens are from given list

Posted by Naresh Yadav <ny...@gmail.com>.
Thanks Andrew. You understood my problem precisely, but the solutions you
suggested may not work for me.

In my API I get only the list of authorized tags, i.e. [T1, T2, T3], and I
need to construct my Solr query based on that alone.
So the first solution with NOT (T4 OR T5) will not work.

In the real case the tag ids T1, T2 are UUIDs, so the range query will not
work either, as I have no control over the order of these ids.

Looking for more suggestions.

Thanks
Naresh

On Mon, May 11, 2015 at 10:05 PM, Andrew Chillrud <ac...@opentext.com>
wrote:

> Based on his example, it sounds like Naresh not only wants the tags field
> to contain at least one of the values [T1, T2, T3] but also wants to
> exclude documents that contain a tag other than T1, T2, or T3 (Doc3 should
> not be retrieved).
>
> If the set of possible values in the tags field is limited and known, you
> could use a NOT (or '-') clause to accomplish this. If there were 5
> possible tag values:
>
> tags:(( T1 OR T2 OR T3) NOT (T4 OR T5))
>
> However, this doesn't seem practical if the number of possible values is
> large or unlimited. Perhaps something could be done with range queries:
>
> tags:(( T1 OR T2 OR T3) NOT ([* TO T1} OR {T1 TO T2} OR {T2 TO T3} OR {T3 TO *]))
>
> However, this would require whatever is constructing the query to be aware
> of the lexical ordering of the terms in the index. Maybe there are more
> elegant solutions, but I am not aware of them.
>
> - Andy -

RE: Solr query which return only those docs whose all tokens are from given list

Posted by Andrew Chillrud <ac...@opentext.com>.
Based on his example, it sounds like Naresh not only wants the tags field to contain at least one of the values [T1, T2, T3] but also wants to exclude documents that contain a tag other than T1, T2, or T3 (Doc3 should not be retrieved).

If the set of possible values in the tags field is limited and known, you could use a NOT (or '-') clause to accomplish this. If there were 5 possible tag values:

tags:(( T1 OR T2 OR T3) NOT (T4 OR T5))

However, this doesn't seem practical if the number of possible values is large or unlimited. Perhaps something could be done with range queries:

tags:(( T1 OR T2 OR T3) NOT ([* TO T1} OR {T1 TO T2} OR {T2 TO T3} OR {T3 TO *]))

However, this would require whatever is constructing the query to be aware of the lexical ordering of the terms in the index. Maybe there are more elegant solutions, but I am not aware of them.
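
The two ideas above can be sketched as small query builders in Python. Both function names are made up for illustration. The first needs the complete set of possible tag values, and the second assumes that Python's string ordering matches the ordering of terms in the index, so that the exclusive ranges cover every gap between adjacent allowed values. Neither escapes special characters in tag values.

```python
def build_not_query(allowed, universe, field="tags"):
    """Exclude every known tag that is not in the allowed list."""
    disallowed = sorted(set(universe) - set(allowed))
    positive = " OR ".join(sorted(allowed))
    if not disallowed:
        return f"{field}:({positive})"
    negative = " OR ".join(disallowed)
    return f"{field}:(({positive}) NOT ({negative}))"


def build_range_query(allowed, field="tags"):
    """Exclude everything that sorts before, between, or after allowed tags."""
    ordered = sorted(allowed)
    if not ordered:
        raise ValueError("at least one allowed tag is required")
    gaps = [f"[* TO {ordered[0]}}}"]                             # before the first tag
    gaps += [f"{{{a} TO {b}}}" for a, b in zip(ordered, ordered[1:])]  # between neighbours
    gaps.append(f"{{{ordered[-1]} TO *]")                        # after the last tag
    positive = " OR ".join(ordered)
    negative = " OR ".join(gaps)
    return f"{field}:(({positive}) NOT ({negative}))"


not_query = build_not_query(["T1", "T2", "T3"], ["T1", "T2", "T3", "T4", "T5"])
range_query = build_range_query(["T1", "T2", "T3"])
print(not_query)
print(range_query)
```

The first builder grows with the number of disallowed values, while the second grows only with the allowed list but depends on the ordering assumption.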

- Andy -

-----Original Message-----
From: sujitatgtalk@gmail.com [mailto:sujitatgtalk@gmail.com] On Behalf Of Sujit Pal
Sent: Monday, May 11, 2015 10:40 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr query which return only those docs whose all tokens are from given list

Hi Naresh,

Couldn't you just model this as an OR query, since your requirement is at least one (but can be more than one), i.e.:

tags:T1 tags:T2 tags:T3

-sujit



Re: Solr query which return only those docs whose all tokens are from given list

Posted by Sujit Pal <su...@comcast.net>.
Hi Naresh,

Couldn't you just model this as an OR query, since your requirement is
at least one (but can be more than one), i.e.:

tags:T1 tags:T2 tags:T3

-sujit

