Posted to solr-user@lucene.apache.org by Arun Rangarajan <ar...@gmail.com> on 2017/12/18 18:59:10 UTC

Filtering Solr pivot facet values

Solr version: 6.6.0

There are two multi-valued string fields in my schema:
* interests
* hierarchy.

The goal is to run a pivot facet query on both of these fields, but only for
specific values of the `interests` field. This query:

```
/select
?wt=json
&rows=0
&q=interests:(hockey OR soccer)
&facet=true
&facet.pivot=interests,hierarchy
```

selects the correct documents, but since `interests` is a multi-valued
field, it returns not only the required counts for the values of interest
(hockey, soccer) but also counts for the other values of `interests` in the
matching documents.

How can I filter the pivot facet counts to only the values of the `interests`
field specified in the 'q' param, i.e. hockey and soccer in the example?
Essentially, is there an equivalent of
https://lucene.apache.org/solr/guide/6_6/faceting.html#Faceting-Limitingfacetwithcertainterms
for pivot facet query? Or are there alternate formats like JSON faceting
that may help here?
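
For reference, the field-facet feature linked above is the `terms` local parameter on `facet.field`. A minimal Python sketch of how such a request could be assembled follows. It only builds the query string and sends nothing to Solr, and it covers plain field faceting, not `facet.pivot`, which is the gap this question is about. The helper name `terms_facet` is made up for illustration.

```python
from urllib.parse import urlencode


def terms_facet(field, terms):
    """Build a facet.field value limited to the given terms.

    Assumes the terms are plain words; commas or quotes inside a term
    would need backslash escaping, which this sketch does not do.
    """
    return "{!terms='" + ",".join(terms) + "'}" + field


params = [
    ("wt", "json"),
    ("rows", "0"),
    ("q", "interests:(hockey OR soccer)"),
    ("facet", "true"),
    # Field facet on `interests`, restricted to the two terms of interest.
    ("facet.field", terms_facet("interests", ["hockey", "soccer"])),
]

print("/select?" + urlencode(params))
```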

(Full disclosure: I asked the question on StackOverflow and got no response
so far:
https://stackoverflow.com/questions/47838619/filtering-solr-pivot-facet-values
)

Thanks.

Re: Filtering Solr pivot facet values

Posted by emerikusz <pe...@yahoo.com.INVALID>.
This works: &f.interests.facet.matches=hockey|soccer 
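
Putting that parameter together with the original query might look like the sketch below, written in Python only to assemble the URL. The host, port, and core name are assumptions, the helper `build_params` is hypothetical, and the parameter is only available on Solr versions that ship `facet.matches`, so treat this as untested against any particular installation.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port, and core name are assumptions.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"


def build_params(values, field="interests", pivot="hierarchy,interests"):
    """Return request parameters for a pivot facet limited to `values`.

    Assumes the values are plain words with no regex metacharacters.
    """
    return [
        ("wt", "json"),
        ("rows", "0"),
        ("q", f"{field}:({' OR '.join(values)})"),
        ("facet", "true"),
        ("facet.pivot", pivot),
        # Per-field override from the reply above: keep only buckets of
        # `field` whose term matches this regular expression.
        (f"f.{field}.facet.matches", "|".join(values)),
    ]


params = build_params(["hockey", "soccer"])
url = f"{SOLR_SELECT}?{urlencode(params)}"
print(url)
```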




Re: Filtering Solr pivot facet values

Posted by Emir Arnautović <em...@sematext.com>.
It seems there is something in the latest Solr version that you might be able to use. From the release notes:

“The new facet.matches parameter returns facet buckets only for terms
that match a regular expression.”

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/
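
To illustrate what "only for terms that match a regular expression" means for the buckets in this thread, here is a small client-side sketch in Python. It filters the inner pivot buckets that Arun posted for hierarchy "1". Whether Solr matches the whole term or a substring is not stated in the thread; full-term matching is an assumption here. The same post-filtering could also serve as a workaround on a version without the parameter.

```python
import re

# Inner pivot buckets for hierarchy "1", copied from the response
# posted in this thread.
buckets = [
    {"field": "interests", "value": "soccer", "count": 5},
    {"field": "interests", "value": "hockey", "count": 3},
    {"field": "interests", "value": "tennis", "count": 2},
    {"field": "interests", "value": "futbol", "count": 1},
]


def matching_buckets(buckets, pattern):
    """Keep buckets whose value matches `pattern` in full (assumed semantics)."""
    regex = re.compile(pattern)
    return [b for b in buckets if regex.fullmatch(b["value"])]


kept = matching_buckets(buckets, "hockey|soccer")
print([(b["value"], b["count"]) for b in kept])
```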



> On 21 Dec 2017, at 18:45, Erick Erickson <er...@gmail.com> wrote:
> 
> You might be able to do something interesting with the JSON faceting
> approach, but I confess I don't know for sure.
> 
> Best,
> Erick
> 
> On Thu, Dec 21, 2017 at 8:17 AM, Shawn Heisey <ap...@elyograg.org> wrote:
>> On 12/20/2017 2:40 PM, Arun Rangarajan wrote:
>>> 
>>> I think multi-select faceting does the opposite of what I want. I want the
>>> facet to include the filters.
>> 
>> 
>> You don't have any filters to include or exclude.  You would need fq
>> parameters to use multi-select faceting.  But as you say, it doesn't do what
>> you want anyway.
>> 
>> <snip>
>> 
>>> As you can see, hierarchy and interests are both multi-valued string
>>> fields.
>>> 
>>> I want pivot facet counts for the two fields: hierarchy and interests, but
>>> filtered for only two values of interests field: hockey, soccer.
>> 
>> 
>> <snip>
>> 
>>> The counts for hockey and soccer are correct. But I am also getting the
>>> facet counts for other values of interests (like tennis, futbol, etc.,)
>>> since these values match the query. I understand why this is happening.
>>> This is why I said I want to do something like
>>> 
>>> https://lucene.apache.org/solr/guide/6_6/faceting.html#Faceting-Limitingfacetwithcertainterms
>>> for facet pivots. Is there a way to do that?
>> 
>> 
>> I see now.  It's showing the other values because the fields are multivalued
>> and the matching documents actually do contain those values, so Solr is
>> working the way I expected it to, but your data is different than I was
>> thinking.  It's the multivalued aspect that makes this problematic.
>> 
>> I was not aware that you could limit the terms with field faceting. Either
>> the syntax to achieve what you want is different than what you are using, or
>> it just can't be done with pivot faceting at the moment because there are no
>> options to do it.  I'm guessing the latter, but since I am not familiar with
>> the code, I cannot say for sure.  Hopefully somebody else can speak up with
>> an option, but I'm not expecting that to happen.
>> 
>> Thanks,
>> Shawn


Re: Filtering Solr pivot facet values

Posted by Erick Erickson <er...@gmail.com>.
You might be able to do something interesting with the JSON faceting
approach, but I confess I don't know for sure.

Best,
Erick
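
One way the JSON faceting idea could be explored is with one query facet per interest value, each holding a nested terms facet on `hierarchy`. The Python sketch below only builds such a request body. It was not run against the poster's index, the helper name `interest_facets` is invented for illustration, and nobody in the thread has confirmed that this gives the desired counts.

```python
import json


def interest_facets(values, field="interests", sub_field="hierarchy"):
    """One query facet per value, each with a nested terms facet."""
    return {
        value: {
            "type": "query",
            "q": f"{field}:{value}",
            # Break each interest bucket down by the second field.
            "facet": {sub_field: {"type": "terms", "field": sub_field}},
        }
        for value in values
    }


body = {
    "query": "interests:(hockey OR soccer)",
    "limit": 0,  # no documents needed, only facet counts
    "facet": interest_facets(["hockey", "soccer"]),
}

print(json.dumps(body, indent=2))
```

Because only the listed values get a bucket, other values such as tennis or futbol would never appear at the top level.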

On Thu, Dec 21, 2017 at 8:17 AM, Shawn Heisey <ap...@elyograg.org> wrote:
> On 12/20/2017 2:40 PM, Arun Rangarajan wrote:
>>
>> I think multi-select faceting does the opposite of what I want. I want the
>> facet to include the filters.
>
>
> You don't have any filters to include or exclude.  You would need fq
> parameters to use multi-select faceting.  But as you say, it doesn't do what
> you want anyway.
>
> <snip>
>
>> As you can see, hierarchy and interests are both multi-valued string
>> fields.
>>
>> I want pivot facet counts for the two fields: hierarchy and interests, but
>> filtered for only two values of interests field: hockey, soccer.
>
>
> <snip>
>
>> The counts for hockey and soccer are correct. But I am also getting the
>> facet counts for other values of interests (like tennis, futbol, etc.,)
>> since these values match the query. I understand why this is happening.
>> This is why I said I want to do something like
>>
>> https://lucene.apache.org/solr/guide/6_6/faceting.html#Faceting-Limitingfacetwithcertainterms
>> for facet pivots. Is there a way to do that?
>
>
> I see now.  It's showing the other values because the fields are multivalued
> and the matching documents actually do contain those values, so Solr is
> working the way I expected it to, but your data is different than I was
> thinking.  It's the multivalued aspect that makes this problematic.
>
> I was not aware that you could limit the terms with field faceting. Either
> the syntax to achieve what you want is different than what you are using, or
> it just can't be done with pivot faceting at the moment because there are no
> options to do it.  I'm guessing the latter, but since I am not familiar with
> the code, I cannot say for sure.  Hopefully somebody else can speak up with
> an option, but I'm not expecting that to happen.
>
> Thanks,
> Shawn

Re: Filtering Solr pivot facet values

Posted by Shawn Heisey <ap...@elyograg.org>.
On 12/20/2017 2:40 PM, Arun Rangarajan wrote:
> I think multi-select faceting does the opposite of what I want. I want the
> facet to include the filters.

You don't have any filters to include or exclude.  You would need fq 
parameters to use multi-select faceting.  But as you say, it doesn't do 
what you want anyway.

<snip>

> As you can see, hierarchy and interests are both multi-valued string fields.
> 
> I want pivot facet counts for the two fields: hierarchy and interests, but
> filtered for only two values of interests field: hockey, soccer.

<snip>

> The counts for hockey and soccer are correct. But I am also getting the
> facet counts for other values of interests (like tennis, futbol, etc.,)
> since these values match the query. I understand why this is happening.
> This is why I said I want to do something like
> https://lucene.apache.org/solr/guide/6_6/faceting.html#Faceting-Limitingfacetwithcertainterms
> for facet pivots. Is there a way to do that?

I see now.  It's showing the other values because the fields are 
multivalued and the matching documents actually do contain those values, 
so Solr is working the way I expected it to, but your data is different 
than I was thinking.  It's the multivalued aspect that makes this 
problematic.

I was not aware that you could limit the terms with field faceting. 
Either the syntax to achieve what you want is different than what you 
are using, or it just can't be done with pivot faceting at the moment 
because there are no options to do it.  I'm guessing the latter, but 
since I am not familiar with the code, I cannot say for sure.  Hopefully 
somebody else can speak up with an option, but I'm not expecting that to 
happen.

Thanks,
Shawn

Re: Filtering Solr pivot facet values

Posted by Arun Rangarajan <ar...@gmail.com>.
Thanks for your reply, Shawn.

I think multi-select faceting does the opposite of what I want. I want the
facet to include the filters.

Example:

The following 8 documents are the only ones in my Solr core:

[
  {"id": "1", "hierarchy": ["1", "16", "169"], "interests": ["soccer", "futbol"]},
  {"id": "2", "hierarchy": ["1", "16", "162"], "interests": ["cricket", "futbol"]},
  {"id": "3", "hierarchy": ["1", "14", "141"], "interests": ["hockey", "soccer"]},
  {"id": "4", "hierarchy": ["1", "16", "162"], "interests": ["hockey", "soccer", "tennis"]},
  {"id": "5", "hierarchy": ["1", "14", "142"], "interests": ["badminton"]},
  {"id": "6", "hierarchy": ["1", "14", "147"], "interests": ["soccer"]},
  {"id": "7", "hierarchy": ["1", "16", "168"], "interests": ["hockey", "soccer", "tennis"]},
  {"id": "8", "hierarchy": ["1", "14", "140"], "interests": ["badminton"]}
]

As you can see, hierarchy and interests are both multi-valued string fields.

I want pivot facet counts for the two fields: hierarchy and interests, but
filtered for only two values of interests field: hockey, soccer.

The query I am running is:

/select
?wt=json
&rows=0
&q=interests:(hockey soccer)
&facet=true
&facet.pivot=hierarchy,interests

This gives the following result for the pivot facets:

"facet_pivot": {
    "hierarchy,interests": [
    {
      "field": "hierarchy",
      "value": "1",
      "count": 5,
      "pivot": [
        {"field": "interests", "value": "soccer", "count": 5},
        {"field": "interests", "value": "hockey", "count": 3},
        {"field": "interests", "value": "tennis", "count": 2},
        {"field": "interests", "value": "futbol", "count": 1}
      ]
    },
    {
      "field": "hierarchy",
      "value": "16",
      "count": 3,
      "pivot": [
        {"field": "interests", "value": "soccer", "count": 3},
        {"field": "interests", "value": "hockey", "count": 2},
        {"field": "interests", "value": "tennis", "count": 2},
        {"field": "interests", "value": "futbol", "count": 1}
      ]
    },
    ...
    ]
}

The counts for hockey and soccer are correct. But I am also getting the
facet counts for other values of interests (like tennis, futbol, etc.),
since the documents containing these values match the query. I understand why this is happening.
This is why I said I want to do something like
https://lucene.apache.org/solr/guide/6_6/faceting.html#Faceting-Limitingfacetwithcertainterms
for facet pivots. Is there a way to do that?

Thanks.
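
The behaviour described in this message can be reproduced outside Solr. The Python sketch below recounts the pivot by hand over the eight documents above: every value on a matching document is counted, which is where tennis and futbol come from. The `keep_only_wanted` flag is a hypothetical switch that shows the output being asked for; it is not a Solr option.

```python
from collections import Counter, defaultdict

DOCS = [
    {"id": "1", "hierarchy": ["1", "16", "169"], "interests": ["soccer", "futbol"]},
    {"id": "2", "hierarchy": ["1", "16", "162"], "interests": ["cricket", "futbol"]},
    {"id": "3", "hierarchy": ["1", "14", "141"], "interests": ["hockey", "soccer"]},
    {"id": "4", "hierarchy": ["1", "16", "162"], "interests": ["hockey", "soccer", "tennis"]},
    {"id": "5", "hierarchy": ["1", "14", "142"], "interests": ["badminton"]},
    {"id": "6", "hierarchy": ["1", "14", "147"], "interests": ["soccer"]},
    {"id": "7", "hierarchy": ["1", "16", "168"], "interests": ["hockey", "soccer", "tennis"]},
    {"id": "8", "hierarchy": ["1", "14", "140"], "interests": ["badminton"]},
]


def pivot_counts(docs, wanted, keep_only_wanted=False):
    """Count interests per hierarchy value over docs matching any wanted interest."""
    pivot = defaultdict(Counter)
    for doc in docs:
        if not wanted & set(doc["interests"]):
            continue  # document is not in the result set
        for h in doc["hierarchy"]:
            for interest in doc["interests"]:
                # Without the flag, every value on a matching document counts.
                if keep_only_wanted and interest not in wanted:
                    continue
                pivot[h][interest] += 1
    return pivot


wanted = {"hockey", "soccer"}
unfiltered = pivot_counts(DOCS, wanted)
filtered = pivot_counts(DOCS, wanted, keep_only_wanted=True)

print(dict(unfiltered["16"]))  # includes tennis and futbol
print(dict(filtered["16"]))    # only hockey and soccer
```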



On Wed, Dec 20, 2017 at 1:07 PM, Shawn Heisey <ap...@elyograg.org> wrote:

> On 12/20/2017 1:31 PM, Arun Rangarajan wrote:
> > Sorry to bother you again on this. Is there no way in Solr to filter pivot
> > facets?
> > [Or did I attract the wrath of the group by posting the question first on
> > StackOverflow? :-)]
>
> StackOverflow and this list are pretty much unaware of each other unless
> specific mention is made.  I don't care whether you ask on SO or not, or
> which one you ask first.
>
> You haven't provided actual output that you're seeing.  Can you provide
> actual response output from your queries and describe what you'd rather
> see instead?  With that information, we might be able to offer some ideas.
>
> In general, facets should never count documents that are not in the
> search results.
>
> Multi-select faceting offers a way to change that general behavior,
> though -- tagging specific fq parameters and asking the facet to exclude
> those filters.
>
> https://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters
>
> Thanks,
> Shawn
>
>

Re: Filtering Solr pivot facet values

Posted by Shawn Heisey <ap...@elyograg.org>.
On 12/20/2017 1:31 PM, Arun Rangarajan wrote:
> Sorry to bother you again on this. Is there no way in Solr to filter pivot
> facets?
> [Or did I attract the wrath of the group by posting the question first on
> StackOverflow? :-)]

StackOverflow and this list are pretty much unaware of each other unless
specific mention is made.  I don't care whether you ask on SO or not, or
which one you ask first.

You haven't provided actual output that you're seeing.  Can you provide
actual response output from your queries and describe what you'd rather
see instead?  With that information, we might be able to offer some ideas.

In general, facets should never count documents that are not in the
search results.

Multi-select faceting offers a way to change that general behavior,
though -- tagging specific fq parameters and asking the facet to exclude
those filters.

https://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters

Thanks,
Shawn
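
For readers unfamiliar with the tagging and excluding mentioned above, a request of that kind could be assembled as in the Python sketch below. The tag name `int` is arbitrary and the query is illustrative only. As the thread goes on to note, this widens the facet counts rather than narrowing them, so it does not solve the original problem.

```python
from urllib.parse import urlencode

params = [
    ("wt", "json"),
    ("rows", "0"),
    ("q", "*:*"),
    # The filter is given a tag so that a facet can refer to it.
    ("fq", "{!tag=int}interests:(hockey OR soccer)"),
    ("facet", "true"),
    # This facet is computed as if the tagged filter were not applied.
    ("facet.field", "{!ex=int}interests"),
]

print("/select?" + urlencode(params))
```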


Re: Filtering Solr pivot facet values

Posted by Arun Rangarajan <ar...@gmail.com>.
Hello Solr Gurus,

Sorry to bother you again on this. Is there no way in Solr to filter pivot
facets?
[Or did I attract the wrath of the group by posting the question first on
StackOverflow? :-)]

Thanks once again.

On Mon, Dec 18, 2017 at 10:59 AM, Arun Rangarajan <ar...@gmail.com>
wrote:

> Solr version: 6.6.0
>
> There are two multi-valued string fields in my schema:
> * interests
> * hierarchy.
>
> Goal is to run a pivot facet query on both these fields, but only for
> specific values of `interests` field. This query:
>
> ```
> /select
> ?wt=json
> &rows=0
> &q=interests:(hockey OR soccer)
> &facet=true
> &facet.pivot=interests,hierarchy
> ```
>
> selects the correct documents, but since `interests` is a multi-valued
> field, it gives the required counts for the interested values (hockey,
> soccer), but also gives the counts for other values of `interests` in the
> matching documents.
>
> How to filter the pivot facet counts only for the values of `interests`
> field specified in the 'q' param i.e. hockey and soccer in the example.
> Essentially, is there an equivalent of
> https://lucene.apache.org/solr/guide/6_6/faceting.html#Faceting-Limitingfacetwithcertainterms
> for pivot facet query? Or are there alternate formats like JSON faceting
> that may help here?
>
> (Full disclosure: I asked the question on StackOverflow and got no response
> so far:
> https://stackoverflow.com/questions/47838619/filtering-solr-pivot-facet-values
> )
>
> Thanks.
>