Posted to solr-user@lucene.apache.org by VJ <ja...@gmail.com> on 2017/04/05 11:08:30 UTC

distinct records based on a field

Hi,


Is there any way to return only distinct records (based on a field) from a
Solr query?
I want to facet the records on a field, but I want to restrict the
results to distinct records before the facet is applied.



Thanks,
VJ

Re: distinct records based on a field

Posted by Joel Bernstein <jo...@gmail.com>.
In Solr 6 you can also run a SQL SELECT DISTINCT ... query.

Joel Bernstein
http://joelsolr.blogspot.com/
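
A minimal sketch of the SQL route, assuming the Solr 6 parallel SQL handler is reachable at /sql on a SolrCloud collection. The collection name "mycollection", the base URL, and the helper names are hypothetical and do not come from this thread. SELECT DISTINCT returns distinct (B, C) tuples rather than facet counts, so the per-C count is done on the client here, using rows derived from the five example records in the question.

```python
from collections import Counter
from urllib.parse import urlencode


def distinct_sql_request(collection, a_value, base_url="http://localhost:8983/solr"):
    """Build the URL and form body for a SELECT DISTINCT statement.

    The collection name and base URL are placeholders (assumptions).
    """
    stmt = f"SELECT DISTINCT B, C FROM {collection} WHERE A = '{a_value}'"
    return f"{base_url}/{collection}/sql", urlencode({"stmt": stmt})


def count_distinct_b_per_c(rows):
    """Count how many distinct B values were returned for each C value."""
    return dict(Counter(row["C"] for row in rows))


url, body = distinct_sql_request("mycollection", "XYZ")

# Distinct (B, C) tuples derived by hand from the example records;
# a real response would come from posting `body` to `url`.
rows = [
    {"B": "Foo", "C": "cat1"},
    {"B": "Foo", "C": "cat2"},
    {"B": "Bar", "C": "cat1"},
    {"B": "Bar", "C": "cat2"},
]
per_category = count_distinct_b_per_c(rows)
print(url)
print(per_category)
```

Nothing is sent over the network in this sketch; it only shows the shape of the request and the client-side counting step.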


Re: distinct records based on a field

Posted by Emir Arnautovic <em...@sematext.com>.
You cannot use field collapsing on these fields and get the correct result.
You need to collapse on the BC pair. If you introduce a field D that combines
B and C, you can use something like:
   q=A:"XYZ"&fq={!collapse field=D}&facet=true&facet.field=C

The collapse query parser ensures that only one document per BC pair is
returned, so faceting returns the expected values.
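
As a rough illustration of the collapse approach: the sketch below builds the combined field D before indexing and assembles the request parameters shown above. The "|" separator and the helper names are assumptions, not something from this thread, and D is assumed to be a single-valued string field. The collapse step is only simulated locally on the five example records; no Solr server is contacted.

```python
from collections import Counter
from urllib.parse import urlencode

SEPARATOR = "|"  # assumption: a character that never occurs in B or C values


def with_pair_field(doc):
    """Return a copy of doc with a single-valued field D holding the B/C pair."""
    enriched = dict(doc)
    enriched["D"] = f"{doc['B']}{SEPARATOR}{doc['C']}"
    return enriched


def collapse_params(a_value):
    """Request parameters for the collapse-then-facet query."""
    return {
        "q": f'A:"{a_value}"',
        "fq": "{!collapse field=D}",
        "facet": "true",
        "facet.field": "C",
    }


# The five example records from the question.
docs = [
    {"A": "XYZ", "B": "Foo", "C": "cat1"},
    {"A": "XYZ", "B": "Foo", "C": "cat2"},
    {"A": "XYZ", "B": "Bar", "C": "cat1"},
    {"A": "XYZ", "B": "Bar", "C": "cat1"},
    {"A": "XYZ", "B": "Bar", "C": "cat2"},
]

indexed = [with_pair_field(d) for d in docs]

# Local simulation of what collapsing on D keeps: one document per B/C pair.
one_per_pair = {d["D"]: d for d in indexed}.values()
facet_on_c = Counter(d["C"] for d in one_per_pair)

query_string = urlencode(collapse_params("XYZ"))
print(dict(facet_on_c))
print(query_string)
```

The simulated facet gives 2 for cat1 and 2 for cat2, which matches the output asked for in the question.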

Alternatively, you can use a facet pivot on the existing structure to get
the expected number indirectly:

   q=A:"XYZ"&facet=true&facet.pivot=C,B

This returns a facet_pivot section in the response that contains a C,B entry:

"C,B": [{ "field": "C", "value": "cat1", "count": 3, "pivot": [
   {"field": "B", "value": "Bar", "count": 2},
   {"field": "B", "value": "Foo", "count": 1}
]}...]

Note that it still returns 3 as the count, but what you are looking for is
the length of the "pivot" array.
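
A small sketch of that post-processing step, assuming the facet_pivot section has already been parsed from the JSON response. The function name is hypothetical. The cat1 entry is the one shown above; the cat2 entry is not shown in the message and is derived here from the five example records. If a category can have many B values, the pivot arrays may be cut off by facet.limit, so that parameter would probably need to be raised.

```python
def distinct_counts(facet_pivot, key="C,B"):
    """Map each C value to the number of distinct B values under it.

    The number of distinct B values is the length of the nested "pivot"
    array, not the "count" of the C entry itself.
    """
    return {
        entry["value"]: len(entry.get("pivot", []))
        for entry in facet_pivot[key]
    }


sample_facet_pivot = {
    "C,B": [
        {
            "field": "C", "value": "cat1", "count": 3,
            "pivot": [
                {"field": "B", "value": "Bar", "count": 2},
                {"field": "B", "value": "Foo", "count": 1},
            ],
        },
        {
            # Derived from the example records, not quoted from the thread.
            "field": "C", "value": "cat2", "count": 2,
            "pivot": [
                {"field": "B", "value": "Bar", "count": 1},
                {"field": "B", "value": "Foo", "count": 1},
            ],
        },
    ]
}

result = distinct_counts(sample_facet_pivot)
print(result)
```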

HTH,
Emir


-- 
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/


Re: distinct records based on a field

Posted by VJ <ja...@gmail.com>.
My document schema has fields like:
A, B, C
I am querying for documents with A="XYZ". Suppose it returns 5 records:
A          B          C
XYZ        Foo        cat1
XYZ        Foo        cat2
XYZ        Bar        cat1
XYZ        Bar        cat1
XYZ        Bar        cat2

Among those 5 records there may be duplicate values for B, and I am then
faceting on C, so I get something like:
cat1: 3 (Foo, Bar, Bar)
cat2: 2 (Foo, Bar)

but I want the output to be:
cat1: 2 (Foo, Bar)
cat2: 2 (Foo, Bar)

Is it possible to achieve the desired output with a Solr query?


Thanks,
VJ


Re: distinct records based on a field

Posted by Emir Arnautovic <em...@sematext.com>.
Hi VJ,

You can use the field collapsing feature to get distinct results
(https://cwiki.apache.org/confluence/display/solr/Result+Grouping), or
you can use facet pivoting and pivot on the distinct field to get the
number of docs in each, if needed
(https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-Pivot(DecisionTree)Faceting).

You might also want to explore the JSON Facet API.

HTH,
Emir
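
One way the JSON Facet API suggestion could look for this case, as a hedged sketch: a terms facet on C with a nested unique(B) aggregation, so that each C bucket reports how many distinct B values it contains. The facet names "categories" and "distinct_b" and the helper name are made up for illustration; the field names A, B and C come from the question. The sketch only builds the request body and does not contact a server.

```python
import json


def unique_b_per_c_request(a_value):
    """Build a JSON request body: terms facet on C, unique(B) per bucket."""
    return {
        "query": f'A:"{a_value}"',
        "limit": 0,  # only the facet is of interest here, not the documents
        "facet": {
            "categories": {          # hypothetical facet name
                "type": "terms",
                "field": "C",
                "facet": {
                    "distinct_b": "unique(B)",  # hypothetical stat name
                },
            }
        },
    }


request_body = json.dumps(unique_b_per_c_request("XYZ"), indent=2)
print(request_body)
```

Each bucket in the response would then be expected to carry the usual document count alongside the distinct_b value, which is the number the question is after.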





Re: distinct records based on a field

Posted by Binoy Dalal <bi...@gmail.com>.
By default, faceting returns only the distinct values of the facet field,
each with its document count.

-- 
Regards,
Binoy Dalal