Posted to solr-user@lucene.apache.org by Way Cool <wa...@gmail.com> on 2011/06/14 20:31:25 UTC

How to avoid double counting for facet query

Hi, guys,

I fixed the Solr search UI (solr/browse) to display the price range facet values
via
http://thetechietutorials.blogspot.com/2011/06/fix-price-facet-display-in-solr-search.html:

   - Under 50<http://localhost:9090/solr/browse?&q=Shakespeare&fq=price:%5B0.0+TO+50%5D>
   (1331)
   - [50.0 TO 100]<http://localhost:9090/solr/browse?&q=Shakespeare&fq=price:%5B50.0+TO+100%5D>
   (133)
   - [100.0 TO 150]<http://localhost:9090/solr/browse?&q=Shakespeare&fq=price:%5B100.0+TO+150%5D>
   (31)
   - [150.0 TO 200]<http://localhost:9090/solr/browse?&q=Shakespeare&fq=price:%5B150.0+TO+200%5D>
   (7)
   - [200.0 TO 250]<http://localhost:9090/solr/browse?&q=Shakespeare&fq=price:%5B200.0+TO+250%5D>
   (2)
   - [250.0 TO 300]<http://localhost:9090/solr/browse?&q=Shakespeare&fq=price:%5B250.0+TO+300%5D>
   (5)
   - [300.0 TO 350]<http://localhost:9090/solr/browse?&q=Shakespeare&fq=price:%5B300.0+TO+350%5D>
   (3)
   - [350.0 TO 400]<http://localhost:9090/solr/browse?&q=Shakespeare&fq=price:%5B350.0+TO+400%5D>
   (6)
   - [400.0 TO 450]<http://localhost:9090/solr/browse?&q=Shakespeare&fq=price:%5B400.0+TO+450%5D>
   (1)
   - 600.0+<http://localhost:9090/solr/browse?&q=Shakespeare&fq=price:%5B600.0+TO+*%5D>(1)

However, I am having a double-counting issue.

Here is the URL that returns only docs whose prices are between 110.0 and
160.0, along with the price facets:
http://localhost:8983/solr/select/?q=Shakespeare&version=2.2&rows=0&fq=price:[110.0+TO+160]&facet.query=price:[110%20TO%20160]&facet.query=price:[160%20TO%20200]&facet.field=price

The response is as below:
<result name="response" numFound="23" start="0" maxScore="0.37042576"/>
<lst name="facet_counts">
<lst name="facet_queries">
<int name="price:[110 TO 160]">23</int>
<int name="price:[160 TO 200]">1</int>
</lst>
...
</result>

As you can see, the number of results is 23; however, an extra doc was
found in the 160-200 range, presumably because a doc priced at exactly 160
matches both inclusive ranges.

Is there any way to avoid this double counting? Has anyone else run into
similar issues?
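
The overlap can be reproduced outside Solr. The following is a minimal
Python sketch, assuming a made-up list of prices (not the poster's index):
because both facet queries use inclusive square brackets, a price sitting
exactly on the shared boundary satisfies both of them.

```python
def count_inclusive(prices, low, high):
    """Count prices in the closed interval [low, high], like price:[low TO high]."""
    return sum(1 for p in prices if low <= p <= high)


# Hypothetical sample data; 160.0 sits exactly on the shared boundary.
sample_prices = [115.0, 129.99, 160.0, 175.5]

first_bucket = count_inclusive(sample_prices, 110, 160)   # 115.0, 129.99, 160.0
second_bucket = count_inclusive(sample_prices, 160, 200)  # 160.0, 175.5

# The two buckets add up to one more than the number of documents,
# because 160.0 is counted in both.
print(first_bucket, second_bucket, len(sample_prices))
```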

Thanks,

YH

Re: How to avoid double counting for facet query

Posted by Ahmet Arslan <io...@yahoo.com>.
> That's good to know. From the ticket,
> looks like the fix will be in 4.0
> then?

It is already committed. You can use trunk:
svn checkout http://svn.apache.org/repos/asf/lucene/dev/trunk
 
> Currently I can see {} and [] worked, but not combined for
> Solr 3.1. I will
> try 3.2 soon. 

On second thought, you can also simulate the same thing with a negative clause: facet.query=price:[110 TO 160] -price:160
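
As a rough illustration of the negative-clause workaround, the Python sketch
below generates one such facet.query string per pair of adjacent boundaries.
The helper name and the boundary list are hypothetical; only the query shape
comes from the message above.

```python
def half_open_via_negation(field, low, high):
    """Emulate field:[low TO high} as an inclusive range minus the upper bound."""
    return f"{field}:[{low} TO {high}] -{field}:{high}"


# Hypothetical bucket boundaries; each bucket excludes its own upper bound,
# so a boundary value is counted only in the bucket that starts at it.
boundaries = [110, 160, 200]
facet_queries = [
    half_open_via_negation("price", lo, hi)
    for lo, hi in zip(boundaries, boundaries[1:])
]

for fq in facet_queries:
    print(fq)
```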

I saw a facet-by-range example in solrconfig.xml. Maybe this will work for you?

http://wiki.apache.org/solr/SimpleFacetParameters#Facet_by_Range

    <int name="f.price.facet.range.start">0</int>
    <int name="f.price.facet.range.end">600</int>
    <int name="f.price.facet.range.gap">50</int>
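
The same range-facet settings can also be passed as request parameters
instead of solrconfig.xml defaults. The Python sketch below only builds the
URL; the host, port, and query term are taken from the original post, while
facet=true and facet.range=price are assumptions about what the request
handler does not already set by default.

```python
from urllib.parse import urlencode

# Per-field range facet parameters mirroring the solrconfig.xml snippet.
params = {
    "q": "Shakespeare",
    "rows": 0,
    "facet": "true",                    # assumption: not already a handler default
    "facet.range": "price",             # assumption: the field to range-facet on
    "f.price.facet.range.start": 0,
    "f.price.facet.range.end": 600,
    "f.price.facet.range.gap": 50,
}

url = "http://localhost:8983/solr/select/?" + urlencode(params)
print(url)
```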

Re: How to avoid double counting for facet query

Posted by Way Cool <wa...@gmail.com>.
That's good to know. From the ticket, it looks like the fix will be in 4.0,
then?

Currently I can see that {} and [] each work on their own in Solr 3.1, but not
combined. I will try 3.2 soon. Thanks.

On Tue, Jun 14, 2011 at 2:07 PM, Ahmet Arslan <io...@yahoo.com> wrote:

> > You sure Solr supports that?
> > I am getting exceptions by doing that. Ahmet, do you
> > remember where you see
> > that document? Thanks.
>
> I tested it with trunk.
> https://issues.apache.org/jira/browse/SOLR-355
> https://issues.apache.org/jira/browse/LUCENE-996
>
>

Re: How to avoid double counting for facet query

Posted by Ahmet Arslan <io...@yahoo.com>.
> You sure Solr supports that?
> I am getting exceptions by doing that. Ahmet, do you
> remember where you see
> that document? Thanks.

I tested it with trunk. 
https://issues.apache.org/jira/browse/SOLR-355
https://issues.apache.org/jira/browse/LUCENE-996


Re: How to avoid double counting for facet query

Posted by Way Cool <wa...@gmail.com>.
Are you sure Solr supports that?
I am getting exceptions when I do that. Ahmet, do you remember where you saw
that documented? Thanks.



On Tue, Jun 14, 2011 at 1:58 PM, Way Cool <wa...@gmail.com> wrote:

> Thanks! That's what I was trying to find.
>
>
> On Tue, Jun 14, 2011 at 1:48 PM, Ahmet Arslan <io...@yahoo.com> wrote:
>
>> > <int name="price:[110 TO 160]">23</int>
>> > <int name="price:[160 TO 200]">1</int>
>> > </lst>
>> > ...
>> > </result>*
>> >
>> > As you notice, the number of the results is 23, however an
>> > extra doc was
>> > found in the 160-200 range.
>> >
>> > Any way I can avoid double counting issue?
>>
>> You can use exclusive range queries which are denoted by curly brackets.
>>
>> price:[110 TO 160}
>> price:[160 TO 200}
>>
>
>

Re: How to avoid double counting for facet query

Posted by Way Cool <wa...@gmail.com>.
Thanks! That's what I was trying to find.

On Tue, Jun 14, 2011 at 1:48 PM, Ahmet Arslan <io...@yahoo.com> wrote:

> > <int name="price:[110 TO 160]">23</int>
> > <int name="price:[160 TO 200]">1</int>
> > </lst>
> > ...
> > </result>*
> >
> > As you notice, the number of the results is 23, however an
> > extra doc was
> > found in the 160-200 range.
> >
> > Any way I can avoid double counting issue?
>
> You can use exclusive range queries which are denoted by curly brackets.
>
> price:[110 TO 160}
> price:[160 TO 200}
>

Re: How to avoid double counting for facet query

Posted by Way Cool <wa...@gmail.com>.
I just checked SolrQueryParser.java in the 3.2.0 source. It looks like Yonik
Seeley's changes for LUCENE-996 <https://issues.apache.org/jira/browse/LUCENE-996>
are not in. I will check trunk later. Thanks!

On Tue, Jun 14, 2011 at 5:34 PM, Way Cool <wa...@gmail.com> wrote:

> I already checked out facet range query. By the way, I did put the
> facet.range.include as below:
> <str name="f.price.facet.range.include">lower</str>
>
> Couple things I don't like though are:
> 1. It returns the following without end values (I have to re-calculate the
> end values) :
> <lst name="counts">
> <int name="100.0">20</int>
> <int name="150.0">3</int>
> </lst>
> <float name="gap">50.0</float>
> <float name="start">0.0</float>
> <float name="end">600.0</float>
> <int name="before">0</int>
>
> 2. I can't specify custom ranges of values, for example, 1,2,3,4,5,...10,
> 15, 20, 30,40,50,60,80,90,100,200, ..., 600, 800, 900, 1000, 2000, ... etc.
>
> Thanks.
>
>
> On Tue, Jun 14, 2011 at 3:50 PM, Chris Hostetter <hossman_lucene@fucit.org> wrote:
>
>>
>> : You can use exclusive range queries which are denoted by curly brackets.
>>
>> that will solve the problem of making the fq exclude a bound, but
>> for the range facet counts you'll want to pay attention to look at
>> facet.range.include...
>>
>> http://wiki.apache.org/solr/SimpleFacetParameters#facet.range.include
>>
>>
>> -Hoss
>>
>
>

Re: How to avoid double counting for facet query

Posted by Way Cool <wa...@gmail.com>.
I already checked out the facet range query. By the way, I did set
facet.range.include as below:
<str name="f.price.facet.range.include">lower</str>

A couple of things I don't like, though:
1. It returns the following without end values (I have to recalculate the
end values):
<lst name="counts">
<int name="100.0">20</int>
<int name="150.0">3</int>
</lst>
<float name="gap">50.0</float>
<float name="start">0.0</float>
<float name="end">600.0</float>
<int name="before">0</int>

2. I can't specify custom range boundaries, for example 1, 2, 3, 4, 5, ..., 10,
15, 20, 30, 40, 50, 60, 80, 90, 100, 200, ..., 600, 800, 900, 1000, 2000, etc.

Thanks.
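
Both points can be handled on the client side. The Python sketch below is one
possible approach, not something proposed in the thread: the first helper
derives each bucket's end from its start label and the returned gap (assuming
the gap divides the start-to-end span evenly, as 50 does for 0 to 600), and
the second emits one facet.query per pair of custom boundaries. The helper
names are hypothetical, and the mixed [ } bracket syntax is reported elsewhere
in this thread to need a build that includes LUCENE-996.

```python
def buckets_with_ends(counts, gap):
    """Turn {start_label: count} from a range facet into (start, end, count) tuples."""
    return [(float(label), float(label) + gap, count) for label, count in counts.items()]


def custom_range_queries(field, boundaries):
    """One half-open facet.query clause per pair of adjacent boundaries."""
    return [f"{field}:[{lo} TO {hi}}}" for lo, hi in zip(boundaries, boundaries[1:])]


# Counts and gap copied from the response fragment quoted above.
print(buckets_with_ends({"100.0": 20, "150.0": 3}, 50.0))

# A short, hypothetical slice of the uneven boundaries described in point 2.
for clause in custom_range_queries("price", [1, 2, 5, 10, 20, 50, 100]):
    print("facet.query=" + clause)
```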

On Tue, Jun 14, 2011 at 3:50 PM, Chris Hostetter <ho...@fucit.org> wrote:

>
> : You can use exclusive range queries which are denoted by curly brackets.
>
> that will solve the problem of making the fq exclude a bound, but
> for the range facet counts you'll want to pay attention to look at
> facet.range.include...
>
> http://wiki.apache.org/solr/SimpleFacetParameters#facet.range.include
>
>
> -Hoss
>

Re: How to avoid double counting for facet query

Posted by Chris Hostetter <ho...@fucit.org>.
: You can use exclusive range queries which are denoted by curly brackets.

That will solve the problem of making the fq exclude a bound, but
for the range facet counts you'll want to pay attention to
facet.range.include...

http://wiki.apache.org/solr/SimpleFacetParameters#facet.range.include


-Hoss
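
To make the effect of including only the lower bound concrete, here is a small
Python simulation of gap-based buckets that include their lower bound and
exclude their upper bound. It is a sketch of the counting rule only, written
under the assumption that this is what facet.range.include=lower means for
each bucket; it does not call Solr, and the prices are made up.

```python
def range_counts_lower(prices, start, end, gap):
    """Count prices into [lo, lo + gap) buckets keyed by each bucket's lower bound."""
    counts = {}
    lo = start
    while lo < end:
        hi = lo + gap
        counts[lo] = sum(1 for p in prices if lo <= p < hi)
        lo = hi
    return counts


# Hypothetical prices; 150.0 lies exactly on a bucket boundary.
prices = [115.0, 150.0, 160.0, 199.99]
counts = range_counts_lower(prices, 100, 200, 50)

# Every price lands in exactly one bucket, so the counts sum to the total.
print(counts, sum(counts.values()) == len(prices))
```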

Re: How to avoid double counting for facet query

Posted by Ahmet Arslan <io...@yahoo.com>.
> <int name="price:[110 TO 160]">23</int>
> <int name="price:[160 TO 200]">1</int>
> </lst>
> ...
> </result>*
> 
> As you notice, the number of the results is 23, however an
> extra doc was
> found in the 160-200 range.
> 
> Any way I can avoid double counting issue? 

You can use exclusive range queries which are denoted by curly brackets.

price:[110 TO 160}
price:[160 TO 200}
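
For reference, a request using these half-open ranges could be assembled as in
the Python sketch below. The host, port, and search term come from the
original post; the helper name is hypothetical and facet=true is an assumption
about the handler defaults. Note that, as reported elsewhere in this thread,
mixing [ with } raised exceptions on Solr 3.1 and was tested on trunk.

```python
from urllib.parse import urlencode


def half_open(field, low, high):
    """Build a range clause that includes low and excludes high: field:[low TO high}."""
    return f"{field}:[{low} TO {high}}}"


# A list of pairs is used so that facet.query can appear more than once.
params = [
    ("q", "Shakespeare"),
    ("rows", 0),
    ("facet", "true"),  # assumption: not already a handler default
    ("facet.query", half_open("price", 110, 160)),
    ("facet.query", half_open("price", 160, 200)),
]

url = "http://localhost:8983/solr/select/?" + urlencode(params)
print(url)
```

urlencode percent-encodes the brackets and turns spaces into +, so the clauses
do not need to be escaped by hand.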