Posted to dev@lucene.apache.org by "Erik Hatcher (JIRA)" <ji...@apache.org> on 2015/07/09 17:34:06 UTC

[jira] [Commented] (SOLR-7721) When using group.facet=true there are duplicate facets returned

    [ https://issues.apache.org/jira/browse/SOLR-7721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620689#comment-14620689 ] 

Erik Hatcher commented on SOLR-7721:
------------------------------------

I've just tinkered with this myself.  Switching the field type of NormalizedAverageReview to "float" solves it.  The issue seems to be that SimpleFacets#getGroupedCounts() doesn't properly handle trie fields with a non-zero precisionStep, the way regular faceting and sorting do.
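
For reference, the workaround above might look like this in schema.xml. This is only a sketch: it assumes the "float" and "tfloat" type definitions from the stock Solr 4.x example schema (precisionStep 0 and 8 respectively), which are not shown in this issue. The field line is the reporter's, with only the type changed. Existing documents would need to be reindexed after the change.

{code:xml}
<!-- Assumed: type definitions as in the stock Solr 4.x example schema -->
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" positionIncrementGap="0"/>

<!-- Workaround: "float" instead of "tfloat", so a single term is indexed per value -->
<field name="NormalizedAverageReview" type="float" default="0.0" indexed="true" stored="false" multiValued="false"/>
{code}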

> When using group.facet=true there are duplicate facets returned 
> ----------------------------------------------------------------
>
>                 Key: SOLR-7721
>                 URL: https://issues.apache.org/jira/browse/SOLR-7721
>             Project: Solr
>          Issue Type: Bug
>          Components: Facet Module, faceting
>    Affects Versions: 4.10.4
>         Environment: JDK 1.8, Linux Ubuntu 14.4 LTS, 16 GB RAM, 2.9 GHZ CPU
>            Reporter: Ramzi Alqrainy
>
> I have the below schema, and I indexed documents with float fields
> {code:title=schema.xml|borderStyle=solid}
>  <field name="SKU" type="string" indexed="true" stored="true" multiValued="false"/>
>  <field name="ProductID" type="string" indexed="true" stored="true" multiValued="false"/>
>  <field name="NormalizedAverageReview" type="tfloat" default="0.0" indexed="true" stored="false" multiValued="false"/>
> {code}
> Afterwards I add the following documents:
> {code:xml}
> <doc>
>     <field name="SKU">P1SKU00</field>
>     <field name="ProductID">P1</field>
>     <field name="NormalizedAverageReview">0.0</field>
> </doc>
> <doc>
>     <field name="SKU">P1SKU05</field>
>     <field name="ProductID">P1</field>
>     <field name="NormalizedAverageReview">0.5</field>
> </doc>
> <doc>
>     <field name="SKU">P1SKU10</field>
>     <field name="ProductID">P1</field>
>     <field name="NormalizedAverageReview">1.0</field>
> </doc>
> <doc>
>     <field name="SKU">P1SKU15</field>
>     <field name="ProductID">P1</field>
>     <field name="NormalizedAverageReview">1.5</field>
> </doc>
> <doc>
>     <field name="SKU">P1SKU20</field>
>     <field name="ProductID">P1</field>
>     <field name="NormalizedAverageReview">2.0</field>
> </doc>
> <doc>
>     <field name="SKU">P1SKU25</field>
>     <field name="ProductID">P1</field>
>     <field name="NormalizedAverageReview">2.5</field>
> </doc>
> <doc>
>     <field name="SKU">P1SKU30</field>
>     <field name="ProductID">P1</field>
>     <field name="NormalizedAverageReview">3.0</field>
> </doc>
> <doc>
>     <field name="SKU">P1SKU35</field>
>     <field name="ProductID">P1</field>
>     <field name="NormalizedAverageReview">3.5</field>
> </doc>
> <doc>
>     <field name="SKU">P2SKU05</field>
>     <field name="ProductID">P2</field>
>     <field name="NormalizedAverageReview">0.5</field>
> </doc>
> <doc>
>     <field name="SKU">P2SKU15</field>
>     <field name="ProductID">P2</field>
>     <field name="NormalizedAverageReview">1.5</field>
> </doc>
> <doc>
>     <field name="SKU">P2SKU25</field>
>     <field name="ProductID">P2</field>
>     <field name="NormalizedAverageReview">2.5</field>
> </doc>
> <doc>
>     <field name="SKU">P2SKU35</field>
>     <field name="ProductID">P2</field>
>     <field name="NormalizedAverageReview">3.5</field>
> </doc>
> <doc>
>     <field name="SKU">P3SKU00</field>
>     <field name="ProductID">P3</field>
>     <field name="NormalizedAverageReview">0.0</field>
> </doc>
> <doc>
>     <field name="SKU">P3SKU10</field>
>     <field name="ProductID">P3</field>
>     <field name="NormalizedAverageReview">1.0</field>
> </doc>
> <doc>
>     <field name="SKU">P3SKU20</field>
>     <field name="ProductID">P3</field>
>     <field name="NormalizedAverageReview">2.0</field>
> </doc>
> <doc>
>     <field name="SKU">P3SKU30</field>
>     <field name="ProductID">P3</field>
>     <field name="NormalizedAverageReview">3.0</field>
> </doc>
> <doc>
>     <field name="SKU">P4SKU45</field>
>     <field name="ProductID">P4</field>
>     <field name="NormalizedAverageReview">4.5</field>
> </doc>
> <doc>
>     <field name="SKU">P4SKU50</field>
>     <field name="ProductID">P4</field>
>     <field name="NormalizedAverageReview">5.0</field>
> </doc>
> {code}
> After performing the following query
> http://localhost:8983/solr/collection1/select?q=ProductID:P1&wt=json&indent=true&facet=true&facet.field=NormalizedAverageReview&group=true&group.field=ProductID&group.ngroups=true&group.facet=true&group.limit=-1&indent=true
> There are duplicate facets returned
> {code:title=Response|borderStyle=solid}
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":41,
>     "params":{
>       "q":"ProductID:P1",
>       "facet.field":"NormalizedAverageReview",
>       "indent":["true",
>         "true"],
>       "group.limit":"-1",
>       "group.facet":"true",
>       "group.ngroups":"true",
>       "wt":"json",
>       "facet":"true",
>       "group.field":"ProductID",
>       "group":"true"}},
>   "grouped":{
>     "ProductID":{
>       "matches":8,
>       "ngroups":1,
>       "groups":[{
>           "groupValue":"P1",
>           "doclist":{"numFound":8,"start":0,"docs":[
>               {
>                 "SKU":"P1SKU00",
>                 "ProductID":"P1",
>                 "_version_":1504885971465797632,
>                 "tx_CategoryEnabled":false},
>               {
>                 "SKU":"P1SKU05",
>                 "ProductID":"P1",
>                 "_version_":1504885971547586560,
>                 "tx_CategoryEnabled":false},
>               {
>                 "SKU":"P1SKU10",
>                 "ProductID":"P1",
>                 "_version_":1504885971549683712,
>                 "tx_CategoryEnabled":false},
>               {
>                 "SKU":"P1SKU15",
>                 "ProductID":"P1",
>                 "_version_":1504885971551780864,
>                 "tx_CategoryEnabled":false},
>               {
>                 "SKU":"P1SKU20",
>                 "ProductID":"P1",
>                 "_version_":1504885971552829440,
>                 "tx_CategoryEnabled":false},
>               {
>                 "SKU":"P1SKU25",
>                 "ProductID":"P1",
>                 "_version_":1504885971554926592,
>                 "tx_CategoryEnabled":false},
>               {
>                 "SKU":"P1SKU30",
>                 "ProductID":"P1",
>                 "_version_":1504885971555975168,
>                 "tx_CategoryEnabled":false},
>               {
>                 "SKU":"P1SKU35",
>                 "ProductID":"P1",
>                 "_version_":1504885971557023744,
>                 "tx_CategoryEnabled":false}]
>           }}]}},
>   "facet_counts":{
>     "facet_queries":{},
>     "facet_fields":{
>       "NormalizedAverageReview":[
>         "0.0",1,
>         "0.5",1,
>         "1.0",1,
>         "1.5",1,
>         "2.0",1,
>         "2.5",1,
>         "3.0",1,
>         "3.5",1,
>         "0.0",0,
>         "0.5",0,
>         "1.0",0,
>         "1.5",0,
>         "2.0",0,
>         "2.5",0,
>         "3.0",0,
>         "3.5",0,
>         "4.5",0,
>         "5.0",0,
>         "4.5",0]},
>     "facet_dates":{},
>     "facet_ranges":{},
>     "facet_intervals":{}}}
> {code}
> If you look at the facet results, some values are repeated. For example, 0.5 appears twice with counts 1 and 0, and 4.5 appears twice with count 0. We would expect only one entry per value, so duplicate entries with different counts seem incorrect.
> Please advise.


