Posted to dev@lucene.apache.org by "Vamsi Krishna D (JIRA)" <ji...@apache.org> on 2015/04/23 12:06:38 UTC

[jira] [Updated] (SOLR-7452) json facet api returning inconsistent counts

     [ https://issues.apache.org/jira/browse/SOLR-7452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vamsi Krishna D updated SOLR-7452:
----------------------------------
    Description: 
While using the newly added JSON terms facet API (http://yonik.com/json-facet-api/#TermsFacet), I am seeing inconsistent counts for faceted values (note that I am running Solr in cloud mode). For example, consider documents with the fields txns_id (the unique key), consumer_number and amount. With about 10 million such records, suppose I query for

q=*:*&rows=0&
json.facet={
  biskatoo:{
    type : terms,
    field : consumer_number,
    limit : 20,
    sort : {y:desc},
    numBuckets : true,
    facet:{
      y : "sum(amount)"
    }
  }
}


The results are as follows (some are omitted):

"facets":{
    "count":6641277,
    "biskatoo":{
      "numBuckets":3112708,
      "buckets":[{
          "val":"surya",
          "count":4,
          "y":2.264506},
        {
          "val":"raghu",
          "COUNT":3,   // capitalised for recognition
          "y":1.8},
        {
          "val":"malli",
          "count":4,
          "y":1.78}]}}}

But if I restrict the query to

q=consumer_number:raghu&rows=0&
json.facet={
  biskatoo:{
    type : terms,
    field : consumer_number,
    limit : 20,
    sort : {y:desc},
    numBuckets : true,
    facet:{
      y : "sum(amount)"
    }
  }
}


I get:

  "facets":{
    "count":4,
    "biskatoo":{
      "numBuckets":1,
      "buckets":[{
          "val":"raghu",
          "COUNT":4,
          "y":2429708.24}]}}}

The count for "raghu" is inconsistent: 3 in the first response but 4 in the second (and I have found many other such inconsistencies).
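
The comparison between the two responses can also be done mechanically. The following is only a minimal Python sketch, assuming both responses have already been parsed into dictionaries and that every bucket uses the lowercase "count" key (it is capitalised in the report only to draw attention). The helper names bucket_counts and find_count_mismatches are hypothetical and are not part of Solr.

```python
def bucket_counts(response, facet_name):
    """Map each bucket value to its count for one terms facet."""
    buckets = response["facets"][facet_name]["buckets"]
    return {bucket["val"]: bucket["count"] for bucket in buckets}


def find_count_mismatches(broad, narrow, facet_name):
    """Return {val: (broad_count, narrow_count)} where the counts differ.

    Only values present in both responses are compared.
    """
    broad_counts = bucket_counts(broad, facet_name)
    narrow_counts = bucket_counts(narrow, facet_name)
    return {
        val: (broad_counts[val], narrow_counts[val])
        for val in broad_counts.keys() & narrow_counts.keys()
        if broad_counts[val] != narrow_counts[val]
    }


# Data copied from the two responses in the report ("COUNT" lowercased).
all_docs = {"facets": {"count": 6641277, "biskatoo": {
    "numBuckets": 3112708,
    "buckets": [
        {"val": "surya", "count": 4, "y": 2.264506},
        {"val": "raghu", "count": 3, "y": 1.8},
        {"val": "malli", "count": 4, "y": 1.78},
    ]}}}
raghu_only = {"facets": {"count": 4, "biskatoo": {
    "numBuckets": 1,
    "buckets": [{"val": "raghu", "count": 4, "y": 2429708.24}],
}}}

mismatches = find_count_mismatches(all_docs, raghu_only, "biskatoo")
print(mismatches)  # {'raghu': (3, 4)}
```

Running the same check over every value in the top buckets, with one restricted query per value, would show how widespread the mismatches are.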

I have tried the patch from https://issues.apache.org/jira/browse/SOLR-7412, but the issue still does not seem to be resolved.


> json facet api returning inconsistent counts
> --------------------------------------------
>
>                 Key: SOLR-7452
>                 URL: https://issues.apache.org/jira/browse/SOLR-7452
>             Project: Solr
>          Issue Type: Bug
>          Components: faceting
>    Affects Versions: 5.1
>            Reporter: Vamsi Krishna D
>              Labels: count, facet, sort
>             Fix For: 5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
