Posted to solr-user@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2010/08/02 23:22:40 UTC

Re: StatsComponent and sint?

: With an sint, it seems to have trouble if there are any documents with 
: null values for the field. It appears to decide that a null/empty/blank 
: value is -1325166535, and is thus the minimum value.

1) there is really no such thing as a "null" value for a field ... there 
are documents that have no value for that field -- but that's different 
from actually indexing a null value (Solr is not an RDBMS)

I attempted to reproduce the problem you are describing by changing the 
solr 1.4.1 schema.xml so that the "popularity" field used type "sint" and 
then indexed all of the sample documents.  Exactly one of those documents 
has no value for the "popularity" field (id:UTF8TEST) and these are the 
results that I got from the following request...

http://localhost:8983/solr/select/?wt=json&q=*%3A*%0D%0A&version=2.2&start=0&rows=00&indent=on&stats=true&stats.field=popularity
{
 "responseHeader":{
  "status":0,
  "QTime":1,
  "params":{
	"indent":"on",
	"start":"0",
	"q":"*:*\r\n",
	"stats":"true",
	"stats.field":"popularity",
	"wt":"json",
	"version":"2.2",
	"rows":"00"}},
 "response":{"numFound":19,"start":0,"docs":[]
 },
 "stats":{
  "stats_fields":{
	"popularity":{
	 "min":0.0,
	 "max":10.0,
	 "sum":102.0,
	 "count":18,
	 "missing":1,
	 "sumOfSquares":702.0,
	 "mean":5.666666666666667,
	 "stddev":2.700762419587999}}}}

As you can see, it correctly recognized that the "min" value was 0.0, and 
that 1 of the 19 total docs had no value for that field.
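
For anyone who wants to sanity check numbers like these, here is a minimal 
Python sketch that recomputes "mean" and "stddev" from the "sum", 
"sumOfSquares" and "count" values in the response above.  The function name 
is made up for illustration, and the sample (n-1) standard deviation 
formula is an assumption on my part; it happens to reproduce the value 
shown, which is the only evidence offered for it here.

```python
import math

# Values copied from the "popularity" stats in the response above.
stats = {"sum": 102.0, "count": 18, "missing": 1, "sumOfSquares": 702.0}
num_found = 19  # "numFound" from the same response


def mean_and_stddev(total, sum_of_squares, count):
    """Recompute mean and sample (n-1) standard deviation from running sums.

    Hypothetical helper for illustration; not part of Solr.
    """
    mean = total / count
    variance = (count * sum_of_squares - total * total) / (count * (count - 1))
    return mean, math.sqrt(variance)


mean, stddev = mean_and_stddev(stats["sum"], stats["sumOfSquares"], stats["count"])

# Docs with a value plus docs without one should cover the whole result set.
accounted_for = stats["count"] + stats["missing"]

print(mean, stddev, accounted_for == num_found)
```

If "count" plus "missing" does not add up to "numFound", or the recomputed 
values disagree with what Solr reports, that is worth including when you 
describe the problem.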


If you can't reproduce these types of results with your own data, then we 
need to see a lot more details about your specific situation (schema.xml, 
raw data, query urls, results, etc...) to try and understand what you are 
seeing.
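
When collecting the query urls, something like the following Python sketch 
may help build them consistently.  The helper name is hypothetical, and the 
base URL is just the local example instance used above; adjust both for 
your own setup.  It uses only the request parameters that appear in the 
example request (q, rows, wt, indent, stats, stats.field), minus the stray 
trailing newline in "q".

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def stats_url(base, field, query="*:*"):
    """Build a stats request URL for one field (hypothetical helper)."""
    params = {
        "q": query,
        "rows": 0,           # only the stats are of interest, not the docs
        "wt": "json",
        "indent": "on",
        "stats": "true",
        "stats.field": field,
    }
    return base + "?" + urlencode(params)


# Assumed base URL: the local example instance from the request above.
url = stats_url("http://localhost:8983/solr/select/", "popularity")
print(url)

# Decode the query string again to confirm what will actually be sent.
sent = parse_qs(urlsplit(url).query)
print(sent["q"], sent["stats.field"])
```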


-Hoss


Re: StatsComponent and sint?

Posted by Jonathan Rochkind <ro...@jhu.edu>.
Thanks Hoss, the problem was transient. I believe that my index had 
become corrupted: I changed the schema but hadn't fully deleted all 
documents that had been indexed under the previous version of the 
schema. My fault.