Posted to solr-user@lucene.apache.org by solr-user <so...@hotmail.com> on 2010/02/15 21:07:21 UTC

getting unexpected statscomponent values

Has anyone encountered the following issue?

I wanted to understand the StatsComponent better, so I set up a simple test
index with a few thousand docs.  In my schema I have:
-	an indexed multivalued sint field (StatsFacetField) that can contain values
0 through 5 and that I want to use as my stats.facet field.
-	an indexed single-valued sint field (ValueOfOneField) that always
contains the value 1 and that I want stats on for this test

When I execute the following query:

http://localhost:8080/solr/select?q=*:*&stats=true&stats.field=ValueOfOneField&stats.facet=StatsFacetField&rows=0&facet=on&facet.limit=10&facet.field=StatsFacetField

For this situation (*:*) I was expecting the StatsComponent count/sum
values for each possible value in StatsFacetField to match the facet counts
for StatsFacetField.  They don’t.  Some are close (e.g. 204 vs 214) while
others are way off (e.g. 230 vs 8000).

Shouldn’t the values match up?  If not, why?

I am using a recent copy of Solr 1.5.0-dev ($Id: CHANGES.txt 906924
2010-02-05 12:43:11Z noble $)
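
The expectation in the post above can be illustrated with a toy model. This is only a sketch of the reasoning, not Solr's implementation: the documents are made up, and the two helper functions are hypothetical names introduced for illustration. If every document has ValueOfOneField = 1, then grouping by each StatsFacetField value should give a count (and sum) equal to the number of documents that contain that value, which is what a facet count on the same field reports.

```python
from collections import Counter

# Hypothetical toy documents; field names follow the post, values are made up.
docs = [
    {"StatsFacetField": [0, 1, 2], "ValueOfOneField": 1},
    {"StatsFacetField": [1, 2], "ValueOfOneField": 1},
    {"StatsFacetField": [2, 3, 5], "ValueOfOneField": 1},
    {"StatsFacetField": [4], "ValueOfOneField": 1},
]


def facet_counts(docs):
    """Number of documents that contain each StatsFacetField value."""
    counts = Counter()
    for doc in docs:
        for value in set(doc["StatsFacetField"]):
            counts[value] += 1
    return dict(counts)


def stats_by_facet(docs):
    """Count and sum of ValueOfOneField, grouped by StatsFacetField value."""
    stats = {}
    for doc in docs:
        for value in set(doc["StatsFacetField"]):
            entry = stats.setdefault(value, {"count": 0, "sum": 0.0})
            entry["count"] += 1
            entry["sum"] += doc["ValueOfOneField"]
    return stats


facets = facet_counts(docs)
stats = stats_by_facet(docs)
for value in sorted(facets):
    # Facet count, stats count and stats sum line up in this model.
    print(value, facets[value], stats[value]["count"], stats[value]["sum"])
```

In this model a multivalued document contributes once to every value it holds, so the two sets of numbers agree for a match-all query.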
-- 
View this message in context: http://old.nabble.com/getting-unexpected-statscomponent-values-tp27599248p27599248.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: getting unexpected statscomponent values

Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
solr-user wrote:
> Hossman, what do you mean by including a "TestCase"?  
>
> Will create issue in Jira asap; I will include the URL, schema and some code
> to generate sample data
>   
I think those (the URL, schema and sample-data code) would make a good TestCase.

Koji

-- 
http://www.rondhuit.com/en/


Re: getting unexpected statscomponent values

Posted by gdeconto <ge...@topproducer.com>.

Erick Erickson wrote:
> 
> It's especially helpful if you can take a bit of time to pare away
> all the unnecessary stuff in your example files and/or comment
> what you think the important bits are.....
> 

Entered as SOLR-1782 in Jira.


Re: getting unexpected statscomponent values

Posted by Erick Erickson <er...@gmail.com>.
SOLR makes heavy use of JUnit for testing. The real advantage
of a JUnit testcase being attached is that it can then be
permanently incorporated into the SOLR builds. If you're
unfamiliar with JUnit, then providing the raw data that illustrates
the bug allows people who work on SOLR to save a bunch
of time trying to reproduce the problem. It also ensures that
they are addressing what you're seeing <G>...

It's especially helpful if you can take a bit of time to pare away
all the unnecessary stuff in your example files and/or comment
what you think the important bits are.....

HTH
Erick

On Wed, Feb 17, 2010 at 5:46 PM, solr-user <so...@hotmail.com> wrote:

>
>
> hossman wrote:
> >
> >
> > That does look really weird, and definitely seems like a bug.
> >
> > Can you open an issue in Jira? ... ideally with a TestCase (even if it's
> > not a JUnit test case, just having some sample docs that can be indexed
> > > against the example schema and a URL showing the problem would be helpful)
> >
> >
>
> Hossman, what do you mean by including a "TestCase"?
>
> Will create issue in Jira asap; I will include the URL, schema and some code
> to generate sample data

Re: getting unexpected statscomponent values

Posted by solr-user <so...@hotmail.com>.

hossman wrote:
> 
> 
> That does look really weird, and definitely seems like a bug.
> 
> Can you open an issue in Jira? ... ideally with a TestCase (even if it's 
> not a JUnit test case, just having some sample docs that can be indexed 
> against the example schema and a URL showing the problem would be helpful)
> 
> 

Hossman, what do you mean by including a "TestCase"?  

Will create issue in Jira asap; I will include the URL, schema and some code
to generate sample data


Re: getting unexpected statscomponent values

Posted by Chris Hostetter <ho...@fucit.org>.
: Sure.  This is what I get.

That does look really weird, and definitely seems like a bug.

Can you open an issue in Jira? ... ideally with a TestCase (even if it's 
not a JUnit test case, just having some sample docs that can be indexed 
against the example schema and a URL showing the problem would be helpful)
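
For anyone wanting to produce such sample docs, here is a minimal sketch that writes a Solr XML update message (`<add><doc><field name="...">`). Several things are assumptions rather than details from this thread: the `id` unique-key field, the helper name `build_add_xml`, the choice of one to three facet values per document, and the seeded random generator. The field names StatsFacetField and ValueOfOneField are taken from the original post and would need matching definitions in the schema.

```python
import random
import xml.etree.ElementTree as ET


def build_add_xml(num_docs, seed=0):
    """Build a Solr <add> message with num_docs randomly generated docs."""
    rng = random.Random(seed)  # seeded so the sample data is reproducible
    add = ET.Element("add")
    for i in range(num_docs):
        doc = ET.SubElement(add, "doc")
        # Assumed unique key field; adjust to the schema in use.
        ET.SubElement(doc, "field", name="id").text = str(i)
        # Multivalued field: one to three distinct values from 0..5.
        for value in sorted(rng.sample(range(6), rng.randint(1, 3))):
            ET.SubElement(doc, "field", name="StatsFacetField").text = str(value)
        # Single-valued field that always holds 1.
        ET.SubElement(doc, "field", name="ValueOfOneField").text = "1"
    return ET.tostring(add, encoding="unicode")


xml_payload = build_add_xml(5)
print(xml_payload)
```

The resulting text can be posted to the update handler and followed by a commit, after which the query from the original post can be run against it.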





-Hoss


Re: getting unexpected statscomponent values

Posted by solr-user <so...@hotmail.com>.

Grant Ingersoll-6 wrote:
> 
> Can you share the full output from the StatsComponent? 
> 

Sure.  This is what I get.

<?xml version="1.0" encoding="UTF-8" ?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">62</int>
    <lst name="params">
      <str name="facet">on</str>
      <str name="q">*:*</str>
      <str name="stats">true</str>
      <str name="stats.field">ValueOfOne</str>
      <str name="facet.limit">10</str>
      <str name="stats.facet">StatsFacetField</str>
      <str name="facet.field">StatsFacetField</str>
      <str name="rows">0</str>
    </lst>
  </lst>
  <result name="response" numFound="8627" start="0" />
  <lst name="facet_counts">
    <lst name="facet_queries" />
    <lst name="facet_fields">
      <lst name="StatsFacetField">
        <int name="0">1619</int>
        <int name="1">7433</int>
        <int name="2">7777</int>
        <int name="3">3984</int>
        <int name="4">233</int>
        <int name="5">41</int>
      </lst>
    </lst>
    <lst name="facet_dates" />
  </lst>
  <lst name="stats">
    <lst name="stats_fields">
      <lst name="ValueOfOne">
        <double name="min">1.0</double>
        <double name="max">1.0</double>
        <double name="sum">8627.0</double>
        <long name="count">8627</long>
        <long name="missing">0</long>
        <double name="sumOfSquares">8627.0</double>
        <double name="mean">1.0</double>
        <double name="stddev">0.0</double>
        <lst name="facets">
          <lst name="StatsFacetField">
            <lst name="3">
              <double name="min">1.0</double>
              <double name="max">1.0</double>
              <double name="sum">3758.0</double>
              <long name="count">3758</long>
              <long name="missing">0</long>
              <double name="sumOfSquares">3758.0</double>
              <double name="mean">1.0</double>
              <double name="stddev">0.0</double>
            </lst>
            <lst name="2">
              <double name="min">1.0</double>
              <double name="max">1.0</double>
              <double name="sum">3915.0</double>
              <long name="count">3915</long>
              <long name="missing">0</long>
              <double name="sumOfSquares">3915.0</double>
              <double name="mean">1.0</double>
              <double name="stddev">0.0</double>
            </lst>
            <lst name="1">
              <double name="min">1.0</double>
              <double name="max">1.0</double>
              <double name="sum">265.0</double>
              <long name="count">265</long>
              <long name="missing">0</long>
              <double name="sumOfSquares">265.0</double>
              <double name="mean">1.0</double>
              <double name="stddev">0.0</double>
            </lst>
            <lst name="0">
              <double name="min">1.0</double>
              <double name="max">1.0</double>
              <double name="sum">37.0</double>
              <long name="count">37</long>
              <long name="missing">0</long>
              <double name="sumOfSquares">37.0</double>
              <double name="mean">1.0</double>
              <double name="stddev">0.0</double>
            </lst>
            <lst name="5">
              <double name="min">1.0</double>
              <double name="max">1.0</double>
              <double name="sum">41.0</double>
              <long name="count">41</long>
              <long name="missing">0</long>
              <double name="sumOfSquares">41.0</double>
              <double name="mean">1.0</double>
              <double name="stddev">0.0</double>
            </lst>
            <lst name="4">
              <double name="min">1.0</double>
              <double name="max">1.0</double>
              <double name="sum">201.0</double>
              <long name="count">201</long>
              <long name="missing">0</long>
              <double name="sumOfSquares">201.0</double>
              <double name="mean">1.0</double>
              <double name="stddev">0.0</double>
            </lst>
          </lst>
        </lst>
      </lst>
    </lst>
  </lst>
</response>
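
To make the mismatch in this response easier to see, the following sketch parses a trimmed copy of it and lists the facet count next to the StatsComponent count for each StatsFacetField value. The trimming (only the `facet_fields` entries and the per-value `count` entries are kept) and the helper name `compare_counts` are my own choices for illustration; the numbers are copied from the response above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response: facet counts plus the per-value stats counts.
RESPONSE = """<response>
  <lst name="facet_counts">
    <lst name="facet_fields">
      <lst name="StatsFacetField">
        <int name="0">1619</int>
        <int name="1">7433</int>
        <int name="2">7777</int>
        <int name="3">3984</int>
        <int name="4">233</int>
        <int name="5">41</int>
      </lst>
    </lst>
  </lst>
  <lst name="stats">
    <lst name="stats_fields">
      <lst name="ValueOfOne">
        <lst name="facets">
          <lst name="StatsFacetField">
            <lst name="3"><long name="count">3758</long></lst>
            <lst name="2"><long name="count">3915</long></lst>
            <lst name="1"><long name="count">265</long></lst>
            <lst name="0"><long name="count">37</long></lst>
            <lst name="5"><long name="count">41</long></lst>
            <lst name="4"><long name="count">201</long></lst>
          </lst>
        </lst>
      </lst>
    </lst>
  </lst>
</response>"""


def compare_counts(xml_text, stats_field, facet_field):
    """Map each facet value to (facet count, stats count)."""
    root = ET.fromstring(xml_text)
    facet_path = (
        ".//lst[@name='facet_counts']/lst[@name='facet_fields']"
        f"/lst[@name='{facet_field}']/int"
    )
    facet = {e.get("name"): int(e.text) for e in root.findall(facet_path)}
    stats_path = (
        ".//lst[@name='stats']/lst[@name='stats_fields']"
        f"/lst[@name='{stats_field}']/lst[@name='facets']"
        f"/lst[@name='{facet_field}']/lst"
    )
    stats = {
        e.get("name"): int(e.find("long[@name='count']").text)
        for e in root.findall(stats_path)
    }
    return {value: (facet[value], stats.get(value)) for value in sorted(facet)}


comparison = compare_counts(RESPONSE, "ValueOfOne", "StatsFacetField")
for value, (facet_count, stats_count) in comparison.items():
    print(value, facet_count, stats_count, facet_count - stats_count)
```

Only the value 5 agrees (41 in both places); every other value has a lower StatsComponent count than facet count.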


Re: getting unexpected statscomponent values

Posted by Grant Ingersoll <gs...@apache.org>.
Can you share the full output from the StatsComponent? 
On Feb 15, 2010, at 3:07 PM, solr-user wrote:

> 
> Has anyone encountered the following issue?
> 
> I wanted to understand the StatsComponent better, so I set up a simple test
> index with a few thousand docs.  In my schema I have:
> -	an indexed multivalued sint field (StatsFacetField) that can contain values
> 0 through 5 and that I want to use as my stats.facet field.
> -	an indexed single-valued sint field (ValueOfOneField) that always
> contains the value 1 and that I want stats on for this test
> 
> When I execute the following query:
> 
> http://localhost:8080/solr/select?q=*:*&stats=true&stats.field=ValueOfOneField&stats.facet=StatsFacetField&rows=0&facet=on&facet.limit=10&facet.field=StatsFacetField
> 
> For this situation (*:*) I was expecting the StatsComponent count/sum
> values for each possible value in StatsFacetField to match the facet counts
> for StatsFacetField.  They don’t.  Some are close (e.g. 204 vs 214) while
> others are way off (e.g. 230 vs 8000).
> 
> Shouldn’t the values match up?  If not, why?
> 
> I am using a recent copy of 1.5.0-dev solr ($Id: CHANGES.txt 906924
> 2010-02-05 12:43:11Z noble $)

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search