Posted to solr-dev@lucene.apache.org by "Jayson Minard (JIRA)" <ji...@apache.org> on 2009/02/20 18:07:01 UTC

[jira] Created: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Facet counts are not correct (or total document count is not correct as they do not match) on some searches
-----------------------------------------------------------------------------------------------------------

                 Key: SOLR-1030
                 URL: https://issues.apache.org/jira/browse/SOLR-1030
             Project: Solr
          Issue Type: Bug
          Components: search
    Affects Versions: 1.4
            Reporter: Jayson Minard


There isn't much detailed evidence for this one yet, but hopefully it rings a bell with someone who made changes in this area recently...

Since updating to the tip from our previous build (the tip from around Jan 9, 2009), we are now seeing facet counts that no longer match the total document count.  This is through distributed search; I have not verified whether it happens only on distributed search or also on single-shard search, so it could be on both.

For example, on a single valued field with one facet value set as a fq filter, combined with a text search on a simple term "science", the following is the facet count:

8,294,284

And the total document count for the same results is:

8,294,274

Some debug info (not sure why the filter query is replicated more than once, but that shouldn't be harmful):

{code}
querystring	  	(science)
QParser	  	OldLuceneQParser
filter_queries	  	[sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
sys_content_type:("Journal Article"), sys_content_type:("Journal Article")]
rawquerystring	  	(science)
{code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Jayson Minard (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675415#action_12675415 ] 

Jayson Minard commented on SOLR-1030:
-------------------------------------

The data is static, no updates.



[jira] Commented: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Jayson Minard (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675406#action_12675406 ] 

Jayson Minard commented on SOLR-1030:
-------------------------------------

Possible commits that could impact this area of Solr:

Rev 738606 - Upgrade to Lucene 2.9-dev r738345 by Yonik
Rev 738950 - Upgrade to Lucene 2.9-dev r738622 by Yonik
Rev 739305 - Revert to r738218 of Lucene by Yonik

Rev 740319 - Change SolrIndexSearcher to use insertWithOverflow by Yonik

Rev 741710 - Addition of timeouts for distributed searching by Shalin (maybe we time out on some shards?)

Rev 742988 - Make QueryComponent Optional by gsingers

Rev 743196 - Current lucene libs, SolrIndexReader introduction for FileFloatSource fix by Yonik

So possibly a Lucene bug?





[jira] Commented: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675412#action_12675412 ] 

Shalin Shekhar Mangar commented on SOLR-1030:
---------------------------------------------

bq.maybe we time out on some shards?

A timeout will log an exception. Might be worth combing through the log files to be sure.
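
One way to comb the logs is a small script along these lines. It is only a sketch: the thread does not say what a shard timeout looks like in the Solr log, so the search pattern and the function name `suspect_lines` are assumptions to adapt to the actual log output.

```python
import re
import sys

# Assumed pattern: the exact wording of a logged shard timeout is not given
# in this thread, so this simply flags any line that mentions an exception
# or a timeout ("timeout", "time out", "timed out"), ignoring case.
SUSPECT = re.compile(r"exception|timed? ?out", re.IGNORECASE)


def suspect_lines(lines):
    """Return (line_number, text) pairs for lines matching SUSPECT."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if SUSPECT.search(line)
    ]


if __name__ == "__main__":
    # Usage: python scan_logs.py shard1.log shard2.log ...
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, text in suspect_lines(handle):
                print(f"{path}:{number}: {text}")
```

Running it over the log of every shard, not just the node that received the request, is the point: a timeout on a single shard would be enough to change the merged counts.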



[jira] Commented: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Jayson Minard (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675405#action_12675405 ] 

Jayson Minard commented on SOLR-1030:
-------------------------------------

The last known working revision should be 733,656 as that includes patches we passed through Shalin and counts matched at that time.



[jira] Commented: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Jayson Minard (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675466#action_12675466 ] 

Jayson Minard commented on SOLR-1030:
-------------------------------------

Yeah, we had assumed that the source data did not have duplicate docs across shards, but it turns out that it does.  Otherwise we would have checked that first.  I'll keep an eye on this one for a bit, but it is most likely just the duplicate doc issue.
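
For anyone wanting to run the same check, a minimal sketch follows. It assumes the unique-key values of each shard have already been exported into plain Python collections; the function name `ids_on_multiple_shards` and the shard names are hypothetical, and how the ids are exported is left open.

```python
from collections import defaultdict


def ids_on_multiple_shards(shard_ids):
    """Find document ids that are present on more than one shard.

    shard_ids: dict mapping a shard name to an iterable of the unique-key
    values stored on that shard (assumed to be exported beforehand).
    Returns a dict mapping each duplicated id to the sorted list of shards
    that hold a copy of it.
    """
    holders = defaultdict(set)
    for shard, ids in shard_ids.items():
        for doc_id in ids:
            holders[doc_id].add(shard)
    return {
        doc_id: sorted(shards)
        for doc_id, shards in holders.items()
        if len(shards) > 1
    }
```

An empty result means no id is shared between shards; any entry names a document that each of the listed shards will count separately.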

> Facet counts are not correct (or total document count is not correct as they do not match) on some searches
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-1030
>                 URL: https://issues.apache.org/jira/browse/SOLR-1030
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 1.4
>            Reporter: Jayson Minard
>
> -There isn't much detailed evidence for this one yet, but hopefully it rings a bell with someone who made changes in this area recently...-
> -Since updating to the tip from our previous use of the tip from around Jan 9, 2009- (seems to be previous to r733656 as well) we are now seeing facet counts no longer match total document count.  This is through distributed search and I have not verified that it only happens on distributed vs. single shard search so it could be on both.
> For example, on a single valued field with one facet value set as a fq filter, combined with a text search on a simple term "science", the following is the facet count:
> 8,294,284
> And the total document count for the same results is:
> 8,294,274
> some debug info (not sure why the filter query is replicated more than once, but that shouldn't be harmful):
> {code}
> querystring	  	(science)
> QParser	  	OldLuceneQParser
> filter_queries	  	[sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
> sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
> sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
> sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
> sys_content_type:("Journal Article"), sys_content_type:("Journal Article")]
> rawquerystring	  	(science)
> {code}



[jira] Resolved: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley resolved SOLR-1030.
--------------------------------

    Resolution: Cannot Reproduce

Closing... assuming that this is a duplicate docs on shards issue since we haven't seen it elsewhere.



[jira] Commented: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675447#action_12675447 ] 

Yonik Seeley commented on SOLR-1030:
------------------------------------

Yes, duplicate docs on shards would indeed cause facet counts to be too high.
Duplicate docs is an error condition that we can handle relatively gracefully, but not without some inconsistencies.
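
The effect can be illustrated with a toy model. This is not Solr's merge code, and the thread does not describe how the total document count is computed; the sketch only assumes that each shard counts its own matching documents and that the per-shard facet counts are added together. The function name `merged_counts` and the sample data are made up for illustration.

```python
def merged_counts(shard_docs, facet_value):
    """Compare a summed facet count with the number of distinct documents.

    shard_docs: list with one dict per shard, mapping document id to the
    value of a single-valued facet field.
    Returns (summed_facet_count, distinct_document_count) for facet_value.
    """
    summed = 0
    distinct = set()
    for docs in shard_docs:
        matching = [doc_id for doc_id, value in docs.items() if value == facet_value]
        summed += len(matching)    # every shard counts its own copy
        distinct.update(matching)  # a second copy of an id adds nothing here
    return summed, len(distinct)


# Document "d2" lives on both shards, so it is counted twice in the sum.
shards = [
    {"d1": "Journal Article", "d2": "Journal Article"},
    {"d2": "Journal Article", "d3": "Book"},
]
print(merged_counts(shards, "Journal Article"))  # (3, 2)
```

In this model every duplicated document inflates the summed facet count by one per extra copy, which is the direction of the mismatch reported in this issue.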



[jira] Commented: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Jayson Minard (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675418#action_12675418 ] 

Jayson Minard commented on SOLR-1030:
-------------------------------------

We are reverting to r733656 to test there again and see if the problem is new since that revision or if we just now noticed it.  There is a bug in our database showing a count problem from last month but it appears to be related to duplicate docs on different shards rather than this issue.



[jira] Updated: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Jayson Minard (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jayson Minard updated SOLR-1030:
--------------------------------

    Comment: was deleted



[jira] Issue Comment Edited: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Jayson Minard (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675416#action_12675416 ] 

jminard edited comment on SOLR-1030 at 2/20/09 10:16 AM:
---------------------------------------------------------------

And the data hasn't changed since the last version of Solr was used (roughly r733656) and either no one noticed it before, -or the problem didn't previously exist-.  Checking now for timeouts in the logs.

      was (Author: jminard):
    And the data hasn't changed since the last version of Solr was used (roughly r733656) and either no one noticed it before, or the problem didn't previously exist.  Checking now for timeouts in the logs.
  


[jira] Issue Comment Edited: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Jayson Minard (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675405#action_12675405 ] 

jminard edited comment on SOLR-1030 at 2/20/09 10:15 AM:
---------------------------------------------------------------

-The last known working revision should be 733,656 as that includes patches we passed through Shalin and counts matched at that time.-  Appears to be previous to this revision.

      was (Author: jminard):
    The last known working revision should be 733,656 as that includes patches we passed through Shalin and counts matched at that time.
  


[jira] Commented: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Jayson Minard (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675421#action_12675421 ] 

Jayson Minard commented on SOLR-1030:
-------------------------------------

I just found a bug in our database from the 27th showing this issue dates back to r733656 as well and before.  So it is an older issue.

We end up with cases where facets show counts such as 264 where the total results are 249.




[jira] Updated: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Jayson Minard (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jayson Minard updated SOLR-1030:
--------------------------------

    Description: 
-There isn't much detailed evidence for this one yet, but hopefully it rings a bell with someone who made changes in this area recently...-

-Since updating to the tip from our previous use of the tip from around Jan 9, 2009- (seems to be previous to r733656 as well) we are now seeing facet counts no longer match total document count.  This is through distributed search and I have not verified that it only happens on distributed vs. single shard search so it could be on both.

For example, on a single valued field with one facet value set as a fq filter, combined with a text search on a simple term "science", the following is the facet count:

8,294,284

And the total document count for the same results is:

8,294,274

some debug info (not sure why the filter query is replicated more than once, but that shouldn't be harmful):

{code}
querystring	  	(science)
QParser	  	OldLuceneQParser
filter_queries	  	[sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
sys_content_type:("Journal Article"), sys_content_type:("Journal Article")]
rawquerystring	  	(science)
{code}

  was:
There isn't much detailed evidence for this one yet, but hopefully it rings a bell with someone who made changes in this area recently...

Since updating to the tip from our previous use of the tip from around Jan 9, 2009 we are now seeing facet counts no longer match total document count.  This is through distributed search and I have not verified that it only happens on distributed vs. single shard search so it could be on both.

For example, on a single valued field with one facet value set as a fq filter, combined with a text search on a simple term "science", the following is the facet count:

8,294,284

And the total document count for the same results is:

8,294,274

some debug info (not sure why the filter query is replicated more than once, but that shouldn't be harmful):

{code}
querystring	  	(science)
QParser	  	OldLuceneQParser
filter_queries	  	[sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
sys_content_type:("Journal Article"), sys_content_type:("Journal Article"), 
sys_content_type:("Journal Article"), sys_content_type:("Journal Article")]
rawquerystring	  	(science)
{code}





[jira] Commented: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Jayson Minard (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675416#action_12675416 ] 

Jayson Minard commented on SOLR-1030:
-------------------------------------

The data hasn't changed since the last version of Solr we used (roughly r733656), so either no one noticed this before or the problem didn't previously exist.  Checking the logs for timeouts now.




[jira] Commented: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675413#action_12675413 ] 

Shalin Shekhar Mangar commented on SOLR-1030:
---------------------------------------------

Another point: consistency is not guaranteed between query phases in the current implementation. So if a commit happens on a shard between the query phase and the facet refinement phase, you may see different counts.
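
To make the timing issue concrete, here is a toy sketch in Python. It is not Solr code: ToyShard and its methods are hypothetical names invented for illustration, and the two "phases" are modeled simply as two separate reads of the same shard. It only shows why a commit landing between two reads can make the numbers disagree.

```python
class ToyShard:
    """Hypothetical stand-in for one shard; not Solr code."""

    def __init__(self, docs):
        self.visible = list(docs)  # docs searchable right now
        self.pending = []          # docs added but not yet committed

    def add(self, doc):
        self.pending.append(doc)

    def commit(self):
        # A commit makes pending docs visible to all later requests.
        self.visible.extend(self.pending)
        self.pending = []

    def count(self, content_type):
        return sum(1 for d in self.visible
                   if d["sys_content_type"] == content_type)


shard = ToyShard([
    {"id": "d1", "sys_content_type": "Journal Article"},
    {"id": "d2", "sys_content_type": "Journal Article"},
])
shard.add({"id": "d3", "sys_content_type": "Journal Article"})

# Phase 1: the query phase reads the hit count.
total_hits = shard.count("Journal Article")

# A commit lands on the shard between the two phases.
shard.commit()

# Phase 2: the facet refinement phase reads the facet count.
facet_count = shard.count("Journal Article")

print(total_hits, facet_count)  # the two reads no longer agree
```

If no commit happens on any shard while the request is in flight, this particular source of mismatch does not apply.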




[jira] Commented: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Jayson Minard (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675436#action_12675436 ] 

Jayson Minard commented on SOLR-1030:
-------------------------------------

I think this might indeed be duplicate docs between shards causing the difference.  As we increase the page size from 0 to 20, going up by 1 each time, the count difference slowly grows.  And we find duplicate docs across shards in this result set for the page of results being shown.
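
A simplified model of that behavior, sketched in Python. It assumes (this is an assumption for illustration, not a statement about the actual Solr code) that the aggregator sums per-shard hit counts and facet counts, and then subtracts from the total only the duplicates it can actually see, namely those among the top `rows` docs each shard returns. The function name merged_counts and the sample data are made up.

```python
def merged_counts(shards, rows):
    """Return (num_found, facet_count) under the simplified model.

    shards: list of shards, each a list of (doc_id, score) tuples that
            all match the query and the facet value being filtered on.
    rows:   page size requested from every shard.
    """
    num_found = 0
    facet_count = 0
    candidates = []
    for shard_docs in shards:
        # Per-shard totals are simply summed; duplicates are invisible here.
        num_found += len(shard_docs)
        facet_count += len(shard_docs)
        # Each shard returns only its top `rows` docs by score.
        top = sorted(shard_docs, key=lambda d: -d[1])[:rows]
        candidates.extend(top)

    # Duplicates are detected only among the docs actually returned.
    seen = set()
    for doc_id, _score in sorted(candidates, key=lambda d: -d[1]):
        if doc_id in seen:
            num_found -= 1  # total is corrected, the facet count is not
        else:
            seen.add(doc_id)
    return num_found, facet_count


# d1 and d2 exist on both shards (hypothetical data).
shards = [
    [("d1", 0.9), ("d2", 0.8), ("d3", 0.7), ("d4", 0.6)],
    [("d1", 0.9), ("d2", 0.8), ("d5", 0.5)],
]

for rows in range(0, 4):
    num_found, facet_count = merged_counts(shards, rows)
    print(rows, num_found, facet_count, facet_count - num_found)
```

Under these assumptions the gap between facet count and total is zero at a page size of 0 and grows as the page size increases, until every duplicate has been seen. That matches the pattern described above, but it is only a model of it.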




[jira] Commented: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

Posted by "Jayson Minard (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675417#action_12675417 ] 

Jayson Minard commented on SOLR-1030:
-------------------------------------

Logs on the aggregator instance of Solr show no exceptions or errors, and neither do the logs on the query slaves.

