Posted to solr-user@lucene.apache.org by Massimo Schiavon <ms...@volunia.com> on 2011/09/20 16:14:08 UTC

term frequencies in sharded environment

It seems that when I submit a query in a sharded environment, the idf
component of the scoring formula uses local term frequencies (local to
each shard's own index). As a result, the calculation is correct only if
terms are distributed evenly across the shards.

Is there any way to avoid that? Perhaps by using cumulative frequencies
in the calculation? Anything else?
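
To illustrate the effect, here is a minimal sketch in Python. It is not
Solr code: the shard names and statistics are made up, and the idf
function assumes the classic Lucene formula 1 + ln(numDocs / (docFreq + 1)).
It compares the idf each shard would compute on its own with the idf
computed from frequencies summed over all shards.

```python
import math


def idf(doc_freq, num_docs):
    """Classic Lucene-style idf (assumed): 1 + ln(num_docs / (doc_freq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical per-shard statistics for one term: (doc_freq, num_docs).
# The term is common in shard1 and rare in shard2, i.e. unbalanced.
shards = {
    "shard1": (900, 1000),
    "shard2": (9, 1000),
}


def local_idfs(shard_stats):
    """idf as each shard computes it from its own index only."""
    return {name: idf(df, n) for name, (df, n) in shard_stats.items()}


def global_idf(shard_stats):
    """idf computed from document frequencies summed over all shards."""
    total_df = sum(df for df, _ in shard_stats.values())
    total_docs = sum(n for _, n in shard_stats.values())
    return idf(total_df, total_docs)


if __name__ == "__main__":
    for name, value in local_idfs(shards).items():
        print(f"{name}: local idf = {value:.3f}")
    print(f"all shards: global idf = {global_idf(shards):.3f}")
```

With these numbers the same term gets a very different weight depending
on which shard scores the document, so scores from different shards are
not directly comparable when the results are merged.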

Regards

Massimo

Re: term frequencies in sharded environment

Posted by da...@ontrenet.com.
Please see [1], the Solr JIRA issue on distributed IDF.

[1] https://issues.apache.org/jira/browse/SOLR-1632

On Tue, 20 Sep 2011 16:14:08 +0200, Massimo Schiavon
<ms...@volunia.com> wrote:
> It seems that when I submit a query in a sharded environment, the idf
> component of the scoring formula uses local term frequencies (local to
> each shard's own index). As a result, the calculation is correct only if
> terms are distributed evenly across the shards.
>
> Is there any way to avoid that? Perhaps by using cumulative frequencies
> in the calculation? Anything else?
> 
> Regards
> 
> Massimo