Posted to dev@lucene.apache.org by sikburn <si...@gmail.com> on 2013/11/08 22:56:18 UTC
ConjunctionScorer floating point precision for score()

Hello,
I have been investigating an issue with document scoring and found that
ConjunctionScorer implements score() in a way that is sensitive to
floating point rounding.  In some of my test cases, documents that have not
been merged/optimized (I'm not sure of the correct terminology, they have a
docNum of 0) have their scorers added in a different order than optimized
documents.  Because the sum of the sub-scores is accumulated in a float, and
float addition is not associative, the order of addition can change the
rounded result.  In turn, the score returned from the ConjunctionScorer
differs for some merged/unmerged documents that should have identical
scores.

Example:

float sum1 = 0.0061859353f + 0.0061859353f + 0.0030929677f + 0.0030929677f +
0.0030929677f + 0.5010608f + 0.0061859353f;

float sum2 =  0.0061859353f + 0.0061859353f + 0.0061859353f + 0.0030929677f
+ 0.0030929677f + 0.0030929677f + 0.5010608f;

sum1 == 0.5288975; // Incorrect
sum2 == 0.52889746; // Correct
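
To make the example above easy to reproduce, here is a minimal standalone
sketch.  It is not Lucene code: the class name FloatSumOrderDemo and the
helper sumFloat are made up for illustration, and the two arrays are just
the values from the example in the two orders I observed.  The loop is meant
to mirror a float accumulator like the one in score().

```java
class FloatSumOrderDemo {
    // Sub-scores from the example, in the first observed order.
    static final float[] ORDER_1 = {
        0.0061859353f, 0.0061859353f, 0.0030929677f, 0.0030929677f,
        0.0030929677f, 0.5010608f, 0.0061859353f
    };

    // The same seven values, in the second observed order.
    static final float[] ORDER_2 = {
        0.0061859353f, 0.0061859353f, 0.0061859353f, 0.0030929677f,
        0.0030929677f, 0.0030929677f, 0.5010608f
    };

    // Accumulate in a float, rounding after every addition.
    static float sumFloat(float[] scores) {
        float sum = 0.0f;
        for (float s : scores) {
            sum += s;
        }
        return sum;
    }

    public static void main(String[] args) {
        float sum1 = sumFloat(ORDER_1);
        float sum2 = sumFloat(ORDER_2);
        System.out.println("order 1: " + sum1);
        System.out.println("order 2: " + sum2);
        System.out.println("equal:   " + (sum1 == sum2));
        // Distance between the two results in units in the last place.
        int ulps = Math.abs(Float.floatToIntBits(sum1) - Float.floatToIntBits(sum2));
        System.out.println("ulps apart: " + ulps);
    }
}
```

The two totals contain the same seven terms, so any difference printed here
comes only from the order in which the intermediate sums are rounded.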

I am currently running Solr/Lucene 3.6.2 from source and have two potential
solutions, but I am not an expert on floating point precision, rounding, or
Lucene performance implications.

I also noticed that there is a comment in the 4.5.1 version of Lucene to the
effect of:
// TODO: sum into a double and cast to float if we ever send required
// clauses to BS1

My questions are as follows:
Is this currently expected behavior that should not be patched?
If not, would either of these potential solutions be maintained by the
Lucene development community?

Current:
	public float score() throws IOException {
		float sum = 0.0f;
		for (int i = 0; i < scorers.length; i++) {
			sum += scorers[i].score();
		}
		return sum;
	}

Option 1:
	public float score() throws IOException {
		double sum = 0.0d;
		for (int i = 0; i < scorers.length; i++) {
			sum += scorers[i].score();
		}
		return (float)sum;
	}

Option 2 (needs import java.math.BigDecimal):
	public float score() throws IOException {
		BigDecimal sum = BigDecimal.ZERO;
		for (int i = 0; i < scorers.length; i++) {
			sum = sum.add(new BigDecimal(scorers[i].score()));
		}
		return sum.floatValue();
	}
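
For comparison outside of Lucene, the following is a rough sketch of the two
options applied to the example values.  The class ScoreSumOptions and the
helpers sumViaDouble and sumViaBigDecimal are hypothetical names, not Lucene
APIs; they only imitate the accumulation loops in Option 1 and Option 2.

```java
import java.math.BigDecimal;

class ScoreSumOptions {
    static final float[] ORDER_1 = {
        0.0061859353f, 0.0061859353f, 0.0030929677f, 0.0030929677f,
        0.0030929677f, 0.5010608f, 0.0061859353f
    };
    static final float[] ORDER_2 = {
        0.0061859353f, 0.0061859353f, 0.0061859353f, 0.0030929677f,
        0.0030929677f, 0.0030929677f, 0.5010608f
    };

    // Option 1: accumulate in a double, round to float once at the end.
    static float sumViaDouble(float[] scores) {
        double sum = 0.0d;
        for (float s : scores) {
            sum += s; // each float widens to double exactly
        }
        return (float) sum;
    }

    // Option 2: accumulate exactly in a BigDecimal, convert once at the end.
    static float sumViaBigDecimal(float[] scores) {
        BigDecimal sum = BigDecimal.ZERO;
        for (float s : scores) {
            sum = sum.add(new BigDecimal(s)); // exact binary value of s
        }
        return sum.floatValue();
    }

    public static void main(String[] args) {
        System.out.println("double,     order 1: " + sumViaDouble(ORDER_1));
        System.out.println("double,     order 2: " + sumViaDouble(ORDER_2));
        System.out.println("BigDecimal, order 1: " + sumViaBigDecimal(ORDER_1));
        System.out.println("BigDecimal, order 2: " + sumViaBigDecimal(ORDER_2));
    }
}
```

For these seven values the double accumulator has enough spare mantissa bits
that no intermediate addition rounds, so both orders give the same float.
That is not guaranteed in general: with more clauses or a wider range of
magnitudes, a double sum reduces order dependence but can still round, while
the BigDecimal version stays exact at the cost of an allocation per addition.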



--
View this message in context: http://lucene.472066.n3.nabble.com/ConjunctionScorer-floating-point-precision-for-score-tp4100051.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
