Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/11/17 01:31:41 UTC

[GitHub] [lucene-solr] jtibshirani commented on a change in pull request #2047: LUCENE-9592: Use doubles in VectorUtil to maintain precision.

jtibshirani commented on a change in pull request #2047:
URL: https://github.com/apache/lucene-solr/pull/2047#discussion_r524826537



##########
File path: lucene/core/src/java/org/apache/lucene/util/VectorUtil.java
##########
@@ -25,47 +25,22 @@
   private VectorUtil() {
   }
 
-  public static float dotProduct(float[] a, float[] b) {
-    float res = 0f;
-    /*
-     * If length of vector is larger than 8, we use unrolled dot product to accelerate the
-     * calculation.
-     */
-    int i;
-    for (i = 0; i < a.length % 8; i++) {
-      res += b[i] * a[i];
-    }
-    if (a.length < 8) {
-      return res;
-    }
-    float s0 = 0f;
-    float s1 = 0f;
-    float s2 = 0f;
-    float s3 = 0f;
-    float s4 = 0f;
-    float s5 = 0f;
-    float s6 = 0f;
-    float s7 = 0f;
-    for (; i + 7 < a.length; i += 8) {
-      s0 += b[i] * a[i];
-      s1 += b[i + 1] * a[i + 1];
-      s2 += b[i + 2] * a[i + 2];
-      s3 += b[i + 3] * a[i + 3];
-      s4 += b[i + 4] * a[i + 4];
-      s5 += b[i + 5] * a[i + 5];
-      s6 += b[i + 6] * a[i + 6];
-      s7 += b[i + 7] * a[i + 7];
+  public static double dotProduct(float[] a, float[] b) {

Review comment:
       Simply changing the test to use a larger epsilon sounds good to me. After thinking about this more, I'm not sure we want to optimize for the precision of these individual calculations. Many high-dimensional vectors are already approximations of an original object, such as a piece of text. I've also heard of practitioners choosing less precise representations (like bfloat16) for each vector element to save space while still achieving acceptable results.
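
To make the trade-off in this comment concrete, here is a minimal sketch that compares a float-accumulated dot product with a double-accumulated one and checks the two against a relative tolerance. It is not the code from this pull request or from Lucene's tests: the class `DotProductPrecision`, the helper `closeEnough`, the vector dimension, and the epsilon value are all assumptions chosen for illustration.

```java
import java.util.Random;

// Illustrative only: class and method names are hypothetical, not Lucene's API.
public class DotProductPrecision {

  // Accumulates in float, like the pre-change method in the diff
  // (plain loop, without the 8-way unrolling).
  static float dotProductFloat(float[] a, float[] b) {
    float res = 0f;
    for (int i = 0; i < a.length; i++) {
      res += b[i] * a[i];
    }
    return res;
  }

  // Widens each product to double before accumulating.
  static double dotProductDouble(float[] a, float[] b) {
    double res = 0d;
    for (int i = 0; i < a.length; i++) {
      res += (double) b[i] * a[i];
    }
    return res;
  }

  // Hypothetical tolerance check: the allowed error scales with the
  // magnitude of the expected value, with a floor of relEpsilon itself.
  static boolean closeEnough(double expected, double actual, double relEpsilon) {
    return Math.abs(expected - actual) <= relEpsilon * Math.max(1d, Math.abs(expected));
  }

  public static void main(String[] args) {
    Random random = new Random(42); // fixed seed so runs are repeatable
    int dim = 1024;                 // assumed dimension, for illustration
    float[] a = new float[dim];
    float[] b = new float[dim];
    for (int i = 0; i < dim; i++) {
      a[i] = random.nextFloat();
      b[i] = random.nextFloat();
    }

    float asFloat = dotProductFloat(a, b);
    double asDouble = dotProductDouble(a, b);

    System.out.println("float accumulation:  " + asFloat);
    System.out.println("double accumulation: " + asDouble);
    System.out.println("within 1e-4 relative tolerance: "
        + closeEnough(asDouble, asFloat, 1e-4));
  }
}
```

A relative tolerance is used here because the rounding error of a float sum grows with both the number of terms and their magnitude, so a fixed absolute epsilon that suits short vectors may be too strict for long ones.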
