Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/10/06 21:27:12 UTC

[GitHub] [lucene] msokolov opened a new pull request #361: LUCENE-10147: ensure that KnnVectorQuery scores are positive

msokolov opened a new pull request #361:
URL: https://github.com/apache/lucene/pull/361


   Adds VectorSimilarity.convertToScore and uses it to "normalize" scores in the vector search() function
   before they are returned in TopDocs.scoreDocs[].score
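
   As a rough illustration of what such a conversion could look like, here is a minimal sketch. The class and method names below are hypothetical and are not the Lucene API; the formulas are assumptions chosen so that a dot product of unit vectors (range [-1, 1]) and a squared Euclidean distance (range [0, inf)) both map to non-negative scores where larger means more similar.

```java
// Hypothetical helper for illustration only; not VectorSimilarity.convertToScore itself.
final class ScoreSketch {

  /** Dot product of two unit vectors lies in [-1, 1]; shift and scale it into [0, 1]. */
  static float dotProductToScore(float dot) {
    return (1 + dot) / 2;
  }

  /** Squared Euclidean distance is >= 0; map it into (0, 1] so that closer vectors score higher. */
  static float squaredDistanceToScore(float squaredDistance) {
    return 1 / (1 + squaredDistance);
  }

  public static void main(String[] args) {
    // Opposite unit vectors: the raw dot product is -1, the converted score is not negative.
    System.out.println(dotProductToScore(-1f));
    // Identical unit vectors: raw dot product 1, squared distance 0.
    System.out.println(dotProductToScore(1f));
    System.out.println(squaredDistanceToScore(0f));
  }
}
```

   Both mappings are monotonic, so under these assumptions they change the score values without changing the ranking of the hits.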


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] msokolov commented on a change in pull request #361: LUCENE-10147: ensure that KnnVectorQuery scores are positive

Posted by GitBox <gi...@apache.org>.
msokolov commented on a change in pull request #361:
URL: https://github.com/apache/lucene/pull/361#discussion_r724204344



##########
File path: lucene/core/src/test/org/apache/lucene/search/TestKnnVectorQuery.java
##########
@@ -204,6 +207,64 @@ public void testScore() throws IOException {
     }
   }
 
+  public void testScoreDotProduct() throws IOException {

Review comment:
       good idea, I'll add a test

##########
File path: lucene/core/src/java/org/apache/lucene/util/VectorUtil.java
##########
@@ -115,19 +115,23 @@ public static float squareDistance(float[] v1, float[] v2) {
   /**
    * Modifies the argument to be unit length, dividing by its l2-norm. IllegalArgumentException is
    * thrown for zero vectors.
+   *
+   * @return the input array after normalization
    */
-  public static void l2normalize(float[] v) {
+  public static float[] l2normalize(float[] v) {
     l2normalize(v, true);
+    return v;

Review comment:
       Hm I thought I had done that ! :) It does say "modifies the argument", and "return the input array". I'm open to suggestions for improvement, of course, but not sure what more to say
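
       For readers following the thread, the behaviour under discussion can be sketched as below. This is a standalone approximation written for illustration, not the VectorUtil source: the class name is hypothetical, and the single-argument method here does the work itself rather than delegating to a two-argument overload as the diff does.

```java
// Standalone sketch of "modify in place, then return the same array"; not the Lucene source.
final class NormalizeSketch {

  /** Scales v to unit length in place and returns that same array instance. */
  static float[] l2normalize(float[] v) {
    double squareSum = 0;
    for (float x : v) {
      squareSum += x * x;
    }
    if (squareSum == 0) {
      throw new IllegalArgumentException("Cannot normalize a zero-length vector");
    }
    float length = (float) Math.sqrt(squareSum);
    for (int i = 0; i < v.length; i++) {
      v[i] /= length;
    }
    return v;
  }

  public static void main(String[] args) {
    float[] input = {3f, 4f};
    float[] result = l2normalize(input);
    // The return value is the argument itself, so the caller's array has been changed.
    System.out.println(result == input);
    System.out.println(input[0] + ", " + input[1]);
  }
}
```

       A caller that needs to keep the original values would have to pass a copy, for example l2normalize(input.clone()).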






[GitHub] [lucene] msokolov merged pull request #361: LUCENE-10147: ensure that KnnVectorQuery scores are positive

Posted by GitBox <gi...@apache.org>.
msokolov merged pull request #361:
URL: https://github.com/apache/lucene/pull/361


   




[GitHub] [lucene] jpountz commented on a change in pull request #361: LUCENE-10147: ensure that KnnVectorQuery scores are positive

Posted by GitBox <gi...@apache.org>.
jpountz commented on a change in pull request #361:
URL: https://github.com/apache/lucene/pull/361#discussion_r723852770



##########
File path: lucene/core/src/java/org/apache/lucene/util/VectorUtil.java
##########
@@ -115,19 +115,23 @@ public static float squareDistance(float[] v1, float[] v2) {
   /**
    * Modifies the argument to be unit length, dividing by its l2-norm. IllegalArgumentException is
    * thrown for zero vectors.
+   *
+   * @return the input array after normalization
    */
-  public static void l2normalize(float[] v) {
+  public static float[] l2normalize(float[] v) {
     l2normalize(v, true);
+    return v;

Review comment:
       When the method was void, it was obvious that it modified the input array in place, but now that it returns a float[] we might want to clarify this in the docs to avoid surprises?

##########
File path: lucene/core/src/java/org/apache/lucene/util/VectorUtil.java
##########
@@ -115,19 +115,23 @@ public static float squareDistance(float[] v1, float[] v2) {
   /**
    * Modifies the argument to be unit length, dividing by its l2-norm. IllegalArgumentException is
    * thrown for zero vectors.
+   *
+   * @return the input array after normalization
    */
-  public static void l2normalize(float[] v) {
+  public static float[] l2normalize(float[] v) {
     l2normalize(v, true);
+    return v;

Review comment:
       Oh :facepalm: ! Sorry for the noise!






[GitHub] [lucene] jtibshirani commented on a change in pull request #361: LUCENE-10147: ensure that KnnVectorQuery scores are positive

Posted by GitBox <gi...@apache.org>.
jtibshirani commented on a change in pull request #361:
URL: https://github.com/apache/lucene/pull/361#discussion_r723760893



##########
File path: lucene/core/src/test/org/apache/lucene/search/TestKnnVectorQuery.java
##########
@@ -204,6 +207,64 @@ public void testScore() throws IOException {
     }
   }
 
+  public void testScoreDotProduct() throws IOException {

Review comment:
       It'd be nice to have a simple test that reproduces the negative score issue (using normalized vectors with some negative components).
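
       The situation described here can be reproduced outside Lucene with a few lines of arithmetic. The sketch below is only an illustration with made-up vectors and a hypothetical class name; it is not the test that was added to the PR.

```java
// Illustration of how unit vectors with negative components give a negative raw dot product.
final class NegativeDotProductSketch {

  /** Plain dot product of two vectors of equal length. */
  static float dotProduct(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    // Both vectors have (approximately) unit length; the second has a negative component.
    float[] query = {1f, 0f};
    float[] doc = {-0.6f, 0.8f};
    float raw = dotProduct(query, doc);
    // The raw similarity is below zero, which is what would surface as a negative score
    // if it were returned unchanged.
    System.out.println(raw < 0);
  }
}
```

       A test along these lines would index such vectors and check that every score returned by the query is non-negative.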






[GitHub] [lucene] jpountz commented on a change in pull request #361: LUCENE-10147: ensure that KnnVectorQuery scores are positive

Posted by GitBox <gi...@apache.org>.
jpountz commented on a change in pull request #361:
URL: https://github.com/apache/lucene/pull/361#discussion_r724238874



##########
File path: lucene/core/src/java/org/apache/lucene/util/VectorUtil.java
##########
@@ -115,19 +115,23 @@ public static float squareDistance(float[] v1, float[] v2) {
   /**
    * Modifies the argument to be unit length, dividing by its l2-norm. IllegalArgumentException is
    * thrown for zero vectors.
+   *
+   * @return the input array after normalization
    */
-  public static void l2normalize(float[] v) {
+  public static float[] l2normalize(float[] v) {
     l2normalize(v, true);
+    return v;

Review comment:
       Oh :facepalm: ! Sorry for the noise!





