You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/09/02 13:03:11 UTC

[GitHub] [lucene] mayya-sharipova opened a new pull request #277: LUCENE-10040 Correct TestHnswGraph.testSearchWithAcceptOrds

mayya-sharipova opened a new pull request #277:
URL: https://github.com/apache/lucene/pull/277


   If we set numSeed = 10, this test fails sometimes  because it may mark
   expected results docs (from 0 to 9) as deleted which don't end up
   being retrieved, resulting in a lower than expected recall.
   
   - set numSeed to 10 to ensure 10 results are returned
   - add startIndex parameter in  `createRandomAcceptOrds` that allows
     documents before startIndex to be NOT deleted
   - in testSearchWithAcceptOrds use startIndex 
      equal to 10 for createRandomAcceptOrds 
   
   Relates to #239


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mayya-sharipova merged pull request #277: LUCENE-10040 Correct TestHnswGraph.testSearchWithAcceptOrds

Posted by GitBox <gi...@apache.org>.
mayya-sharipova merged pull request #277:
URL: https://github.com/apache/lucene/pull/277


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mayya-sharipova commented on a change in pull request #277: LUCENE-10040 Correct TestHnswGraph.testSearchWithAcceptOrds

Posted by GitBox <gi...@apache.org>.
mayya-sharipova commented on a change in pull request #277:
URL: https://github.com/apache/lucene/pull/277#discussion_r702809122



##########
File path: lucene/core/src/test/org/apache/lucene/util/hnsw/TestHnswGraph.java
##########
@@ -167,24 +167,27 @@ public void testSearchWithAcceptOrds() throws IOException {
             vectors, VectorSimilarityFunction.DOT_PRODUCT, 16, 100, random().nextInt());
     HnswGraph hnsw = builder.build(vectors);
 
-    Bits acceptOrds = createRandomAcceptOrds(vectors.size);
+    // the first 10 docs must not be deleted to ensure the expected recall
+    Bits acceptOrds = createRandomAcceptOrds(10, vectors.size);
     NeighborQueue nn =
         HnswGraph.search(
             new float[] {1, 0},
             10,
-            5,
+            10,

Review comment:
       Addressed in 35ee89af212541a10138e3b0aa3b5c9de1c92aae




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mayya-sharipova commented on a change in pull request #277: LUCENE-10040 Correct TestHnswGraph.testSearchWithAcceptOrds

Posted by GitBox <gi...@apache.org>.
mayya-sharipova commented on a change in pull request #277:
URL: https://github.com/apache/lucene/pull/277#discussion_r702808133



##########
File path: lucene/core/src/test/org/apache/lucene/util/hnsw/TestHnswGraph.java
##########
@@ -493,9 +496,14 @@ public RandomVectorValues copy() {
   }
 
   /** Generate a random bitset where each entry has a 2/3 probability of being set. */
-  private static Bits createRandomAcceptOrds(int length) {
+  private static Bits createRandomAcceptOrds(int startIndex, int length) {

Review comment:
       Addressed in 35ee89af212541a10138e3b0aa3b5c9de1c92aae




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mayya-sharipova commented on a change in pull request #277: LUCENE-10040 Correct TestHnswGraph.testSearchWithAcceptOrds

Posted by GitBox <gi...@apache.org>.
mayya-sharipova commented on a change in pull request #277:
URL: https://github.com/apache/lucene/pull/277#discussion_r702808708



##########
File path: lucene/core/src/test/org/apache/lucene/util/hnsw/TestHnswGraph.java
##########
@@ -167,24 +167,27 @@ public void testSearchWithAcceptOrds() throws IOException {
             vectors, VectorSimilarityFunction.DOT_PRODUCT, 16, 100, random().nextInt());
     HnswGraph hnsw = builder.build(vectors);
 
-    Bits acceptOrds = createRandomAcceptOrds(vectors.size);
+    // the first 10 docs must not be deleted to ensure the expected recall
+    Bits acceptOrds = createRandomAcceptOrds(10, vectors.size);
     NeighborQueue nn =
         HnswGraph.search(
             new float[] {1, 0},
             10,
-            5,
+            10,
             vectors.randomAccess(),
             VectorSimilarityFunction.DOT_PRODUCT,
             hnsw,
             acceptOrds,
             random());
     int sum = 0;
-    for (int node : nn.nodes()) {
+    int[] nodes = nn.nodes();
+    assert (nodes.length == 10);

Review comment:
       Addressed in 35ee89af212541a10138e3b0aa3b5c9de1c92aae




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jtibshirani commented on a change in pull request #277: LUCENE-10040 Correct TestHnswGraph.testSearchWithAcceptOrds

Posted by GitBox <gi...@apache.org>.
jtibshirani commented on a change in pull request #277:
URL: https://github.com/apache/lucene/pull/277#discussion_r702202529



##########
File path: lucene/core/src/test/org/apache/lucene/util/hnsw/TestHnswGraph.java
##########
@@ -493,9 +496,14 @@ public RandomVectorValues copy() {
   }
 
   /** Generate a random bitset where each entry has a 2/3 probability of being set. */
-  private static Bits createRandomAcceptOrds(int length) {
+  private static Bits createRandomAcceptOrds(int startIndex, int length) {

Review comment:
       I think the method comment could use an update, now that we have `startIndex`.

##########
File path: lucene/core/src/test/org/apache/lucene/util/hnsw/TestHnswGraph.java
##########
@@ -167,24 +167,27 @@ public void testSearchWithAcceptOrds() throws IOException {
             vectors, VectorSimilarityFunction.DOT_PRODUCT, 16, 100, random().nextInt());
     HnswGraph hnsw = builder.build(vectors);
 
-    Bits acceptOrds = createRandomAcceptOrds(vectors.size);
+    // the first 10 docs must not be deleted to ensure the expected recall
+    Bits acceptOrds = createRandomAcceptOrds(10, vectors.size);
     NeighborQueue nn =
         HnswGraph.search(
             new float[] {1, 0},
             10,
-            5,
+            10,
             vectors.randomAccess(),
             VectorSimilarityFunction.DOT_PRODUCT,
             hnsw,
             acceptOrds,
             random());
     int sum = 0;
-    for (int node : nn.nodes()) {
+    int[] nodes = nn.nodes();
+    assert (nodes.length == 10);

Review comment:
       Maybe this could be a test assertion instead of an `assert`.

##########
File path: lucene/core/src/test/org/apache/lucene/util/hnsw/TestHnswGraph.java
##########
@@ -167,24 +167,27 @@ public void testSearchWithAcceptOrds() throws IOException {
             vectors, VectorSimilarityFunction.DOT_PRODUCT, 16, 100, random().nextInt());
     HnswGraph hnsw = builder.build(vectors);
 
-    Bits acceptOrds = createRandomAcceptOrds(vectors.size);
+    // the first 10 docs must not be deleted to ensure the expected recall
+    Bits acceptOrds = createRandomAcceptOrds(10, vectors.size);
     NeighborQueue nn =
         HnswGraph.search(
             new float[] {1, 0},
             10,
-            5,
+            10,

Review comment:
       It'd be nice to fix the other places in this test that use 5 instead of 10. From the comments like "the lowest docIds are closest to zero; sum(0,9) = 45", I'm guessing it was an accident too.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org