Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/05/02 17:40:06 UTC

[GitHub] [lucene] msokolov commented on a diff in pull request #862: LUCENE-9848 Sort HNSW graph neighbors for construction

msokolov commented on code in PR #862:
URL: https://github.com/apache/lucene/pull/862#discussion_r863041170


##########
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##########
@@ -21,32 +21,64 @@
 
 /**
  * NeighborArray encodes the neighbors of a node and their mutual scores in the HNSW graph as a pair
- * of growable arrays.
+ * of growable arrays. Nodes are arranged in the sorted order of their scores in descending order
+ * (if scoresDescOrder is true), or in the ascending order of their scores (if scoresDescOrder is
+ * false)
  *
  * @lucene.internal
  */
 public class NeighborArray {
-
+  private final boolean scoresDescOrder;
   private int size;
 
   float[] score;
   int[] node;
 
-  public NeighborArray(int maxSize) {
+  public NeighborArray(int maxSize, boolean descOrder) {
     node = new int[maxSize];
     score = new float[maxSize];
+    this.scoresDescOrder = descOrder;
   }
 
+  /**
+   * Add a new node to the NeighborArray. The new node must be worse than all previously stored
+   * nodes.
+   */
   public void add(int newNode, float newScore) {
     if (size == node.length - 1) {
       node = ArrayUtil.grow(node, (size + 1) * 3 / 2);
       score = ArrayUtil.growExact(score, node.length);
     }
+    if (size > 0) {
+      float previousScore = score[size - 1];
+      assert ((scoresDescOrder && (previousScore >= newScore))
+              || (scoresDescOrder == false && (previousScore <= newScore)))
+          : "Nodes are added in the incorrect order!";
+    }
     node[size] = newNode;
     score[size] = newScore;
     ++size;
   }
 
+  /** Add a new node to the NeighborArray into a correct sort position according to its score. */
+  public void addAndSort(int newNode, float newScore) {
+    if (size == node.length - 1) {
+      node = ArrayUtil.grow(node, (size + 1) * 3 / 2);
+      score = ArrayUtil.growExact(score, node.length);
+    }
+    int insertionPoint =
+        scoresDescOrder
+            ? descSortFindRightMostInsertionPoint(newScore)
+            : ascSortFindRightMostInsertionPoint(newScore);
+    for (int i = size; i > insertionPoint; i--) {

Review Comment:
   Can we use `System.arraycopy`? Does it work in reverse order when it needs to?
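
On the overlap question: the `System.arraycopy` Javadoc specifies that when source and destination are the same array, the copy behaves as if the source range were first copied to a temporary array, so a right shift within one array should be safe. A minimal sketch of the idea follows; the class `OverlappingCopyDemo` and the helper `insertAt` are hypothetical names for illustration, not code from this PR.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the PR.
class OverlappingCopyDemo {

  /** Shifts arr[from, size) one slot to the right, then writes value at index from. */
  static void insertAt(int[] arr, int size, int from, int value) {
    // Same array on both sides with overlapping ranges: arraycopy is specified to
    // behave as if the source range were copied through a temporary array first.
    System.arraycopy(arr, from, arr, from + 1, size - from);
    arr[from] = value;
  }

  public static void main(String[] args) {
    int[] node = {10, 20, 30, 40, 0}; // four live entries plus one free slot
    insertAt(node, 4, 1, 15);
    System.out.println(Arrays.toString(node)); // [10, 15, 20, 30, 40]
  }
}
```

In `addAndSort` the same call would presumably be made once for `node` and once for `score`, replacing the manual reverse loop.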



##########
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##########
@@ -72,8 +104,38 @@ public void removeLast() {
     size--;
   }
 
+  public void removeIndex(int idx) {
+    for (int i = idx; i < (size - 1); i++) {
+      node[i] = node[i + 1];

Review Comment:
   `System.arraycopy` should definitely be safe here
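
As a rough sketch of what that could look like, assuming the same `node`, `score`, and `size` fields as in the diff (the wrapper class `RemoveIndexSketch` and its constructor are hypothetical scaffolding so the snippet stands alone):

```java
// Hypothetical stand-in for NeighborArray, reduced to the fields removeIndex touches.
class RemoveIndexSketch {
  int size;
  int[] node;
  float[] score;

  RemoveIndexSketch(int[] node, float[] score, int size) {
    this.node = node;
    this.score = score;
    this.size = size;
  }

  /** Removes the entry at idx by shifting the tail of both arrays one slot left. */
  void removeIndex(int idx) {
    int tail = size - idx - 1; // number of entries after idx; 0 when idx is the last entry
    System.arraycopy(node, idx + 1, node, idx, tail);
    System.arraycopy(score, idx + 1, score, idx, tail);
    size--;
  }
}
```

A zero-length copy is a no-op, so removing the last entry needs no special case.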



##########
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##########
@@ -72,8 +104,38 @@ public void removeLast() {
     size--;
   }
 
+  public void removeIndex(int idx) {
+    for (int i = idx; i < (size - 1); i++) {
+      node[i] = node[i + 1];
+      score[i] = score[i + 1];
+    }
+    size--;
+  }
+
   @Override
   public String toString() {
     return "NeighborArray[" + size + "]";
   }
+
+  private int ascSortFindRightMostInsertionPoint(float newScore) {
+    int start = 0;

Review Comment:
   I think we can use `Arrays.binarySearch`, although sadly not for the inverse sort case
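
To illustrate the asymmetry: `Arrays.binarySearch(float[], int, int, float)` assumes ascending order and has no comparator overload for primitive arrays, so the descending case still needs a hand-written search. The sketch below assumes scores contain no NaN values; the class and method names are hypothetical and shorter than the ones in the PR.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of the PR.
class InsertionPointSketch {

  /** Rightmost insertion point for newScore in scores[0, size), sorted ascending. */
  static int ascInsertionPoint(float[] scores, int size, float newScore) {
    int idx = Arrays.binarySearch(scores, 0, size, newScore);
    if (idx < 0) {
      return -(idx + 1); // not found: decode the insertion point
    }
    // Found, but with duplicates any matching index may come back; step past equal values.
    while (idx < size && scores[idx] == newScore) {
      idx++;
    }
    return idx;
  }

  /** Rightmost insertion point for newScore in scores[0, size), sorted descending. */
  static int descInsertionPoint(float[] scores, int size, float newScore) {
    int lo = 0;
    int hi = size;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (scores[mid] >= newScore) {
        lo = mid + 1; // entries at least as large stay to the left of the new one
      } else {
        hi = mid;
      }
    }
    return lo;
  }
}
```

The linear step past duplicates in the ascending case is only a concern if many neighbors share the same score; otherwise it costs a comparison or two.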



##########
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##########
@@ -21,32 +21,64 @@
 
 /**
  * NeighborArray encodes the neighbors of a node and their mutual scores in the HNSW graph as a pair
- * of growable arrays.
+ * of growable arrays. Nodes are arranged in the sorted order of their scores in descending order
+ * (if scoresDescOrder is true), or in the ascending order of their scores (if scoresDescOrder is
+ * false)
  *
  * @lucene.internal
  */
 public class NeighborArray {
-
+  private final boolean scoresDescOrder;
   private int size;
 
   float[] score;
   int[] node;
 
-  public NeighborArray(int maxSize) {
+  public NeighborArray(int maxSize, boolean descOrder) {
     node = new int[maxSize];
     score = new float[maxSize];
+    this.scoresDescOrder = descOrder;
   }
 
+  /**
+   * Add a new node to the NeighborArray. The new node must be worse than all previously stored
+   * nodes.
+   */
   public void add(int newNode, float newScore) {
     if (size == node.length - 1) {
       node = ArrayUtil.grow(node, (size + 1) * 3 / 2);
       score = ArrayUtil.growExact(score, node.length);
     }
+    if (size > 0) {
+      float previousScore = score[size - 1];
+      assert ((scoresDescOrder && (previousScore >= newScore))
+              || (scoresDescOrder == false && (previousScore <= newScore)))
+          : "Nodes are added in the incorrect order!";
+    }
     node[size] = newNode;
     score[size] = newScore;
     ++size;
   }
 
+  /** Add a new node to the NeighborArray into a correct sort position according to its score. */
+  public void addAndSort(int newNode, float newScore) {

Review Comment:
   I might rename this `insertSorted` since it is not really re-sorting?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

