You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/09/18 20:16:52 UTC

[GitHub] [lucene] msokolov opened a new pull request, #11784: NeighborArray is now fixed size

msokolov opened a new pull request, #11784:
URL: https://github.com/apache/lucene/pull/11784

   Remove the ability for `NeighborArray` to grow. Always allocates it one larger than requested, to reserve space for a temporary neighbor.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] msokolov commented on pull request #11784: NeighborArray is now fixed size

Posted by GitBox <gi...@apache.org>.
msokolov commented on PR #11784:
URL: https://github.com/apache/lucene/pull/11784#issuecomment-1250379348

   Fixes https://github.com/apache/lucene/issues/11783


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


Re: [PR] NeighborArray is now fixed size [lucene]

Posted by "msokolov (via GitHub)" <gi...@apache.org>.
msokolov closed pull request #11784: NeighborArray is now fixed size
URL: https://github.com/apache/lucene/pull/11784


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] danmuzi commented on a diff in pull request #11784: NeighborArray is now fixed size

Posted by GitBox <gi...@apache.org>.
danmuzi commented on code in PR #11784:
URL: https://github.com/apache/lucene/pull/11784#discussion_r978324992


##########
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##########
@@ -46,27 +45,23 @@ public NeighborArray(int maxSize, boolean descOrder) {
    * nodes.
    */
   public void add(int newNode, float newScore) {
-    if (size == node.length) {
-      node = ArrayUtil.grow(node);
-      score = ArrayUtil.growExact(score, node.length);
-    }
-    if (size > 0) {
-      float previousScore = score[size - 1];
-      assert ((scoresDescOrder && (previousScore >= newScore))
-              || (scoresDescOrder == false && (previousScore <= newScore)))
-          : "Nodes are added in the incorrect order!";
-    }
+    assert isSorted(newScore) : "Nodes are added in the incorrect order!";
     node[size] = newNode;
     score[size] = newScore;
     ++size;
   }
 
+  private boolean isSorted(float newScore) {
+    if (size > 0) {
+      float previousScore = score[size - 1];
+      return ((scoresDescOrder && (previousScore >= newScore))
+          || (scoresDescOrder == false && (previousScore <= newScore)));

Review Comment:
   It seems to be a [XNOR](https://en.wikipedia.org/wiki/XNOR_gate) operation.
   (A & B) | (!A & !B) => A == B
   So it can be changed to a simple form as follows:
   ```java
   return (previousScore == newScore) || (scoresDescOrder == (previousScore > newScore));
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] msokolov commented on pull request #11784: NeighborArray is now fixed size

Posted by GitBox <gi...@apache.org>.
msokolov commented on PR #11784:
URL: https://github.com/apache/lucene/pull/11784#issuecomment-1251146315

   Also -- now that I see this I realize that most likely we are never exercising this resize capability, so removing it won't really help performance / memory usage as I was hoping. But it still seems like a good cleanup?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


Re: [PR] NeighborArray is now fixed size [lucene]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #11784:
URL: https://github.com/apache/lucene/pull/11784#issuecomment-1880904342

   This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


Re: [PR] NeighborArray is now fixed size [lucene]

Posted by "mikemccand (via GitHub)" <gi...@apache.org>.
mikemccand commented on PR #11784:
URL: https://github.com/apache/lucene/pull/11784#issuecomment-1790565433

   Wow, lots of fun discussion here, including specifics of how Java conditionals are evaluated.  @msokolov is this still relevant?  The HNSW code has been red-hot lately; maybe this change was already effectively done?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] msokolov commented on pull request #11784: NeighborArray is now fixed size

Posted by GitBox <gi...@apache.org>.
msokolov commented on PR #11784:
URL: https://github.com/apache/lucene/pull/11784#issuecomment-1251132612

   > I was thinking a better approach would be to leave it to uses of NeighborArray to define maxSize, and not add +1 in the NeighborArray class itself as this PR suggests
   
   I guess I was thinking that since this class only has a single use, it wouldn't matter? But it definitely is better encapsulation to move the sizing logic to the place where we know how many we need. +1 to have consumers do it, especially since at least in one place they already do :) I'll follow up with a patch


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mayya-sharipova commented on pull request #11784: NeighborArray is now fixed size

Posted by GitBox <gi...@apache.org>.
mayya-sharipova commented on PR #11784:
URL: https://github.com/apache/lucene/pull/11784#issuecomment-1251045793

   @msokolov Thanks for tackling this. I was also thinking to remove `NeighborArray` of resizing, which makes logic simplier.
   
   I was thinking a better approach would be to leave it to `NeighborArray` users to define `maxSize`, and not add +1  in the `NeighborArray` class itself as this PR suggests. For example, [OnHeapHnswGraph](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java#L62-L66) already adds +1 when creating `NeighborArray`.
   
   What do you think?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] danmuzi commented on a diff in pull request #11784: NeighborArray is now fixed size

Posted by GitBox <gi...@apache.org>.
danmuzi commented on code in PR #11784:
URL: https://github.com/apache/lucene/pull/11784#discussion_r978324992


##########
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##########
@@ -46,27 +45,23 @@ public NeighborArray(int maxSize, boolean descOrder) {
    * nodes.
    */
   public void add(int newNode, float newScore) {
-    if (size == node.length) {
-      node = ArrayUtil.grow(node);
-      score = ArrayUtil.growExact(score, node.length);
-    }
-    if (size > 0) {
-      float previousScore = score[size - 1];
-      assert ((scoresDescOrder && (previousScore >= newScore))
-              || (scoresDescOrder == false && (previousScore <= newScore)))
-          : "Nodes are added in the incorrect order!";
-    }
+    assert isSorted(newScore) : "Nodes are added in the incorrect order!";
     node[size] = newNode;
     score[size] = newScore;
     ++size;
   }
 
+  private boolean isSorted(float newScore) {
+    if (size > 0) {
+      float previousScore = score[size - 1];
+      return ((scoresDescOrder && (previousScore >= newScore))
+          || (scoresDescOrder == false && (previousScore <= newScore)));

Review Comment:
   It seems to be an [XNOR](https://en.wikipedia.org/wiki/XNOR_gate) operation.
   (A & B) | (!A & !B) => A == B
   So it can be changed to a simple form as follows:
   ```java
   return (previousScore == newScore) || (scoresDescOrder == (previousScore > newScore));
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] danmuzi commented on a diff in pull request #11784: NeighborArray is now fixed size

Posted by GitBox <gi...@apache.org>.
danmuzi commented on code in PR #11784:
URL: https://github.com/apache/lucene/pull/11784#discussion_r978324992


##########
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##########
@@ -46,27 +45,23 @@ public NeighborArray(int maxSize, boolean descOrder) {
    * nodes.
    */
   public void add(int newNode, float newScore) {
-    if (size == node.length) {
-      node = ArrayUtil.grow(node);
-      score = ArrayUtil.growExact(score, node.length);
-    }
-    if (size > 0) {
-      float previousScore = score[size - 1];
-      assert ((scoresDescOrder && (previousScore >= newScore))
-              || (scoresDescOrder == false && (previousScore <= newScore)))
-          : "Nodes are added in the incorrect order!";
-    }
+    assert isSorted(newScore) : "Nodes are added in the incorrect order!";
     node[size] = newNode;
     score[size] = newScore;
     ++size;
   }
 
+  private boolean isSorted(float newScore) {
+    if (size > 0) {
+      float previousScore = score[size - 1];
+      return ((scoresDescOrder && (previousScore >= newScore))
+          || (scoresDescOrder == false && (previousScore <= newScore)));

Review Comment:
   It seems to be an [XNOR](https://en.wikipedia.org/wiki/XNOR_gate) operation.
   (A & B) | (!A & !B) => A == B
   So it can be changed to a simple form as follows:
   ```java
   return (previousScore == newScore) || (scoresDescOrder == (previousScore > newScore))
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


Re: [PR] NeighborArray is now fixed size [lucene]

Posted by "mikemccand (via GitHub)" <gi...@apache.org>.
mikemccand commented on PR #11784:
URL: https://github.com/apache/lucene/pull/11784#issuecomment-1790576576

   Thanks @msokolov.  This looks like a nice tool, helpful for giving demos of cool Lucene features at conferences, but it looks like consensus is we should not add it to Lucene?  Maybe luceneutil instead?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org