Posted to issues@lucene.apache.org by "ChrisHegarty (via GitHub)" <gi...@apache.org> on 2023/05/24 11:52:18 UTC

[GitHub] [lucene] ChrisHegarty opened a new pull request, #12311: Integrate the Incubating Panama Vector API

ChrisHegarty opened a new pull request, #12311:
URL: https://github.com/apache/lucene/pull/12311

   Leverage accelerated vector hardware instructions in Vector Search.
   
   Lucene already has a mechanism that enables the use of non-final JDK APIs, currently used for the Previewing Panama Foreign API. This change expands that mechanism to include the Incubating Panama Vector API. When the `jdk.incubator.vector` module is present at run time, the Panama-ized version of the low-level primitives used by Vector Search is enabled. If it is not present, the default scalar version of these low-level primitives is used (as before).
   
   Currently, we're only targeting support for JDK 20. A subsequent PR should evaluate JDK 21, which is still in development.
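
The run-time selection described above can be sketched as follows. This is a minimal illustration, not the PR's code: the class name `VectorModuleCheck` and the string labels are hypothetical, and a real provider lookup would return an implementation object rather than a label. Only `ModuleLayer.boot().findModule(...)` and the module name `jdk.incubator.vector` are taken as given.

```java
import java.util.Optional;

/** Hypothetical sketch: choose a vectorized or scalar implementation at run time. */
final class VectorModuleCheck {

  /** Name of the incubating Vector API module, as given in the PR description. */
  static final String VECTOR_MODULE = "jdk.incubator.vector";

  /** Returns true if the named module is resolved in the boot layer. */
  static boolean isModulePresent(String moduleName) {
    Optional<Module> module = ModuleLayer.boot().findModule(moduleName);
    return module.isPresent();
  }

  /** Picks an implementation label; a real lookup would return a provider object instead. */
  static String chooseImplementation() {
    return isModulePresent(VECTOR_MODULE) ? "vectorized" : "scalar";
  }

  public static void main(String[] args) {
    System.out.println("Selected implementation: " + chooseImplementation());
  }
}
```

On a JDK that ships the incubator module, running this with `--add-modules jdk.incubator.vector` should flip the result from the scalar label to the vectorized one.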


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555947416

   @uschindler - Thanks for all the work, cleanup, log messages, etc. Looks great.
   
   Here's what I'm hoping to get to, might be tomorrow at this stage.
   
   1. I'm in the process of preparing a luceneutil run - still downloading on my Linux box. Then I'll try to get some comparative numbers from the benchmark - I'm not quite sure exactly what to do here, but I've run this before a while back so I'll figure it out.
   
   2. Run JMH microbench to compare and validate slight modifications of the Panama-ized dotProduct - as mentioned by Robert.
   
   I'll post results when I have them, but it probably will not be today. (I'm also quite time-limited early next week.)




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1556285923

   With latest commits to that vectorbench I see this on my m1:
   ```
   Benchmark                             (size)   Mode  Cnt   Score   Error   Units
   DotProductBenchmark.dotProductNew       1024  thrpt    5   9.463 ± 0.004  ops/us
   DotProductBenchmark.dotProductNewNew    1024  thrpt    5  16.106 ± 0.048  ops/us
   DotProductBenchmark.dotProductOld       1024  thrpt    5   3.828 ± 0.003  ops/us
   ```
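
To put the scores above in relative terms, the throughput ratios can be computed directly. The sketch below only divides the numbers quoted in the table; the class and method names are made up for illustration. It gives roughly 2.5x for `dotProductNew` and roughly 4.2x for `dotProductNewNew` over `dotProductOld`.

```java
/** Computes throughput ratios from the benchmark scores quoted above. */
final class SpeedupFromScores {

  /** Ratio of two throughput scores (ops/us); a value above 1 means the candidate is faster. */
  static double speedup(double candidateOpsPerUs, double baselineOpsPerUs) {
    return candidateOpsPerUs / baselineOpsPerUs;
  }

  public static void main(String[] args) {
    double old = 3.828;     // DotProductBenchmark.dotProductOld
    double newer = 9.463;   // DotProductBenchmark.dotProductNew
    double newest = 16.106; // DotProductBenchmark.dotProductNewNew
    System.out.printf("New vs Old:    %.2fx%n", speedup(newer, old));
    System.out.printf("NewNew vs Old: %.2fx%n", speedup(newest, old));
  }
}
```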




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199048935


##########
lucene/core/src/java/org/apache/lucene/internal/vector/DefaultVectorUtilProvider.java:
##########
@@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.internal.vector;
+
+/**
+ * The default VectorUtil provider implementation.
+ *
+ * @lucene.internal
+ */
+public final class DefaultVectorUtilProvider implements VectorUtilProvider {

Review Comment:
   IMHO, this one should be package private like the Java 20 one.
   
   In general I am not really happy with the additional package. I know you are a fan of the module system, but most users still don't use it and our Javadocs also show all packages (including internal ones).
   
   Is it really necessary to have the vector stuff in a separate package? For MMapDirectory I made sure to hide everything, including the provider, as it is not public API.





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555188725

   > > In addition at some point we should rename the files, but thats not urgent because naming is not so important. We should then also rename the extraction gradle script, as it will be used not only for panama-foreign. But this is cosmetic only.
   > 
   > Agreed. I had a similar thought when starting out on this, but avoided it then because of the "noise" - but it's certainly worth doing now. I'll leave this to you @uschindler, unless I hear otherwise. Thanks.
   
   Yes. Let's hold off on this for now. It's just cosmetic and would conflict with the other open PR.
   
   Let me first remove the useless classes from the jar file.




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199053205


##########
lucene/core/src/java/org/apache/lucene/internal/vector/VectorUtilProvider.java:
##########
@@ -76,4 +77,10 @@ static boolean vectorModulePresentAndReadable() {
     }
     return false;
   }
+
+  // Workaround for JDK-8301190, avoids assertion when default locale is say tr.
+  @SuppressForbidden(reason = "required to determine if non-workable locale")
+  static boolean useVectorAPI() {
+    return 'I' == int.class.getSimpleName().toUpperCase().charAt(0);

Review Comment:
   This confused me at first; it's really funny.
   
   How about a simpler: `return Objects.equals("I", "i".toUpperCase());`
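
The reason upper-casing "i" works as a probe is that Java's locale-sensitive case mapping turns "i" into the dotted capital "İ" (U+0130) under Turkish and Azeri rules. The sketch below shows this with an explicit locale argument, so the result does not depend on the machine's default locale; the class and method names are hypothetical and this is not the code from the PR.

```java
import java.util.Locale;

/** Demonstrates why locale-sensitive upper-casing of "i" is a usable probe. */
final class TurkishLocaleProbe {

  /** Returns true if upper-casing "i" under the given locale yields plain ASCII "I". */
  static boolean upperCasesToAsciiI(Locale locale) {
    return "I".equals("i".toUpperCase(locale));
  }

  public static void main(String[] args) {
    System.out.println("root: " + upperCasesToAsciiI(Locale.ROOT));
    System.out.println("tr:   " + upperCasesToAsciiI(Locale.forLanguageTag("tr")));
    System.out.println("az:   " + upperCasesToAsciiI(Locale.forLanguageTag("az")));
  }
}
```

The no-argument `toUpperCase()` in the suggestion above uses the default locale, which is exactly the setting the workaround needs to inspect.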





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1554742703

   In addition, as we do not implement Java 19 vector support yet, I would add some code so that it is not extracted, depending on the Java version. That way we can control separately which of the two API areas (foreign / vector) are extracted per version.




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555877992

   It would also be interesting to run Mike's benchmark (the vector part).




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559566170

   > Agreed. I don't think we care enough to try this - what's the point. Pushed [74a9782](https://github.com/apache/lucene/pull/12311/commits/74a9782240dc3b0eb99f19c324be304117371e46)
   
   Cool, thanks. I was about to do the same, but my idea was a bit different: add a normal virtual method "isSupported()" to the interface and implement it to return true for the default provider, and something depending on the vector size for the Panama provider. This would avoid doing the additional reflection through the lookup twice. The lookup function would only return the Panama provider if the method returns true.
   
   Another approach is to use Lookup#findStaticVarHandle() on `INT_SPECIES_PREF_BIT_SIZE` and read it. This spares catching `Throwable`, which is one of the things I hate about method handles.
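
The second approach can be sketched like this. `FakePanamaProvider` and its value 256 are stand-ins invented for the example; only the field name `INT_SPECIES_PREF_BIT_SIZE` comes from the comment above, and how the real provider declares it is an assumption. The point is that `VarHandle.get()` declares no checked exceptions, so only the lookup step needs a `catch`.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

/** Hypothetical stand-in for a provider class exposing a static size constant. */
final class FakePanamaProvider {
  static final int INT_SPECIES_PREF_BIT_SIZE = 256; // made-up value for the sketch
}

final class StaticVarHandleSketch {

  /** Reads a static int field via a VarHandle obtained from the caller's lookup. */
  static int readStaticInt(Class<?> owner, String fieldName) {
    try {
      VarHandle handle =
          MethodHandles.lookup().findStaticVarHandle(owner, fieldName, int.class);
      // VarHandle.get() declares no checked exceptions, so no catch (Throwable) is needed.
      return (int) handle.get();
    } catch (NoSuchFieldException | IllegalAccessException e) {
      throw new LinkageError("field " + fieldName + " is missing or inaccessible", e);
    }
  }

  public static void main(String[] args) {
    int bits = readStaticInt(FakePanamaProvider.class, "INT_SPECIES_PREF_BIT_SIZE");
    System.out.println("preferred bit size = " + bits);
  }
}
```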




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561976995

   Hi, I added spec-conformant support for superclasses/interfaces to the extractor (recursively collecting classes).
   I also added a CHANGES.txt entry and merged the main branch.
   
   Please have a final look and review.




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1203652972


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java:
##########
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.security.AccessController;
+import java.security.PrivilegedAction;
+import java.util.Locale;
+import java.util.Objects;
+import java.util.logging.Logger;
+
+/** A provider of VectorUtil implementations. */
+interface VectorUtilProvider {
+
+  /** Calculates the dot product of the given float arrays. */
+  float dotProduct(float[] a, float[] b);
+
+  /** Returns the cosine similarity between the two vectors. */
+  float cosine(float[] v1, float[] v2);
+
+  /** Returns the sum of squared differences of the two vectors. */
+  float squareDistance(float[] a, float[] b);
+
+  /** Returns the dot product computed over signed bytes. */
+  int dotProduct(byte[] a, byte[] b);
+
+  /** Returns the cosine similarity between the two byte vectors. */
+  float cosine(byte[] a, byte[] b);
+
+  /** Returns the sum of squared differences of the two byte vectors. */
+  int squareDistance(byte[] a, byte[] b);
+
+  // -- provider lookup mechanism
+
+  static final Logger LOG = Logger.getLogger(VectorUtilProvider.class.getName());
+
+  static VectorUtilProvider lookup() {
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20) {
+      // is locale sane (only buggy in Java 20)
+      if (runtimeVersion <= 20 && !hasWorkingDefaultLocale()) {
+        LOG.warning(
+            "Java runtime is using a buggy default locale; Java vector incubator API can't be enabled: "
+                + Locale.getDefault());
+        return new VectorUtilDefaultProvider();
+      }
+      // is the incubator module present and readable (JVM providers may to exclude them or it is
+      // build with jlink)
+      if (!vectorModulePresentAndReadable()) {
+        LOG.warning(
+            "Java vector incubator module is not readable. For optimal vector performance, pass '--add-modules jdk.incubator.vector' to enable Vector API.");
+        return new VectorUtilDefaultProvider();
+      }
+      if (isClientVM()) {
+        LOG.warning("C2 compiler is disabled; Java vector incubator API can't be enabled");
+        return new VectorUtilDefaultProvider();
+      }
+      try {
+        // we use method handles with lookup, so we do not need to deal with setAccessible as we
+        // have private access through the lookup:
+        final var lookup = MethodHandles.lookup();
+        final var cls = lookup.findClass("org.apache.lucene.util.VectorUtilPanamaProvider");
+        final var constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
+        try {
+          return (VectorUtilProvider) constr.invoke();
+        } catch (UnsupportedOperationException uoe) {
+          // not supported because preferred vector size too small or similar
+          LOG.warning("Java vector incubator API was not enabled. " + uoe.getMessage());
+          return new VectorUtilDefaultProvider();
+        } catch (RuntimeException | Error e) {
+          throw e;
+        } catch (Throwable th) {
+          throw new AssertionError(th);
+        }
+      } catch (NoSuchMethodException | IllegalAccessException e) {
+        throw new LinkageError(
+            "VectorUtilPanamaProvider is missing correctly typed constructor", e);
+      } catch (ClassNotFoundException cnfe) {
+        throw new LinkageError("VectorUtilPanamaProvider is missing in Lucene JAR file", cnfe);
+      }
+    } else if (runtimeVersion >= 21) {
+      LOG.warning(
+          "You are running with Java 21 or later. To make full use of the Vector API, please update Apache Lucene.");
+    }
+    return new VectorUtilDefaultProvider();
+  }
+
+  private static boolean vectorModulePresentAndReadable() {
+    var opt =
+        ModuleLayer.boot().modules().stream()
+            .filter(m -> m.getName().equals("jdk.incubator.vector"))
+            .findFirst();
+    if (opt.isPresent()) {
+      VectorUtilProvider.class.getModule().addReads(opt.get());
+      return true;
+    }
+    return false;
+  }
+
+  // Workaround for JDK-8301190, avoids assertion when default locale is say tr.
+  private static boolean hasWorkingDefaultLocale() {
+    return Runtime.version().update() > 1

Review Comment:
   Thanks @uschindler 





[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561155494

   Now on "QEMU virtual cpu" or similar with just SSE and nothing fancy, you don't get a crazy slowdown for integer functions but still enjoy a nice speedup for floating point:
   
   ```
   Benchmark                               (size)   Mode  Cnt  Score   Error   Units
   FloatCosineBenchmark.cosineNew            1024  thrpt    5  3.443 ± 0.022  ops/us
   FloatCosineBenchmark.cosineOld            1024  thrpt    5  0.686 ± 0.046  ops/us
   FloatDotProductBenchmark.dotProductNew    1024  thrpt    5  8.155 ± 0.087  ops/us
   FloatDotProductBenchmark.dotProductOld    1024  thrpt    5  1.915 ± 0.020  ops/us
   FloatSquareBenchmark.squareNew            1024  thrpt    5  6.507 ± 0.192  ops/us
   FloatSquareBenchmark.squareOld            1024  thrpt    5  1.115 ± 0.014  ops/us
   ```
   
   ```
   flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc nopl xtopology cpuid tsc_known_freq pni cx16 x2apic hypervisor lahf_lm 
   ```
   
   Again, I press hard on this because I see such virtualization configurations all the time (besides this just being QEMU's default). Often the use case is to support VM migrations across different hardware: that's why it defaults to a restricted set of features.




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561251934

   Sanity check against Robert's results (no surprises).
   
   ```
   $ jdk-20.0.1/bin/java -XX:UseAVX=0  -jar target/vectorbench.jar  -psize=1024
   
   Benchmark                                (size)   Mode  Cnt   Score   Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5   1.212 ± 0.002  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5   1.212 ± 0.001  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5   3.428 ± 0.010  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5   3.228 ± 0.020  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5   2.597 ± 0.006  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5   2.608 ± 0.009  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5   5.350 ± 0.003  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5   1.073 ± 0.002  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5  14.443 ± 0.011  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5   3.742 ± 0.005  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5  10.316 ± 0.009  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5   2.655 ± 0.002  ops/us
   ```




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198252072


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtil.java:
##########
@@ -270,4 +216,134 @@ public static float dotProductScore(byte[] a, byte[] b) {
     float denom = (float) (a.length * (1 << 15));
     return 0.5f + dotProduct(a, b) / denom;
   }
+
+  interface VectorUtilProvider {
+
+    // just dot product for now
+    float dotProduct(float[] a, float[] b);
+  }
+
+  private static VectorUtilProvider lookupProvider() {
+    // TODO: add a check
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20) { // TODO: do we want JDK 19?
+      try {
+        ensureReadability();
+        final var lookup = MethodHandles.lookup();
+        final var cls = lookup.findClass("org.apache.lucene.util.JDKVectorUtilProvider");
+        // we use method handles, so we do not need to deal with setAccessible as we have private
+        // access through the lookup:
+        final var constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
+        try {
+          return (VectorUtilProvider) constr.invoke();
+        } catch (RuntimeException | Error e) {
+          throw e;
+        } catch (Throwable th) {
+          throw new AssertionError(th);
+        }
+      } catch (NoSuchMethodException | IllegalAccessException e) {
+        throw new LinkageError("JDKVectorUtilProvider is missing correctly typed constructor", e);
+      } catch (ClassNotFoundException cnfe) {
+        throw new LinkageError("JDKVectorUtilProvider is missing in Lucene JAR file", cnfe);
+      }
+    } else if (runtimeVersion >= 21) {
+      LOG.warning(
+          "You are running with Java 21 or later. To make full use of the Vector API, please update Apache Lucene.");
+    }
+    return new LuceneVectorUtilProvider();
+  }
+
+  // Extracted to a method to be able to apply the SuppressForbidden annotation
+  @SuppressWarnings("removal")
+  @SuppressForbidden(reason = "security manager")
+  private static <T> T doPrivileged(PrivilegedAction<T> action) {
+    return AccessController.doPrivileged(action);
+  }
+
+  static void ensureReadability() {
+    ModuleLayer.boot().modules().stream()
+        .filter(m -> m.getName().equals("jdk.incubator.vector"))
+        .findFirst()
+        .ifPresentOrElse(
+            vecMod -> VectorUtilProvider.class.getModule().addReads(vecMod),
+            () -> LOG.warning("vector incubator module not present"));
+  }
+
+  static {
+    PROVIDER =

Review Comment:
   doPriv is not needed, removed.





[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557639326

   btw, i think there's a real bug in the SPECIES_PREFERRED stuff that makes testing such degenerate cases *really difficult*. You should be able to just pass `-XX:MaxVectorSize=8` or `-XX:UseAVX=0` or similar to test it out, but this won't change SPECIES_PREFERRED, only make it dog slow. it's like it gets the wrong information from the compiler.
   
   the only way i know right now is to spin up QEMU and take cpu flags away :)




[GitHub] [lucene] alessandrobenedetti commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "alessandrobenedetti (via GitHub)" <gi...@apache.org>.
alessandrobenedetti commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1564575142

   thanks @uschindler for the explanation, I appreciate the work you are doing! 




[GitHub] [lucene] mikemccand commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "mikemccand (via GitHub)" <gi...@apache.org>.
mikemccand commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1575497458

   Thanks @markmiller.
   
   Does the Panama API give us any insight into the underlying capabilities of the CPU?  Not just which versions/widths of SIMD instructions are supported (in a general cross-platform sort of way I guess), but also how much "true" concurrency of these SIMD instructions is supported?  It seems like an important detail of the bare metal that should somehow be accessible up in the clouds of javaland... e.g. it would inform how much intra-query concurrency an app should try to use with KNN queries.




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1575498632

   > Does the Panama API give us any insight into the underlying capabilities of the CPU? Not just which versions/widths of SIMD instructions are supported (in a general cross-platform sort of way I guess), but also how much "true" concurrency of these SIMD instructions is supported?
   
   Unfortunately, no. We get the preferred species size and number of lanes.
   
   But everything that does not work on hardware is emulated in damn slow interpreted code.
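
For reference, the two values mentioned here (preferred species size and lane count) can be read from the species object. The following is a minimal sketch, not Lucene code: the class name `PreferredSpeciesProbe` is made up for illustration, and reflection is used only so that the example still compiles and runs when `jdk.incubator.vector` has not been added with `--add-modules`.

```java
import java.util.Optional;

/** Hypothetical probe for illustration; not part of Lucene. */
final class PreferredSpeciesProbe {

  /**
   * Returns {vectorBitSize, laneCount} of FloatVector.SPECIES_PREFERRED,
   * or empty if the incubator module is not available at run time.
   */
  static Optional<int[]> preferredFloatShape() {
    try {
      Class<?> floatVector = Class.forName("jdk.incubator.vector.FloatVector");
      Class<?> speciesType = Class.forName("jdk.incubator.vector.VectorSpecies");
      Object species = floatVector.getField("SPECIES_PREFERRED").get(null);
      Integer bits = (Integer) speciesType.getMethod("vectorBitSize").invoke(species);
      Integer lanes = (Integer) speciesType.getMethod("length").invoke(species);
      return Optional.of(new int[] {bits, lanes});
    } catch (ReflectiveOperationException | LinkageError e) {
      // Module not resolved (no --add-modules) or the class failed to initialize.
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    preferredFloatShape().ifPresentOrElse(
        s -> System.out.println("preferred float species: " + s[0] + " bits, " + s[1] + " lanes"),
        () -> System.out.println("jdk.incubator.vector is not available"));
  }
}
```

Neither value says anything about how many SIMD execution units the CPU has, which is consistent with the answer above.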
   




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199035703


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,10 +119,13 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
+
+      // Disable assertions to workaround JDK-8301190
+      jvmArgs '-da:jdk.incubator.vector.LaneType'

Review Comment:
   I went with option 2: fall back to the scalar implementation (i.e. pretend the Vector API is not enabled) if the user has a Turkish or Azeri default locale and the JDK version is < 21.
   
   Clearly we can revisit this if it turns out to be problematic.  I'm hoping not - and we should be able to do something smarter for JDK 20.0.2, which has this JDK bug fixed.
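
A locale check of this kind can be sketched without hard-coding language tags. This is only an illustration under the assumption that the problem is tied to locale-sensitive case conversion of the letter "i", which is what sets Turkish and Azeri apart; the helper name `hasPlainCaseMapping` is hypothetical and the check actually used in the PR may differ.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; the check used in the PR may differ. */
final class LocaleCaseCheck {

  /** True if upper-casing "i" in the given locale yields plain ASCII "I". */
  static boolean hasPlainCaseMapping(Locale locale) {
    return "I".equals("i".toUpperCase(locale));
  }

  public static void main(String[] args) {
    for (String tag : new String[] {"en", "tr", "az"}) {
      Locale locale = Locale.forLanguageTag(tag);
      System.out.println(tag + " -> " + hasPlainCaseMapping(locale));
    }
  }
}
```

In Turkish and Azeri, "i" upper-cases to the dotted capital "İ" (U+0130), so the helper returns false for those locales and true for English or the root locale.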





[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1203631144


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java:
##########
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.security.AccessController;
+import java.security.PrivilegedAction;
+import java.util.Locale;
+import java.util.Objects;
+import java.util.logging.Logger;
+
+/** A provider of VectorUtil implementations. */
+interface VectorUtilProvider {
+
+  /** Calculates the dot product of the given float arrays. */
+  float dotProduct(float[] a, float[] b);
+
+  /** Returns the cosine similarity between the two vectors. */
+  float cosine(float[] v1, float[] v2);
+
+  /** Returns the sum of squared differences of the two vectors. */
+  float squareDistance(float[] a, float[] b);
+
+  /** Returns the dot product computed over signed bytes. */
+  int dotProduct(byte[] a, byte[] b);
+
+  /** Returns the cosine similarity between the two byte vectors. */
+  float cosine(byte[] a, byte[] b);
+
+  /** Returns the sum of squared differences of the two byte vectors. */
+  int squareDistance(byte[] a, byte[] b);
+
+  // -- provider lookup mechanism
+
+  static final Logger LOG = Logger.getLogger(VectorUtilProvider.class.getName());
+
+  static VectorUtilProvider lookup() {
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20) {
+      // is locale sane (only buggy in Java 20)
+      if (runtimeVersion <= 20 && !hasWorkingDefaultLocale()) {
+        LOG.warning(
+            "Java runtime is using a buggy default locale; Java vector incubator API can't be enabled: "
+                + Locale.getDefault());
+        return new VectorUtilDefaultProvider();
+      }
+      // is the incubator module present and readable (JVM providers may exclude it, or the
+      // runtime may be built with jlink)
+      if (!vectorModulePresentAndReadable()) {
+        LOG.warning(
+            "Java vector incubator module is not readable. For optimal vector performance, pass '--add-modules jdk.incubator.vector' to enable Vector API.");
+        return new VectorUtilDefaultProvider();
+      }
+      if (isClientVM()) {
+        LOG.warning("C2 compiler is disabled; Java vector incubator API can't be enabled");
+        return new VectorUtilDefaultProvider();
+      }
+      try {
+        // we use method handles with lookup, so we do not need to deal with setAccessible as we
+        // have private access through the lookup:
+        final var lookup = MethodHandles.lookup();
+        final var cls = lookup.findClass("org.apache.lucene.util.VectorUtilPanamaProvider");
+        final var constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
+        try {
+          return (VectorUtilProvider) constr.invoke();
+        } catch (UnsupportedOperationException uoe) {
+          // not supported because preferred vector size too small or similar
+          LOG.warning("Java vector incubator API was not enabled. " + uoe.getMessage());
+          return new VectorUtilDefaultProvider();
+        } catch (RuntimeException | Error e) {
+          throw e;
+        } catch (Throwable th) {
+          throw new AssertionError(th);
+        }
+      } catch (NoSuchMethodException | IllegalAccessException e) {
+        throw new LinkageError(
+            "VectorUtilPanamaProvider is missing correctly typed constructor", e);
+      } catch (ClassNotFoundException cnfe) {
+        throw new LinkageError("VectorUtilPanamaProvider is missing in Lucene JAR file", cnfe);
+      }
+    } else if (runtimeVersion >= 21) {
+      LOG.warning(
+          "You are running with Java 21 or later. To make full use of the Vector API, please update Apache Lucene.");
+    }
+    return new VectorUtilDefaultProvider();
+  }
+
+  private static boolean vectorModulePresentAndReadable() {
+    var opt =
+        ModuleLayer.boot().modules().stream()
+            .filter(m -> m.getName().equals("jdk.incubator.vector"))
+            .findFirst();
+    if (opt.isPresent()) {
+      VectorUtilProvider.class.getModule().addReads(opt.get());
+      return true;
+    }
+    return false;
+  }
+
+  // Workaround for JDK-8301190, avoids assertion when default locale is say tr.
+  private static boolean hasWorkingDefaultLocale() {
+    return Runtime.version().update() > 1

Review Comment:
   The motivation for colocating the update version check and the hacky locale check is that the two are strictly related. I would hope that we can simply remove this method at some point. I don't see how things would be better if we moved this up, but again I'm not tied to this particular structure, only to things working correctly on 20.0.2+ without the need for the locale check.
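
The version half of that check can be illustrated with `Runtime.Version`. This is a sketch under the assumption, taken from the comments in this thread, that only JDK 20 builds older than 20.0.2 need the locale workaround; the class and method names are made up.

```java
/** Hypothetical sketch of the version gate described above; names are illustrative. */
final class LocaleWorkaroundGate {

  /** True for JDK 20 builds older than 20.0.2, where the locale check would still apply. */
  static boolean needsLocaleCheck(Runtime.Version version) {
    return version.feature() == 20 && version.update() <= 1;
  }

  public static void main(String[] args) {
    for (String s : new String[] {"20", "20.0.1", "20.0.2", "21"}) {
      System.out.println(s + " -> " + needsLocaleCheck(Runtime.Version.parse(s)));
    }
    System.out.println("current runtime -> " + needsLocaleCheck(Runtime.version()));
  }
}
```

Keeping the update check next to the locale check means both can be deleted together once JDK 20.0.1 and earlier no longer matter.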





[GitHub] [lucene] mikemccand commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "mikemccand (via GitHub)" <gi...@apache.org>.
mikemccand commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1562944432

   Hmm should we add the generated Panama code to `.gitignore`?  My `./gradlew precommit` is failing with:
   
   ```
   * What went wrong:
   Execution failed for task ':checkWorkingCopyClean'.
   > Working copy is not a clean git checkout (skip with -Pvalidation.git.failOnModified=false), offending files:
       - gradle/generation/panama-foreign (untracked non-empty dir)
       - gradle/generation/panama-foreign/ExtractForeignAPI.java (untracked)
   ```




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1563412483

   Was this some kind of VM?
   
   Anyway, for larger vectors it should run for a longer time to get enough optimization rounds. The problem with the Vector API is that it needs much longer to get "warm". If you measure only a few seconds on each JVM, you are just measuring the warmup...
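
To make the warm-up point concrete, here is a hand-rolled sketch that runs untimed iterations before timing anything. It uses a plain scalar dot product and made-up names; a proper harness such as JMH is the better tool for real measurements, and the iteration counts are arbitrary assumptions.

```java
/** Hand-rolled timing sketch for illustration; prefer a harness such as JMH for real numbers. */
final class WarmupSketch {

  /** Plain scalar dot product used as the workload. */
  static float dotProduct(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /** Runs warmupIters untimed calls, then returns average nanoseconds per call over measuredIters. */
  static double nanosPerCall(float[] a, float[] b, int warmupIters, int measuredIters) {
    float sink = 0f; // accumulate results so the JIT cannot discard the calls
    for (int i = 0; i < warmupIters; i++) {
      sink += dotProduct(a, b);
    }
    long start = System.nanoTime();
    for (int i = 0; i < measuredIters; i++) {
      sink += dotProduct(a, b);
    }
    long elapsed = System.nanoTime() - start;
    if (Float.isNaN(sink)) {
      System.out.println("unexpected NaN"); // keeps 'sink' observable
    }
    return (double) elapsed / measuredIters;
  }

  public static void main(String[] args) {
    float[] a = new float[1024];
    float[] b = new float[1024];
    java.util.Arrays.fill(a, 0.5f);
    java.util.Arrays.fill(b, 0.25f);
    System.out.println("no warmup:   " + nanosPerCall(a, b, 0, 1_000) + " ns/call");
    System.out.println("with warmup: " + nanosPerCall(a, b, 200_000, 1_000) + " ns/call");
  }
}
```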




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198253504


##########
gradle/generation/panama-foreign.gradle:
##########
@@ -45,13 +45,14 @@ configure(project(":lucene:core")) {
           javaLauncher.get()
           return true
         } catch (Exception e) {
-          logger.warn('Launcher for Java {} is not available; skipping regeneration of Panama Foreign API JAR.', jdkVersion)
+          logger.warn('Launcher for Java {} is not available; skipping regeneration of Panama Foreign & Vector API JAR.', jdkVersion)
           logger.warn('Error: {}', e.cause?.message)
           logger.warn("Please make sure to point env 'JAVA{}_HOME' to exactly JDK version {} or enable Gradle toolchain auto-download.", jdkVersion, jdkVersion)
           return false
         }
       }
-      
+
+      jvmArgs = ["--add-modules=jdk.incubator.vector"]

Review Comment:
   Ah yes. Good catch. Removed.





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555895724

   There's something strange on my computer when running core tests with Java 20:
   
   :lucene:core:test (SUCCESS): 5730 test(s), 193 skipped
   The slowest tests (exceeding 500 ms) during this run:
     236.22s TestHnswFloatVectorGraph.testRamUsageEstimate (:lucene:core)
   
   Uwe




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201816458


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   Policeman Jenkins always specifies the `tests.args` Gradle property, so we could remove the `--add-modules` here. Policeman Jenkins could even go further and run Java 20+ with additional randomization, sometimes with vectors enabled (so just use `--add-modules jdk.incubator.vector` as an additional randomized setting together with garbage collector, OOP width, ...).
   
   We should still add a separate property for developers like `-Ptests.incubator.vectors=true` which then also enables C2 and the module.





[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201940561


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   I think we can duplicate the single test task like we do for "gradlew beast", which also creates clones of the test task.





[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560967020

   I started a conversation on panama-dev, beginning with the shape of the preferred species when combined with various command line flags. I deliberately kept this focused on `UseAVX=0`; I would like to get an understanding of that before moving on to C2, etc.
   
   https://mail.openjdk.org/pipermail/panama-dev/2023-May/019072.html
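
For experiments like this, the effective value of a HotSpot flag can be read at run time through `HotSpotDiagnosticMXBean`. This is a minimal sketch assuming a HotSpot-based JVM; the class name `VmFlagProbe` is made up, and which flags exist depends on the platform and VM build (`UseAVX`, for example, is only defined on x86).

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.Optional;

/** Hypothetical helper for illustration; assumes a HotSpot-based JVM. */
final class VmFlagProbe {

  /** Returns the current value of the named VM flag, or empty if it is unknown or unavailable. */
  static Optional<String> vmOption(String name) {
    try {
      HotSpotDiagnosticMXBean bean =
          ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
      if (bean == null) {
        return Optional.empty();
      }
      return Optional.of(bean.getVMOption(name).getValue());
    } catch (RuntimeException e) {
      // IllegalArgumentException is thrown for flags this VM does not define.
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    for (String flag : new String[] {"UseAVX", "MaxVectorSize", "TieredStopAtLevel"}) {
      System.out.println(flag + " = " + vmOption(flag).orElse("(not available)"));
    }
  }
}
```

Printing these values next to the preferred species makes it easier to see whether a flag such as `-XX:UseAVX=0` was picked up by the VM.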




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561025538

   we could even restrict the guard for the integer math to require 256 bits only for the  `os.arch=amd64` case. This way 128-bit vectors still work for the binary functions on ARM. Maybe this is more amenable to Uwe than checking UseAVX flag.
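
The proposed guard could look roughly like the following. This is a sketch of the idea only, with hypothetical names; the 128-bit lower bound is an assumption taken from the discussion elsewhere in this thread about disabling the provider for smaller vectors.

```java
/** Hypothetical sketch of the guard proposed above; not the code from the PR. */
final class IntegerVectorGuard {

  /**
   * Decides whether the vectorized integer (byte) functions should be used.
   *
   * @param osArch value of the "os.arch" system property
   * @param preferredBitSize bit size of the preferred vector species
   */
  static boolean useIntegerVectors(String osArch, int preferredBitSize) {
    if (preferredBitSize < 128) {
      return false; // assumed minimum, see the 128-bit discussion in this thread
    }
    if ("amd64".equals(osArch)) {
      return preferredBitSize >= 256; // stricter requirement only on x86-64
    }
    return true; // e.g. 128-bit vectors on ARM remain enabled
  }

  public static void main(String[] args) {
    String arch = System.getProperty("os.arch");
    System.out.println(arch + " with 128 bits -> " + useIntegerVectors(arch, 128));
    System.out.println(arch + " with 256 bits -> " + useIntegerVectors(arch, 256));
  }
}
```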




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557691360

   > btw, i think there's a real bug in the SPECIES_PREFERRED stuff that makes testing such degenerate cases _really difficult_. You should be able to just pass `-XX:MaxVectorSize=8` or `-XX:UseAVX=0` or similar to test it out: but this won't change SPECIES_PREFERRED, only make it dog slow. its like it gets the wrong information from the compiler.
   > 
   > the only way i know right now is to spin up QEMU and take cpu flags away :)
   
   Let's open a bug report. I can do it in the OpenJDK bug tracker.




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559872383

   Sorry ignore my last comment. The power of 2 was wrong.




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561281951

   thank you for benchmarking @msokolov 
   
   I will say, I am primarily concerned about the indexing speed, not search speed. that's the current pain point with high dimensional vectors IMO. Luceneutil hides it by only using 100 dims, but I have experienced it for myself, and it is miserable.
   
   So I'm really hoping this can reduce the pain there.




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559489000

   > disable panama provider unless user has at least 128-bit vectors? Nobody has benched the case where SPECIES_PREFERRED is 64-bits: that's the case where vectorization isnt enabled at all for some reason (VM, unsupported platform). I am suspicious it will do any good at all for these algorithms. we can try to test it with QEMU if we really care.
   
   Agreed. I don't think we care enough to try this - what's the point. Pushed [74a9782](https://github.com/apache/lucene/pull/12311/commits/74a9782240dc3b0eb99f19c324be304117371e46)




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561532363

   +1 I think we should try to get this merged in with just these simple vector search functions.
   
   these are the simplest of what we probably want to do, and look at all the crazy workarounds we had to do to prevent massive performance traps. 
   
   I'm terrified of trying to vectorize postings list decode or other parts of Lucene that might require more exotic functions, shuffles, etc., with no easy way to detect if they will be horribly slow or not. We should leave that kind of stuff to an additional PR; it is more of a research project in my view.
   




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1556286815

   With the latest commit on vectorbench, aad7b47, I see
   
   
   Linux-x64, Intel Rocket Lake: 11th Gen Intel Core i5-11400 @ 2.60GHz
   ```
   Benchmark                             (size)   Mode  Cnt   Score   Error   Units
   DotProductBenchmark.dotProductNew       1024  thrpt    5  21.460 ± 0.014  ops/us
   DotProductBenchmark.dotProductNewNew    1024  thrpt    5  24.994 ± 0.032  ops/us
   DotProductBenchmark.dotProductOld       1024  thrpt    5   3.481 ± 0.003  ops/us
   ```
   
   Apple M1
   ```
   Benchmark                             (size)   Mode  Cnt   Score   Error   Units
   DotProductBenchmark.dotProductNew       1024  thrpt    5   9.248 ± 0.021  ops/us
   DotProductBenchmark.dotProductNewNew    1024  thrpt    5  15.273 ± 0.662  ops/us
   DotProductBenchmark.dotProductOld       1024  thrpt    5   3.697 ± 0.071  ops/us
   ```
   
   NewNew seems to be the one! :-)




[GitHub] [lucene] msokolov commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "msokolov (via GitHub)" <gi...@apache.org>.
msokolov commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557119416

   When I had tried this before I did something like:
   
   ```
   +    ShortVector acc = ShortVector.zero(SHORT_SPECIES);
   +    int l = 0;
   +    for (; l < BYTE_SPECIES.loopBound(len); l += BYTE_SPECIES.length()) {
   +      ByteVector va = ByteVector.fromArray(BYTE_SPECIES, a.bytes, aOffset + l);
   +      ByteVector vb = ByteVector.fromArray(BYTE_SPECIES, b.bytes, bOffset + l);
   +
   +      Vector<Short> vas = va.castShape(SHORT_SPECIES, 0);
   +      Vector<Short> vbs = vb.castShape(SHORT_SPECIES, 0);
   +      acc = acc.add(vas.mul(vbs));
   +
   +      vas = va.castShape(SHORT_SPECIES, 1);
   +      vbs = vb.castShape(SHORT_SPECIES, 1);
   +      acc = acc.add(vas.mul(vbs));
   +    }
   +    long total = acc.reduceLanesToLong(VectorOperators.ADD);
   ```
   
   I don't have perf numbers any more - no idea whether this is better than what you have already - probably not, but it might be worth trying castShape?
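
One thing worth checking with any 16-bit accumulator is overflow: a product of two signed bytes fits in a `short`, but a running sum of such products may not. The scalar sketch below only illustrates that arithmetic; it is not a vectorized implementation and makes no claim about the performance of the snippet above. All names are made up.

```java
/** Scalar illustration of accumulator width for byte dot products; names are hypothetical. */
final class ByteDotProductSketch {

  /** Reference version: products are accumulated in a 32-bit int. */
  static int dotProductIntAcc(byte[] a, byte[] b) {
    int acc = 0;
    for (int i = 0; i < a.length; i++) {
      acc += a[i] * b[i];
    }
    return acc;
  }

  /** Same loop with a 16-bit accumulator, which silently wraps around on overflow. */
  static short dotProductShortAcc(byte[] a, byte[] b) {
    short acc = 0;
    for (int i = 0; i < a.length; i++) {
      acc += (short) (a[i] * b[i]);
    }
    return acc;
  }

  public static void main(String[] args) {
    byte[] v = {127, 127, 127, 127};
    System.out.println("int accumulator:   " + dotProductIntAcc(v, v));
    System.out.println("short accumulator: " + dotProductShortAcc(v, v));
  }
}
```

With four elements of value 127 the true result is 4 * 16129 = 64516, which exceeds `Short.MAX_VALUE`, so the 16-bit version wraps to a negative number.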




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559308415

   > Anybody with a real Luceneutil vector-benchmark of the whole luceneutil testsuite?
   
   I think the benefits won't look very impressive because it only uses 100 dims. This seems to be out of touch with people screaming for 2048 :)
   
   Also when using such non-power-of-two sizes, you can expect perf to not be as good. e.g. 1023 dims will be much slower than 1024.
   
   on a 256-bit avx, with float dot product on 1023 dims, we'll work 32 floats at a time (the very fast loop with 4 accumulators), then there's 31 still left over, we'll knock out 24 of them with a slower vector loop with a single accumulator, then the remaining 7 are processed scalar. With 1024 dims, they all just get processed with the very fast loop and nothing is left over.
   
   you can see this stuff in benchmarks above where e.g. 128 dims is faster than 100, etc.
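
The arithmetic in that breakdown can be reproduced with a few lines. This sketch only computes how many elements each of the three loops would handle, assuming a 4x-unrolled main loop as described above; the class and method names are made up and no actual vector code is involved.

```java
import java.util.Arrays;

/** Hypothetical helper that reproduces the loop split described above. */
final class LoopSplitSketch {

  /**
   * Returns {elements handled by the 4x-unrolled vector loop,
   * elements handled by the single-accumulator vector loop,
   * elements handled by the scalar tail}.
   */
  static int[] split(int dims, int lanes) {
    int unrolledStep = 4 * lanes;
    int unrolled = dims / unrolledStep * unrolledStep;
    int vector = (dims - unrolled) / lanes * lanes;
    int tail = dims - unrolled - vector;
    return new int[] {unrolled, vector, tail};
  }

  public static void main(String[] args) {
    int lanes = 256 / Float.SIZE; // 8 float lanes in a 256-bit vector
    for (int dims : new int[] {100, 128, 1023, 1024}) {
      System.out.println(dims + " dims -> " + Arrays.toString(split(dims, lanes)));
    }
  }
}
```

For 1023 dims and 8 lanes this gives 992 elements in the unrolled loop, 24 in the single-accumulator loop, and 7 in the scalar tail; for 1024 dims everything lands in the unrolled loop.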




[GitHub] [lucene] markrmiller commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "markrmiller (via GitHub)" <gi...@apache.org>.
markrmiller commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1575017272

   They killed AVX-512 on consumer hardware a while back. And they really
   don’t want you to have it. First they killed it and the BIOS makers figured
   out how you could enable it anyway, and so then they hardware killed it.
   Brilliant move that I bet is just funneling business to their server CPUs
   hand over fist.
   
   On Sat, Jun 3, 2023 at 9:33 AM Michael McCandless ***@***.***>
   wrote:
   
   > I ran @rmuir <https://github.com/rmuir>'s vectorbench on a new Raptor
   > Lake build (i9-13900K).
   >
   > Note that this CPU does NOT seem to support AVX-512:
   >
   > flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req hfi umip pku ospke waitpkg gfni vaes vpclmulqdq tme rdpid movdiri movdir64b fsrm md_clear serialize pconfig arch_lbr ibt flush_l1d arch_capabilities
   > vmx flags       : vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid ple shadow_vmcs ept_mode_based_exec tsc_scaling usr_wait_pause
   >
   > Benchy results:
   >
   > Benchmark                                (size)   Mode  Cnt    Score    Error   Units
   > BinaryCosineBenchmark.cosineDistanceNew       1  thrpt    5  128.894 ±  0.021  ops/us
   > BinaryCosineBenchmark.cosineDistanceNew     128  thrpt    5   64.367 ±  0.062  ops/us
   > BinaryCosineBenchmark.cosineDistanceNew     207  thrpt    5   37.563 ±  0.030  ops/us
   > BinaryCosineBenchmark.cosineDistanceNew     256  thrpt    5   36.636 ±  0.004  ops/us
   > BinaryCosineBenchmark.cosineDistanceNew     300  thrpt    5   31.305 ±  0.012  ops/us
   > BinaryCosineBenchmark.cosineDistanceNew     512  thrpt    5   20.769 ±  0.008  ops/us
   > BinaryCosineBenchmark.cosineDistanceNew     702  thrpt    5   13.757 ±  0.017  ops/us
   > BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5   10.201 ±  0.008  ops/us
   > BinaryCosineBenchmark.cosineDistanceOld       1  thrpt    5  128.889 ±  0.095  ops/us
   > BinaryCosineBenchmark.cosineDistanceOld     128  thrpt    5   14.177 ±  0.064  ops/us
   > BinaryCosineBenchmark.cosineDistanceOld     207  thrpt    5    8.967 ±  0.023  ops/us
   > BinaryCosineBenchmark.cosineDistanceOld     256  thrpt    5    7.295 ±  0.014  ops/us
   > BinaryCosineBenchmark.cosineDistanceOld     300  thrpt    5    6.215 ±  0.034  ops/us
   > BinaryCosineBenchmark.cosineDistanceOld     512  thrpt    5    3.709 ±  0.003  ops/us
   > BinaryCosineBenchmark.cosineDistanceOld     702  thrpt    5    2.714 ±  0.005  ops/us
   > BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5    1.872 ±  0.001  ops/us
   > BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5   23.562 ±  0.025  ops/us
   > BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5    3.885 ±  0.082  ops/us
   > BinarySquareBenchmark.squareDistanceNew       1  thrpt    5  522.630 ±  4.081  ops/us
   > BinarySquareBenchmark.squareDistanceNew     128  thrpt    5  101.951 ±  4.427  ops/us
   > BinarySquareBenchmark.squareDistanceNew     207  thrpt    5   65.050 ±  0.254  ops/us
   > BinarySquareBenchmark.squareDistanceNew     256  thrpt    5   60.495 ±  0.922  ops/us
   > BinarySquareBenchmark.squareDistanceNew     300  thrpt    5   51.767 ±  0.042  ops/us
   > BinarySquareBenchmark.squareDistanceNew     512  thrpt    5   32.832 ±  0.022  ops/us
   > BinarySquareBenchmark.squareDistanceNew     702  thrpt    5   23.786 ±  0.018  ops/us
   > BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5   17.062 ±  0.145  ops/us
   > BinarySquareBenchmark.squareDistanceOld       1  thrpt    5  529.882 ±  1.503  ops/us
   > BinarySquareBenchmark.squareDistanceOld     128  thrpt    5   32.478 ±  0.037  ops/us
   > BinarySquareBenchmark.squareDistanceOld     207  thrpt    5   20.901 ±  0.023  ops/us
   > BinarySquareBenchmark.squareDistanceOld     256  thrpt    5   16.644 ±  0.070  ops/us
   > BinarySquareBenchmark.squareDistanceOld     300  thrpt    5   14.502 ±  0.111  ops/us
   > BinarySquareBenchmark.squareDistanceOld     512  thrpt    5    8.703 ±  0.050  ops/us
   > BinarySquareBenchmark.squareDistanceOld     702  thrpt    5    6.473 ±  0.013  ops/us
   > BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5    4.454 ±  0.016  ops/us
   > FloatCosineBenchmark.cosineNew                1  thrpt    5  395.222 ±  3.364  ops/us
   > FloatCosineBenchmark.cosineNew                4  thrpt    5  275.572 ±  2.528  ops/us
   > FloatCosineBenchmark.cosineNew                6  thrpt    5  217.377 ±  0.561  ops/us
   > FloatCosineBenchmark.cosineNew                8  thrpt    5  192.311 ±  1.492  ops/us
   > FloatCosineBenchmark.cosineNew               13  thrpt    5  143.959 ±  0.061  ops/us
   > FloatCosineBenchmark.cosineNew               16  thrpt    5  127.340 ±  0.181  ops/us
   > FloatCosineBenchmark.cosineNew               25  thrpt    5  116.471 ±  0.219  ops/us
   > FloatCosineBenchmark.cosineNew               32  thrpt    5  117.458 ±  0.031  ops/us
   > FloatCosineBenchmark.cosineNew               64  thrpt    5  100.845 ±  0.016  ops/us
   > FloatCosineBenchmark.cosineNew              100  thrpt    5   73.392 ±  0.129  ops/us
   > FloatCosineBenchmark.cosineNew              128  thrpt    5   79.363 ±  0.012  ops/us
   > FloatCosineBenchmark.cosineNew              207  thrpt    5   48.170 ±  0.027  ops/us
   > FloatCosineBenchmark.cosineNew              256  thrpt    5   43.298 ±  0.103  ops/us
   > FloatCosineBenchmark.cosineNew              300  thrpt    5   36.302 ±  0.017  ops/us
   > FloatCosineBenchmark.cosineNew              512  thrpt    5   26.113 ±  0.076  ops/us
   > FloatCosineBenchmark.cosineNew              702  thrpt    5   19.034 ±  0.008  ops/us
   > FloatCosineBenchmark.cosineNew             1024  thrpt    5   16.945 ±  0.052  ops/us
   > FloatCosineBenchmark.cosineOld                1  thrpt    5  398.987 ±  0.675  ops/us
   > FloatCosineBenchmark.cosineOld                4  thrpt    5  279.282 ±  4.223  ops/us
   > FloatCosineBenchmark.cosineOld                6  thrpt    5  220.884 ±  5.144  ops/us
   > FloatCosineBenchmark.cosineOld                8  thrpt    5  196.722 ±  0.389  ops/us
   > FloatCosineBenchmark.cosineOld               13  thrpt    5  146.701 ±  0.832  ops/us
   > FloatCosineBenchmark.cosineOld               16  thrpt    5  130.186 ±  0.280  ops/us
   > FloatCosineBenchmark.cosineOld               25  thrpt    5   87.526 ±  0.083  ops/us
   > FloatCosineBenchmark.cosineOld               32  thrpt    5   70.398 ±  0.124  ops/us
   > FloatCosineBenchmark.cosineOld               64  thrpt    5   35.020 ±  0.007  ops/us
   > FloatCosineBenchmark.cosineOld              100  thrpt    5   21.121 ±  0.009  ops/us
   > FloatCosineBenchmark.cosineOld              128  thrpt    5   16.276 ±  0.008  ops/us
   > FloatCosineBenchmark.cosineOld              207  thrpt    5   10.017 ±  0.002  ops/us
   > FloatCosineBenchmark.cosineOld              256  thrpt    5    8.085 ±  0.002  ops/us
   > FloatCosineBenchmark.cosineOld              300  thrpt    5    6.882 ±  0.001  ops/us
   > FloatCosineBenchmark.cosineOld              512  thrpt    5    3.981 ±  0.009  ops/us
   > FloatCosineBenchmark.cosineOld              702  thrpt    5    2.900 ±  0.008  ops/us
   > FloatCosineBenchmark.cosineOld             1024  thrpt    5    1.990 ±  0.001  ops/us
   > FloatDotProductBenchmark.dotProductNew        1  thrpt    5  482.634 ±  0.308  ops/us
   > FloatDotProductBenchmark.dotProductNew        4  thrpt    5  358.350 ±  0.814  ops/us
   > FloatDotProductBenchmark.dotProductNew        6  thrpt    5  299.456 ±  9.216  ops/us
   > FloatDotProductBenchmark.dotProductNew        8  thrpt    5  282.228 ±  0.560  ops/us
   > FloatDotProductBenchmark.dotProductNew       13  thrpt    5  237.520 ±  0.758  ops/us
   > FloatDotProductBenchmark.dotProductNew       16  thrpt    5  226.653 ±  0.598  ops/us
   > FloatDotProductBenchmark.dotProductNew       25  thrpt    5  203.128 ±  0.136  ops/us
   > FloatDotProductBenchmark.dotProductNew       32  thrpt    5  234.430 ±  2.885  ops/us
   > FloatDotProductBenchmark.dotProductNew       64  thrpt    5  199.576 ±  1.049  ops/us
   > FloatDotProductBenchmark.dotProductNew      100  thrpt    5  143.859 ±  0.351  ops/us
   > FloatDotProductBenchmark.dotProductNew      128  thrpt    5  163.681 ±  1.253  ops/us
   > FloatDotProductBenchmark.dotProductNew      207  thrpt    5   98.173 ±  1.241  ops/us
   > FloatDotProductBenchmark.dotProductNew      256  thrpt    5   95.487 ±  0.053  ops/us
   > FloatDotProductBenchmark.dotProductNew      300  thrpt    5   64.907 ±  0.056  ops/us
   > FloatDotProductBenchmark.dotProductNew      512  thrpt    5   62.076 ±  0.192  ops/us
   > FloatDotProductBenchmark.dotProductNew      702  thrpt    5   33.659 ±  0.053  ops/us
   > FloatDotProductBenchmark.dotProductNew     1024  thrpt    5   29.892 ±  0.188  ops/us
   > FloatDotProductBenchmark.dotProductOld        1  thrpt    5  558.982 ± 14.484  ops/us
   > FloatDotProductBenchmark.dotProductOld        4  thrpt    5  425.068 ±  3.916  ops/us
   > FloatDotProductBenchmark.dotProductOld        6  thrpt    5  398.557 ±  1.348  ops/us
   > FloatDotProductBenchmark.dotProductOld        8  thrpt    5  359.623 ±  0.203  ops/us
   > FloatDotProductBenchmark.dotProductOld       13  thrpt    5  262.966 ±  0.098  ops/us
   > FloatDotProductBenchmark.dotProductOld       16  thrpt    5  229.867 ±  0.083  ops/us
   > FloatDotProductBenchmark.dotProductOld       25  thrpt    5  165.441 ±  0.115  ops/us
   > FloatDotProductBenchmark.dotProductOld       32  thrpt    5  152.221 ±  0.138  ops/us
   > FloatDotProductBenchmark.dotProductOld       64  thrpt    5   85.443 ±  0.270  ops/us
   > FloatDotProductBenchmark.dotProductOld      100  thrpt    5   53.636 ±  0.020  ops/us
   > FloatDotProductBenchmark.dotProductOld      128  thrpt    5   42.828 ±  0.023  ops/us
   > FloatDotProductBenchmark.dotProductOld      207  thrpt    5   26.981 ±  0.055  ops/us
   > FloatDotProductBenchmark.dotProductOld      256  thrpt    5   21.944 ±  0.157  ops/us
   > FloatDotProductBenchmark.dotProductOld      300  thrpt    5   18.856 ±  0.011  ops/us
   > FloatDotProductBenchmark.dotProductOld      512  thrpt    5   11.330 ±  0.011  ops/us
   > FloatDotProductBenchmark.dotProductOld      702  thrpt    5    8.150 ±  0.006  ops/us
   > FloatDotProductBenchmark.dotProductOld     1024  thrpt    5    5.814 ±  0.006  ops/us
   > FloatSquareBenchmark.squareNew                1  thrpt    5  479.897 ±  0.414  ops/us
   > FloatSquareBenchmark.squareNew                4  thrpt    5  347.548 ±  5.824  ops/us
   > FloatSquareBenchmark.squareNew                6  thrpt    5  320.104 ±  3.070  ops/us
   > FloatSquareBenchmark.squareNew                8  thrpt    5  272.376 ±  2.516  ops/us
   > FloatSquareBenchmark.squareNew               13  thrpt    5  236.600 ±  1.357  ops/us
   > FloatSquareBenchmark.squareNew               16  thrpt    5  225.289 ±  0.361  ops/us
   > FloatSquareBenchmark.squareNew               25  thrpt    5  201.074 ±  0.363  ops/us
   > FloatSquareBenchmark.squareNew               32  thrpt    5  222.044 ±  1.173  ops/us
   > FloatSquareBenchmark.squareNew               64  thrpt    5  192.298 ±  2.776  ops/us
   > FloatSquareBenchmark.squareNew              100  thrpt    5  131.676 ±  0.082  ops/us
   > FloatSquareBenchmark.squareNew              128  thrpt    5  144.401 ±  1.032  ops/us
   > FloatSquareBenchmark.squareNew              207  thrpt    5   85.532 ±  0.490  ops/us
   > FloatSquareBenchmark.squareNew              256  thrpt    5   79.800 ±  0.023  ops/us
   > FloatSquareBenchmark.squareNew              300  thrpt    5   67.323 ±  0.287  ops/us
   > FloatSquareBenchmark.squareNew              512  thrpt    5   43.781 ±  0.029  ops/us
   > FloatSquareBenchmark.squareNew              702  thrpt    5   27.687 ±  0.008  ops/us
   > FloatSquareBenchmark.squareNew             1024  thrpt    5   20.829 ±  0.131  ops/us
   > FloatSquareBenchmark.squareOld                1  thrpt    5  479.053 ±  0.635  ops/us
   > FloatSquareBenchmark.squareOld                4  thrpt    5  345.476 ±  3.422  ops/us
   > FloatSquareBenchmark.squareOld                6  thrpt    5  320.319 ±  0.476  ops/us
   > FloatSquareBenchmark.squareOld                8  thrpt    5  347.901 ±  0.762  ops/us
   > FloatSquareBenchmark.squareOld               13  thrpt    5  223.134 ±  0.950  ops/us
   > FloatSquareBenchmark.squareOld               16  thrpt    5  213.299 ±  0.439  ops/us
   > FloatSquareBenchmark.squareOld               25  thrpt    5  141.332 ±  1.297  ops/us
   > FloatSquareBenchmark.squareOld               32  thrpt    5  117.027 ±  0.792  ops/us
   > FloatSquareBenchmark.squareOld               64  thrpt    5   63.886 ±  0.034  ops/us
   > FloatSquareBenchmark.squareOld              100  thrpt    5   42.088 ±  0.016  ops/us
   > FloatSquareBenchmark.squareOld              128  thrpt    5   33.163 ±  0.007  ops/us
   > FloatSquareBenchmark.squareOld              207  thrpt    5   19.855 ±  0.098  ops/us
   > FloatSquareBenchmark.squareOld              256  thrpt    5   16.870 ±  0.043  ops/us
   > FloatSquareBenchmark.squareOld              300  thrpt    5   14.326 ±  0.973  ops/us
   > FloatSquareBenchmark.squareOld              512  thrpt    5    8.837 ±  0.047  ops/us
   > FloatSquareBenchmark.squareOld              702  thrpt    5    6.299 ±  0.042  ops/us
   > FloatSquareBenchmark.squareOld             1024  thrpt    5    4.351 ±  0.062  ops/us
   -- 
   - MRM
   




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555062236

   > In addition, at some point we should rename the files, but that's not urgent because naming is not so important. We should then also rename the extraction gradle script, as it will be used not only for panama-foreign. But this is cosmetic only.
   
   Agreed. I had a similar thought when starting out on this, but avoided it then because of the "noise" - but it's certainly worth doing now.




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198137116


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtil.java:
##########
@@ -270,4 +216,134 @@ public static float dotProductScore(byte[] a, byte[] b) {
     float denom = (float) (a.length * (1 << 15));
     return 0.5f + dotProduct(a, b) / denom;
   }
+
+  interface VectorUtilProvider {
+
+    // just dot product for now
+    float dotProduct(float[] a, float[] b);
+  }
+
+  private static VectorUtilProvider lookupProvider() {
+    // TODO: add a check
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20) { // TODO: do we want JDK 19?
+      try {
+        ensureReadability();

Review Comment:
   I think that if the module is not readable, we should fall back to the default provider. This code would bail out later with a linkage error, so why not `return new LuceneVectorUtilProvider()` in case the module is not readable (and, of course, log the exception)?
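
The fallback suggested in this review comment could look roughly like the sketch below. It is not the code from the PR: `ProviderLookupSketch`, `loadVectorizedProvider`, and the log messages are hypothetical, and the scalar loop merely stands in for the default implementation. Only the `VectorUtilProvider` interface shape, the `Runtime.version().feature()` check, and the `LuceneVectorUtilProvider` name come from the thread. The readability test uses the standard `ModuleLayer.boot().findModule` and `Module.canRead` calls in place of the `ensureReadability()` helper from the diff, whose body is not shown here.

```java
import java.util.Optional;
import java.util.logging.Logger;

final class ProviderLookupSketch {
  private static final Logger LOG = Logger.getLogger(ProviderLookupSketch.class.getName());

  /** Same shape as the interface in the diff: just dot product for now. */
  interface VectorUtilProvider {
    float dotProduct(float[] a, float[] b);
  }

  /** Scalar default. The name follows the review comment; the body is illustrative. */
  static final class LuceneVectorUtilProvider implements VectorUtilProvider {
    @Override
    public float dotProduct(float[] a, float[] b) {
      if (a.length != b.length) {
        throw new IllegalArgumentException("vector dimensions differ");
      }
      float sum = 0f;
      for (int i = 0; i < a.length; i++) {
        sum += a[i] * b[i];
      }
      return sum;
    }
  }

  /** Returns a vectorized provider when possible, otherwise the scalar default. */
  static VectorUtilProvider lookupProvider() {
    if (Runtime.version().feature() == 20) {
      Optional<Module> vectorModule = ModuleLayer.boot().findModule("jdk.incubator.vector");
      boolean readable =
          vectorModule.isPresent()
              && ProviderLookupSketch.class.getModule().canRead(vectorModule.get());
      if (readable) {
        try {
          return loadVectorizedProvider();
        } catch (LinkageError | RuntimeException e) {
          // Log and fall through instead of failing later with a linkage error.
          LOG.warning("Vectorized provider unavailable, using default: " + e);
        }
      } else {
        LOG.info("jdk.incubator.vector not present or not readable, using default provider");
      }
    }
    return new LuceneVectorUtilProvider();
  }

  /** Hypothetical hook: a real implementation would instantiate the Panama-based provider. */
  private static VectorUtilProvider loadVectorizedProvider() {
    throw new UnsupportedOperationException("vectorized provider is not part of this sketch");
  }

  public static void main(String[] args) {
    VectorUtilProvider provider = lookupProvider();
    System.out.println(provider.dotProduct(new float[] {1, 2, 3}, new float[] {4, 5, 6}));
  }
}
```

Because every failure path ends in the scalar provider, callers never need to know whether the incubator module was available.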





[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198703579


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -122,7 +122,7 @@ allprojects {
       
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
-      jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
+      jvmArgs '--add-modules', 'jdk.unsupported,jdk.management,jdk.incubator.vector'

Review Comment:
   We're going to need a way to test both the default and the VectorAPI implementations. Currently the code just checks for the presence of the `jdk.incubator.vector` module. Either we add some additional configuration, such as a system property that forcibly disables the VectorAPI implementation, or we could have some gradle-foo that controls whether the module is added to `--add-modules`.
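
As a rough illustration of the first option, the decision could be factored into a small pure function so that tests can exercise both paths. This is only a sketch under stated assumptions: the class name and the property name `sketch.vectorapi.disabled` are hypothetical and are not taken from the PR. The module check uses the standard `ModuleLayer.boot().findModule` call.

```java
final class VectorApiToggleSketch {
  /** Hypothetical property name, used only for this illustration. */
  static final String DISABLE_PROP = "sketch.vectorapi.disabled";

  /** Pure decision function: the module must be present and the kill switch must be off. */
  static boolean shouldUseVectorApi(boolean modulePresent, String disableFlag) {
    // Boolean.parseBoolean(null) is false, so an unset property leaves the feature enabled.
    return modulePresent && !Boolean.parseBoolean(disableFlag);
  }

  /** Reads the actual runtime state and delegates to the pure function. */
  static boolean shouldUseVectorApi() {
    boolean present = ModuleLayer.boot().findModule("jdk.incubator.vector").isPresent();
    return shouldUseVectorApi(present, System.getProperty(DISABLE_PROP));
  }

  public static void main(String[] args) {
    System.out.println("use Vector API: " + shouldUseVectorApi());
  }
}
```

With something like this, a test run could pass `-Dsketch.vectorapi.disabled=true` to force the scalar path even when the module has been added.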





[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199003878


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,10 +119,13 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
+
+      // Disable assertions to workaround JDK-8301190
+      jvmArgs '-da:jdk.incubator.vector.LaneType'

Review Comment:
   Forbiddenapis should be used inside OpenJDK. I have suggested this several times, but it is always ignored, although it also works with Ant (used in OpenJDK for compiling).





[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560167503

   My concern is that if I spin up a QEMU VM that doesn't pass through the AVX flags in CPUID, the same thing will happen without any Java VM flags.




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1204236308


##########
gradle/java/memorysegment-mrjar.gradle:
##########
@@ -40,6 +40,7 @@ configure(project(":lucene:core")) {
           "-Xlint:-options",
           "--patch-module", "java.base=${apijar}",
           "--add-exports", "java.base/java.lang.foreign=ALL-UNNAMED",
+          "--add-exports", "java.base/jdk.incubator.vector=ALL-UNNAMED", // This is a hack, but does it matter

Review Comment:
   I think we can live with this. If so, then we should have a more suitable comment. 
   
   `  // just patch into java.base since that is already resolved and present` 





[GitHub] [lucene] mikemccand commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "mikemccand (via GitHub)" <gi...@apache.org>.
mikemccand commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559223951

   I also ran the benchmark on a Skylake Linux box I have:
   
   ```
   processor       : 7
   vendor_id       : GenuineIntel
   cpu family      : 6
   model           : 94
   model name      : Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
   stepping        : 3
   microcode       : 0xe2
   cpu MHz         : 800.156
   cache size      : 8192 KB
   physical id     : 0
   siblings        : 8
   core id         : 3
   cpu cores       : 4
   apicid          : 7
   initial apicid  : 7
   fpu             : yes
   fpu_exception   : yes
   cpuid level     : 22
   wp              : yes
   flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb invpcid_single intel_pt ssbd ibrs ibpb stibp kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d
   bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa itlb_multihit srbds
   bogomips        : 8015.84
   clflush size    : 64
   cache_alignment : 64
   address sizes   : 39 bits physical, 48 bits virtual
   power management:
   ```
   
   
   ```
   Benchmark                                (size)   Mode  Cnt   Score   Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5   5.303 ± 0.388  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5   1.169 ± 0.026  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5  10.531 ± 0.004  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5   2.666 ± 0.501  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5   9.390 ± 0.319  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5   2.274 ± 0.088  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5   8.870 ± 0.106  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5   0.922 ± 0.001  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5  18.078 ± 0.399  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5   2.953 ± 0.131  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5  13.459 ± 2.731  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5   2.103 ± 0.014  ops/us
   ```
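
For reference, the New/Old ratios implied by the Skylake table can be computed with a few lines of Java. This is a hypothetical helper written for this summary, not part of vectorbench. It uses the scores exactly as reported above and ignores the error column, which is sizeable for `FloatSquareBenchmark.squareNew`.

```java
import java.util.Locale;

final class SpeedupSketch {
  /** Throughput ratio: values above 1.0 mean the new implementation is faster. */
  static double speedup(double newOpsPerUs, double oldOpsPerUs) {
    return newOpsPerUs / oldOpsPerUs;
  }

  public static void main(String[] args) {
    String[] names = {
      "BinaryCosine", "BinaryDotProduct", "BinarySquare",
      "FloatCosine", "FloatDotProduct", "FloatSquare"
    };
    // {New, Old} scores in ops/us for size 1024, copied from the table above.
    double[][] scores = {
      {5.303, 1.169}, {10.531, 2.666}, {9.390, 2.274},
      {8.870, 0.922}, {18.078, 2.953}, {13.459, 2.103}
    };
    for (int i = 0; i < names.length; i++) {
      System.out.printf(
          Locale.ROOT, "%-18s %.2fx%n", names[i], speedup(scores[i][0], scores[i][1]));
    }
  }
}
```

On these numbers the binary (byte) variants come out at roughly 4x to 4.5x and the float variants at roughly 6x to 9.6x.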




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561686793

   Hi, I would like to clean up the filenames a bit (I checked the other PR; it should be easy to merge afterwards with some manual work).
   
   My proposal:
   - rename `gradle/generate/panama-foreign.gradle` to `gradle/generate/extract-jdk-apis.gradle`
   - rename extractor java file to `ExtractJdkApis.java`
   - rename `gradle/java/memorysegment-mrjar.gradle` to `gradle/java/core-mrjar.gradle` (because it is for lucene core only)
   - rename `lucene/core/src/generated/jdk/panama-foreign-jdkXX.apijar` to `lucene/core/src/generated/jdk/jdkXX.apijar`
   
   Any better ideas?




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1556297355

   aarch64: 
   ```
   Benchmark                             (size)   Mode  Cnt    Score   Error   Units
   DotProductBenchmark.dotProductNew          1  thrpt    5  322.255 ± 0.496  ops/us
   DotProductBenchmark.dotProductNew          4  thrpt    5  247.637 ± 1.027  ops/us
   DotProductBenchmark.dotProductNew          6  thrpt    5  222.331 ± 0.156  ops/us
   DotProductBenchmark.dotProductNew          8  thrpt    5  184.218 ± 0.074  ops/us
   DotProductBenchmark.dotProductNew         13  thrpt    5  128.962 ± 0.017  ops/us
   DotProductBenchmark.dotProductNew         16  thrpt    5  165.321 ± 0.057  ops/us
   DotProductBenchmark.dotProductNew         25  thrpt    5  132.347 ± 0.435  ops/us
   DotProductBenchmark.dotProductNew         32  thrpt    5  145.280 ± 5.810  ops/us
   DotProductBenchmark.dotProductNew         64  thrpt    5  111.581 ± 9.717  ops/us
   DotProductBenchmark.dotProductNew        100  thrpt    5   81.562 ± 0.310  ops/us
   DotProductBenchmark.dotProductNew        128  thrpt    5   80.488 ± 0.255  ops/us
   DotProductBenchmark.dotProductNew        207  thrpt    5   41.804 ± 0.360  ops/us
   DotProductBenchmark.dotProductNew        256  thrpt    5   40.422 ± 0.090  ops/us
   DotProductBenchmark.dotProductNew        300  thrpt    5   33.164 ± 0.131  ops/us
   DotProductBenchmark.dotProductNew        512  thrpt    5   21.133 ± 0.020  ops/us
   DotProductBenchmark.dotProductNew        702  thrpt    5   13.383 ± 0.010  ops/us
   DotProductBenchmark.dotProductNew       1024  thrpt    5    9.402 ± 0.198  ops/us
   DotProductBenchmark.dotProductNewNew       1  thrpt    5  322.162 ± 1.042  ops/us
   DotProductBenchmark.dotProductNewNew       4  thrpt    5  247.393 ± 3.673  ops/us
   DotProductBenchmark.dotProductNewNew       6  thrpt    5  219.753 ± 2.000  ops/us
   DotProductBenchmark.dotProductNewNew       8  thrpt    5  189.224 ± 3.233  ops/us
   DotProductBenchmark.dotProductNewNew      13  thrpt    5  148.712 ± 6.845  ops/us
   DotProductBenchmark.dotProductNewNew      16  thrpt    5  169.608 ± 0.200  ops/us
   DotProductBenchmark.dotProductNewNew      25  thrpt    5  105.866 ± 0.506  ops/us
   DotProductBenchmark.dotProductNewNew      32  thrpt    5  146.394 ± 0.802  ops/us
   DotProductBenchmark.dotProductNewNew      64  thrpt    5  119.317 ± 0.385  ops/us
   DotProductBenchmark.dotProductNewNew     100  thrpt    5   84.921 ± 2.819  ops/us
   DotProductBenchmark.dotProductNewNew     128  thrpt    5   87.055 ± 0.473  ops/us
   DotProductBenchmark.dotProductNewNew     207  thrpt    5   51.933 ± 0.270  ops/us
   DotProductBenchmark.dotProductNewNew     256  thrpt    5   55.509 ± 3.926  ops/us
   DotProductBenchmark.dotProductNewNew     300  thrpt    5   26.955 ± 0.016  ops/us
   DotProductBenchmark.dotProductNewNew     512  thrpt    5   20.205 ± 0.076  ops/us
   DotProductBenchmark.dotProductNewNew     702  thrpt    5   20.980 ± 0.029  ops/us
   DotProductBenchmark.dotProductNewNew    1024  thrpt    5   16.244 ± 0.172  ops/us
   DotProductBenchmark.dotProductOld          1  thrpt    5  347.324 ± 5.857  ops/us
   DotProductBenchmark.dotProductOld          4  thrpt    5  247.653 ± 0.285  ops/us
   DotProductBenchmark.dotProductOld          6  thrpt    5  262.647 ± 1.425  ops/us
   DotProductBenchmark.dotProductOld          8  thrpt    5  227.780 ± 0.477  ops/us
   DotProductBenchmark.dotProductOld         13  thrpt    5  154.562 ± 2.007  ops/us
   DotProductBenchmark.dotProductOld         16  thrpt    5  154.300 ± 0.460  ops/us
   DotProductBenchmark.dotProductOld         25  thrpt    5   92.981 ± 0.083  ops/us
   DotProductBenchmark.dotProductOld         32  thrpt    5   89.480 ± 0.211  ops/us
   DotProductBenchmark.dotProductOld         64  thrpt    5   50.378 ± 0.064  ops/us
   DotProductBenchmark.dotProductOld        100  thrpt    5   38.107 ± 0.115  ops/us
   DotProductBenchmark.dotProductOld        128  thrpt    5   27.906 ± 0.030  ops/us
   DotProductBenchmark.dotProductOld        207  thrpt    5   21.094 ± 0.016  ops/us
   DotProductBenchmark.dotProductOld        256  thrpt    5   14.728 ± 0.004  ops/us
   DotProductBenchmark.dotProductOld        300  thrpt    5   14.860 ± 0.025  ops/us
   DotProductBenchmark.dotProductOld        512  thrpt    5    7.496 ± 0.041  ops/us
   DotProductBenchmark.dotProductOld        702  thrpt    5    6.369 ± 0.015  ops/us
   DotProductBenchmark.dotProductOld       1024  thrpt    5    3.826 ± 0.002  ops/us
   ```




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560207056

   This isn't a theoretical issue. I really think it's an OpenJDK problem.
   
   To reproduce it, I simply emulate a Nehalem CPU (the one right before AVX):
   ```
   processor	: 3
   vendor_id	: GenuineIntel
   cpu family	: 6
   model		: 26
   model name	: Intel Core i7 9xx (Nehalem Class Core i7)
   stepping	: 3
   microcode	: 0x1
   cpu MHz		: 2496.002
   cache size	: 16384 KB
   physical id	: 0
   siblings	: 4
   core id		: 1
   cpu cores	: 2
   apicid		: 3
   initial apicid	: 3
   fpu		: yes
   fpu_exception	: yes
   cpuid level	: 11
   wp		: yes
   flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni ssse3 cx16 sse4_1 sse4_2 x2apic popcnt hypervisor lahf_lm cpuid_fault
   bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
   bogomips	: 4994.00
   clflush size	: 64
   cache_alignment	: 64
   address sizes	: 40 bits physical, 48 bits virtual
   power management:
   ```
   
   Here's my QEMU command:
   ```
   qemu-system-x86_64 \
     -name "Arch Linux" \
     -machine q35,vmport=off \
     -accel kvm,kernel-irqchip=on \
     -cpu Nehalem-v1 \
     -smp 4,cores=2,threads=2 \
     -m 8192 \
     -boot menu=on \
     -nodefaults \
     -no-user-config \
     -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0 \
     -nic user,ipv6=off,model=virtio-net-pci,hostfwd=tcp::10022-:22 \
     -device virtio-scsi-pci,id=scsi0 \
     -device scsi-hd,bus=scsi0.0,drive=drive-scsi0 \
     -drive file=disk.qcow2,if=none,id=drive-scsi0,discard=on \
     -nographic \
     -vga none \
     -device virtio-serial-pci \
     -device virtserialport \
     -device virtconsole \
     -serial mon:stdio \
     -parallel none
   ```
   
   Here are benchmark results. It behaves the same as passing `-XX:UseAVX=0` on the host machine.
   ```
   Benchmark                                (size)   Mode  Cnt  Score   Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5  0.010 ± 0.003  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5  0.776 ± 0.107  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5  0.023 ± 0.006  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5  1.861 ± 0.087  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5  0.025 ± 0.001  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5  1.555 ± 0.048  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5  3.416 ± 0.256  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5  0.694 ± 0.117  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5  8.968 ± 0.125  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5  1.835 ± 0.162  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5  6.665 ± 0.068  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5  1.224 ± 0.151  ops/us
   ```
   
   ```
   dev0:vectorbench[main]$ /usr/lib/jvm/java-20-openjdk/bin/jshell --add-modules jdk.incubator.vector
   May 23, 2023 6:24:14 PM java.util.prefs.FileSystemPreferences$1 run
   INFO: Created user preferences directory.
   |  Welcome to JShell -- Version 20.0.1
   |  For an introduction type: /help intro
   
   jshell> jdk.incubator.vector.IntVector.SPECIES_PREFERRED
   $1 ==> Species[int, 4, S_128_BIT]
   ```
   
   128-bit vectors are claimed but don't work. This is not good.
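
   To put the table above in proportion, here is a small sketch that divides each "New" score by its "Old" counterpart. The class and method names are made up for illustration; the numbers are copied from the Nehalem results in this comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Ratios of "New" (Panama) to "Old" (scalar) throughput for the Nehalem run quoted above. */
final class NehalemRatios {

  /** Returns new/old throughput; values below 1.0 mean the Panama path is slower. */
  static double ratio(double newOpsPerUs, double oldOpsPerUs) {
    return newOpsPerUs / oldOpsPerUs;
  }

  /** Benchmark family -> {new score, old score}, in ops/us at size=1024, copied from the table. */
  static Map<String, double[]> scores() {
    Map<String, double[]> m = new LinkedHashMap<>();
    m.put("BinaryCosine", new double[] {0.010, 0.776});
    m.put("BinaryDotProduct", new double[] {0.023, 1.861});
    m.put("BinarySquare", new double[] {0.025, 1.555});
    m.put("FloatCosine", new double[] {3.416, 0.694});
    m.put("FloatDotProduct", new double[] {8.968, 1.835});
    m.put("FloatSquare", new double[] {6.665, 1.224});
    return m;
  }

  public static void main(String[] args) {
    scores().forEach(
        (name, s) -> System.out.printf("%-18s new/old = %.3f%n", name, ratio(s[0], s[1])));
  }
}
```

   With these inputs the three byte kernels come out roughly 60x to 80x slower than the scalar code, while the three float kernels are roughly 5x faster.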


[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557840103

   @rmuir Testing with the latest vector bench, commit 8f25834, I see:
   
   Linux - AVX 512
   ```
   Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
   BinaryDotProductBenchmark.dotProductNew       1024  thrpt    5  16.467 ± 0.033  ops/us
   BinaryDotProductBenchmark.dotProductNewNew    1024  thrpt    5  20.941 ± 0.020  ops/us
   BinaryDotProductBenchmark.dotProductOld       1024  thrpt    5   3.251 ± 0.060  ops/us
   ```
   
   And hardcoding a variant of ByteVector.SPECIES_128 -> ShortVector.SPECIES_256 -> IntVector.SPECIES_512 gives ~20 ops/us on my machine, so this more general approach for >=256 is very good.
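
   The species chain above keeps the lane count constant while the lane width doubles at each step. A scalar sketch of that arithmetic follows, assuming the multiply is done in 16-bit lanes and the accumulation in 32-bit lanes. The class and method names are hypothetical, and no Vector API is used.

```java
/** Lane arithmetic behind a byte -> short -> int widening chain (scalar sketch only). */
final class WideningShapes {

  /** Bit size of the byte vector whose lane count matches an int vector of intBits. */
  static int byteBits(int intBits) {
    return intBits >> 2; // int lanes are 4x wider than byte lanes
  }

  /** Bit size of the short vector whose lane count matches an int vector of intBits. */
  static int shortBits(int intBits) {
    return intBits >> 1; // int lanes are 2x wider than short lanes
  }

  /** Number of lanes in a vector of vectorBits made of lanes of laneBits. */
  static int lanes(int vectorBits, int laneBits) {
    return vectorBits / laneBits;
  }

  /** Scalar model of one lane: multiply in short precision, accumulate in int precision. */
  static int dotProduct(byte[] a, byte[] b) {
    int acc = 0;
    for (int i = 0; i < a.length; i++) {
      short product = (short) (a[i] * b[i]); // |product| <= 128 * 128 = 16384, fits in a short
      acc += product; // accumulate in 32 bits so long arrays do not overflow 16 bits
    }
    return acc;
  }
}
```

   For a 512-bit int species this gives 128-bit byte vectors and 256-bit short vectors, each with 16 lanes, which is the chain named in the comment.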


[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559578873

   @uschindler 
   
   > Cool thanks. I was about to do the same, my idea was a bit different: Add a normal virtual method "isSupported()" to the interface and implement it returning true for default provider but returning something depending on vector size for panama provider. This would spare two times doing the additional reflection using the lookup. The lookup function would only return the panama provider if it returns true.
   > 
   > Another approach is to use Lookup#findStaticVarHandle() on `INT_SPECIES_PREF_BIT_SIZE` and read it. This spares catching `Throwable`, which is one of the things I hate about method handles.
   
   I used a static method to avoid creating the instance if it's never going to be used; feel free to rewrite this. I don't have a strong opinion on how this is done, just that it is done. :-) 
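
   A minimal sketch of the second idea quoted above: reading a static constant through `Lookup#findStaticVarHandle()` before deciding which provider to instantiate. Everything except the JDK calls is hypothetical. `DotProductProvider`, `FakePanamaProvider` and the value 128 stored in `INT_SPECIES_PREF_BIT_SIZE` are stand-ins so the example runs without the incubator module, and this is not the code in the PR.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

/** Hypothetical stand-in for the provider interface discussed in this thread. */
interface DotProductProvider {
  float dotProduct(float[] a, float[] b);
}

/** Scalar fallback; always usable. */
final class ScalarProvider implements DotProductProvider {
  @Override
  public float dotProduct(float[] a, float[] b) {
    float res = 0;
    for (int i = 0; i < a.length; i++) {
      res += a[i] * b[i];
    }
    return res;
  }
}

/** Hypothetical stand-in for a Panama provider; a real one would use jdk.incubator.vector. */
final class FakePanamaProvider implements DotProductProvider {
  // Assumption: a real provider would derive this from IntVector.SPECIES_PREFERRED.
  static final int INT_SPECIES_PREF_BIT_SIZE = 128;

  @Override
  public float dotProduct(float[] a, float[] b) {
    return new ScalarProvider().dotProduct(a, b); // placeholder body for the sketch
  }
}

final class ProviderLookup {
  /** True if the incubator module was resolved into the boot layer. */
  static boolean vectorModulePresent() {
    return ModuleLayer.boot().findModule("jdk.incubator.vector").isPresent();
  }

  /** Reads a static int constant without creating an instance of the owner class. */
  static int readStaticInt(Class<?> owner, String field) throws ReflectiveOperationException {
    VarHandle handle = MethodHandles.lookup().findStaticVarHandle(owner, field, int.class);
    return (int) handle.get();
  }

  /** Picks the Panama-style provider only if the module is present and vectors are wide enough. */
  static DotProductProvider select(boolean modulePresent, int minBits)
      throws ReflectiveOperationException {
    if (modulePresent
        && readStaticInt(FakePanamaProvider.class, "INT_SPECIES_PREF_BIT_SIZE") >= minBits) {
      return new FakePanamaProvider();
    }
    return new ScalarProvider();
  }
}
```

   With this shape, `ProviderLookup.select(ProviderLookup.vectorModulePresent(), 128)` returns the scalar provider on a JVM started without `--add-modules jdk.incubator.vector`, and only checked reflective exceptions need handling rather than `Throwable`.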
   
   


[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559571108

   > > you can see this stuff in benchmarks above where e.g. 128 dims is faster than 100, etc.
   > 
   > should we zero-pad before computing the dot-products? It wouldn't affect the result and sounds like it would be faster
   
   But only in our panama impl?!
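
   For reference, a scalar sketch of why zero-padding leaves the result unchanged: each padded position contributes `0 * 0` to the sum. The names are hypothetical, and `lanes` stands in for the number of floats per hardware vector. Whether padding would actually be faster depends on where the copy happens, which this sketch does not measure.

```java
import java.util.Arrays;

/** Scalar sketch: zero-padding both inputs to a multiple of the lane count keeps the dot product. */
final class ZeroPadDotProduct {

  /** Rounds length up to the next multiple of lanes. */
  static int paddedLength(int length, int lanes) {
    return ((length + lanes - 1) / lanes) * lanes;
  }

  /** Copies v into a longer array; Arrays.copyOf fills the extra slots with 0.0f. */
  static float[] zeroPad(float[] v, int lanes) {
    return Arrays.copyOf(v, paddedLength(v.length, lanes));
  }

  /** Plain scalar dot product over the full length of a. */
  static float dotProduct(float[] a, float[] b) {
    float res = 0;
    for (int i = 0; i < a.length; i++) {
      res += a[i] * b[i];
    }
    return res;
  }
}
```

   With 8 float lanes, a 100-dimension vector would be padded to 104 elements, so the whole array could be processed in full vectors with no scalar tail.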


[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560935868

   OK, so JDK 20 is out. Back to the status quo of any vector code I write sitting on the shelf for years. 
   
   I stated in the issue we might have to do some ugly stuff for this to work. But I see your point of view.
   
   Ping me when we can start making progress again.
   


[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1558282897

   ok last function done (FloatCosineBenchmark). again no surprises here:
   
   skylake:
   ```
   Benchmark                       (size)   Mode  Cnt    Score    Error   Units
   FloatCosineBenchmark.cosineNew    1024  thrpt    5    6.136 ±  0.023  ops/us
   FloatCosineBenchmark.cosineOld    1024  thrpt    5    0.640 ±  0.005  ops/us
   ```
   
   m1:
   ```
   Benchmark                       (size)   Mode  Cnt    Score   Error   Units
   FloatCosineBenchmark.cosineNew    1024  thrpt    5    7.787 ± 0.007  ops/us
   FloatCosineBenchmark.cosineOld    1024  thrpt    5    1.076 ± 0.003  ops/us
   ```
   
   This is the last of the 6 functions. I will get them pushed so the VectorUtilProvider is complete for all the similarity functions.


[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1558725059

   Apologies, I don't have much time early this week, but hope to be more active later in the week. I know that there are still outstanding questions around testing, luceneutil, etc., but I think that Robert's benchmark has been invaluable and laid solid foundations here, so I fully expect that we'll be able to work through the other outstanding issues. Overall the code is really coming together. Noice!


[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201916237


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   Yes, of course. What would the expected delta be for such a change? (It would only make sense if it is quite small, right?)



[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560000194

   Assert is fine. Now that the check is in ctor we know for sure it will be at least 128.


[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560978321

   Hi,
   
   > I started a conversation on panama-dev, initially starting with the shape of the preferred species, when combined with various command line flags. I deliberately kept this focused on `UseAVX=0`; I would like to get an understanding of that before discussing C2, etc.
   > 
   > https://mail.openjdk.org/pipermail/panama-dev/2023-May/019072.html
   
   C2 is in my opinion much more important here, as the SPECIES_PREFERRED values are plain wrong when C2 is disabled. They look like some default values.
   
   I also think that the AVX=0 problem is a different bug from the species one.
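
   One way to compare what the JVM reports under different flags (for example with and without `-XX:UseAVX=0`) is a small reflective probe like the one below. It is a sketch with a hypothetical class name. Reflection is used only so that it also compiles and runs when `jdk.incubator.vector` is not resolved, in which case it reports an empty result.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.OptionalInt;

/** Reflective probe for SPECIES_PREFERRED; needs no compile-time dependency on the incubator module. */
final class PreferredSpeciesProbe {

  /** Returns the bit size of vectorClass.SPECIES_PREFERRED, or empty if it cannot be read. */
  static OptionalInt preferredBits(String vectorClass) {
    try {
      Class<?> vector = Class.forName("jdk.incubator.vector." + vectorClass);
      Field preferred = vector.getField("SPECIES_PREFERRED");
      Object species = preferred.get(null);
      // Look the method up on the public interface rather than on the implementation class.
      Method bitSize =
          Class.forName("jdk.incubator.vector.VectorSpecies").getMethod("vectorBitSize");
      return OptionalInt.of((Integer) bitSize.invoke(species));
    } catch (ReflectiveOperationException | LinkageError e) {
      return OptionalInt.empty(); // module not resolved, or the API differs on this JDK
    }
  }

  public static void main(String[] args) {
    for (String name : new String[] {"IntVector", "FloatVector"}) {
      System.out.println(name + ".SPECIES_PREFERRED bits: " + preferredBits(name));
    }
  }
}
```

   Running it with `--add-modules jdk.incubator.vector` plus the flag under test shows the reported size; it says nothing about whether operations on that species are actually compiled to vector instructions.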


[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561482479

   > Should we now start to prepare this for inclusion in main and 9.x?
   
   I love that we've gotten to a point where this question has come up. Excellent progress!
   
   > If we are fine with the current approach, we could merge this to main 9.x and then proceed to move more stuff into VectorUtil.
   
   I'm fine with this. Of course others closer to the project will have a better sense, but I fully agree that the current approach is amenable to being refactored in the future if needed, which lessens any concerns relating to future potential usage of SIMD in Lucene.


[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1584875630

   Ah, I see the other comment. We would also need to calculate a dot product. We could do that at the end of the provider for JDK 20 and possibly for 21.


[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1554738661

   I noticed that the apijar files are quite large, because the extraction code can't remove package-private superclasses. Therefore all package-private classes stay alive as "empty" fragments.
   For panama-foreign this was not an issue, as in java.base the rules for separating implementation from API are very strict. I will update the code to do a two-pass scan: first collect all referenced classes (superclasses and interfaces) and then extract APIs.
   I will commit directly.
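
   As a toy illustration of the two-pass idea (not the actual extraction script, whose internals are not shown in this thread), the first pass below walks the superclasses and interfaces of the visible classes, and the second pass keeps only classes that are visible or were reached. The map-based class model and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of a two-pass API extraction over a map of class name -> direct supertypes. */
final class TwoPassExtraction {

  /**
   * supertypes must contain an entry for every known class. visible holds the classes that
   * belong to the public API. Returns the classes that should survive extraction.
   */
  static Set<String> classesToKeep(Map<String, List<String>> supertypes, Set<String> visible) {
    // Pass 1: collect every superclass and interface reachable from a visible class.
    Set<String> referenced = new LinkedHashSet<>(visible);
    Deque<String> pending = new ArrayDeque<>(visible);
    while (!pending.isEmpty()) {
      for (String parent : supertypes.getOrDefault(pending.pop(), List.of())) {
        if (referenced.add(parent)) {
          pending.push(parent); // newly discovered: follow its supertypes too
        }
      }
    }
    // Pass 2: go over all known classes and drop those that were never referenced.
    Set<String> result = new LinkedHashSet<>();
    for (String name : supertypes.keySet()) {
      if (referenced.contains(name)) {
        result.add(name);
      }
    }
    return result;
  }
}
```

   In this model a package-private superclass of a public class is kept, while a package-private helper that nothing public extends or implements is dropped instead of surviving as an empty fragment.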


[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1554743815

   In addition, at some point we should rename the files, but that's not urgent because naming is not so important. We should then also rename the extraction Gradle script, as it will no longer be used only for panama-foreign. But this is cosmetic only.


[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198085290


##########
gradle/generation/panama-foreign.gradle:
##########
@@ -45,13 +45,14 @@ configure(project(":lucene:core")) {
           javaLauncher.get()
           return true
         } catch (Exception e) {
-          logger.warn('Launcher for Java {} is not available; skipping regeneration of Panama Foreign API JAR.', jdkVersion)
+          logger.warn('Launcher for Java {} is not available; skipping regeneration of Panama Foreign & Vector API JAR.', jdkVersion)
           logger.warn('Error: {}', e.cause?.message)
           logger.warn("Please make sure to point env 'JAVA{}_HOME' to exactly JDK version {} or enable Gradle toolchain auto-download.", jdkVersion, jdkVersion)
           return false
         }
       }
-      
+
+      jvmArgs = ["--add-modules=jdk.incubator.vector"]

Review Comment:
   This should not be needed, as the extractor reads the jrt filesystem directly!? It should run without this flag!



[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201067808


##########
lucene/core/src/java20/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##########
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import jdk.incubator.vector.ByteVector;
+import jdk.incubator.vector.FloatVector;
+import jdk.incubator.vector.IntVector;
+import jdk.incubator.vector.ShortVector;
+import jdk.incubator.vector.Vector;
+import jdk.incubator.vector.VectorOperators;
+import jdk.incubator.vector.VectorShape;
+import jdk.incubator.vector.VectorSpecies;
+
+/** A VectorUtil provider implementation that leverages the Panama Vector API. */
+final class VectorUtilPanamaProvider implements VectorUtilProvider {
+
+  static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;
+
+  VectorUtilPanamaProvider() {}
+
+  @Override
+  public float dotProduct(float[] a, float[] b) {
+    int i = 0;
+    float res = 0;
+    // if the array size is large (> 2x platform vector size), its worth the overhead to vectorize
+    if (a.length > 2 * SPECIES.length()) {
+      // vector loop is unrolled 4x (4 accumulators in parallel)
+      FloatVector acc1 = FloatVector.zero(SPECIES);
+      FloatVector acc2 = FloatVector.zero(SPECIES);
+      FloatVector acc3 = FloatVector.zero(SPECIES);
+      FloatVector acc4 = FloatVector.zero(SPECIES);
+      int upperBound = SPECIES.loopBound(a.length - 3 * SPECIES.length());
+      for (; i < upperBound; i += 4 * SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        acc1 = acc1.add(va.mul(vb));
+        FloatVector vc = FloatVector.fromArray(SPECIES, a, i + SPECIES.length());
+        FloatVector vd = FloatVector.fromArray(SPECIES, b, i + SPECIES.length());
+        acc2 = acc2.add(vc.mul(vd));
+        FloatVector ve = FloatVector.fromArray(SPECIES, a, i + 2 * SPECIES.length());
+        FloatVector vf = FloatVector.fromArray(SPECIES, b, i + 2 * SPECIES.length());
+        acc3 = acc3.add(ve.mul(vf));
+        FloatVector vg = FloatVector.fromArray(SPECIES, a, i + 3 * SPECIES.length());
+        FloatVector vh = FloatVector.fromArray(SPECIES, b, i + 3 * SPECIES.length());
+        acc4 = acc4.add(vg.mul(vh));
+      }
+      // vector tail: less scalar computations for unaligned sizes, esp with big vector sizes
+      upperBound = SPECIES.loopBound(a.length);
+      for (; i < upperBound; i += SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        acc1 = acc1.add(va.mul(vb));
+      }
+      // reduce
+      FloatVector res1 = acc1.add(acc2);
+      FloatVector res2 = acc3.add(acc4);
+      res += res1.add(res2).reduceLanes(VectorOperators.ADD);
+    }
+
+    for (; i < a.length; i++) {
+      res += b[i] * a[i];
+    }
+    return res;
+  }
+
+  static final VectorSpecies<Byte> PREFERRED_BYTE_SPECIES;
+  static final VectorSpecies<Short> PREFERRED_SHORT_SPECIES;
+
+  static {
+    if (IntVector.SPECIES_PREFERRED.vectorBitSize() >= 256) {
+      PREFERRED_BYTE_SPECIES =
+          ByteVector.SPECIES_MAX.withShape(
+              VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 2));
+      PREFERRED_SHORT_SPECIES =
+          ShortVector.SPECIES_MAX.withShape(
+              VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 1));
+    } else {
+      PREFERRED_BYTE_SPECIES = null;
+      PREFERRED_SHORT_SPECIES = null;
+    }
+  }
+
+  @Override
+  public int dotProduct(byte[] a, byte[] b) {
+    int i = 0;
+    int res = 0;
+    final int vectorSize = IntVector.SPECIES_PREFERRED.vectorBitSize();

Review Comment:
   this could also be a constant outside the method.
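
   For readers following the quoted diff, the control flow of the float `dotProduct` can be emulated with plain arrays. This is a scalar sketch, not the PR's code: `lanes` replaces `SPECIES.length()`, and the local `loopBound` helper assumes the Vector API's `loopBound` rounds down to a multiple of the lane count.

```java
/** Scalar emulation of the unrolled loop structure in the quoted diff. */
final class UnrolledDotProductSketch {

  /** Largest multiple of lanes that is <= length (assumed loopBound semantics). */
  static int loopBound(int length, int lanes) {
    return Math.floorDiv(length, lanes) * lanes;
  }

  static float dotProduct(float[] a, float[] b, int lanes) {
    int i = 0;
    float res = 0;
    if (a.length > 2 * lanes) {
      // Four independent accumulators, each modelling one vector register of `lanes` floats.
      float[][] acc = new float[4][lanes];
      int upperBound = loopBound(a.length - 3 * lanes, lanes);
      for (; i < upperBound; i += 4 * lanes) {
        for (int k = 0; k < 4; k++) {
          for (int lane = 0; lane < lanes; lane++) {
            int idx = i + k * lanes + lane;
            acc[k][lane] += a[idx] * b[idx];
          }
        }
      }
      // Vector tail: whole vectors that did not fill a 4x block go to the first accumulator.
      upperBound = loopBound(a.length, lanes);
      for (; i < upperBound; i += lanes) {
        for (int lane = 0; lane < lanes; lane++) {
          acc[0][lane] += a[i + lane] * b[i + lane];
        }
      }
      // Reduce: (acc1 + acc2) + (acc3 + acc4), then sum across lanes.
      for (int lane = 0; lane < lanes; lane++) {
        res += (acc[0][lane] + acc[1][lane]) + (acc[2][lane] + acc[3][lane]);
      }
    }
    // Scalar tail for whatever is left, or for arrays too short to vectorize.
    for (; i < a.length; i++) {
      res += b[i] * a[i];
    }
    return res;
  }
}
```

   Subtracting `3 * lanes` before rounding down is what keeps the last of the four loads in each iteration inside the array; the single-vector tail loop and the scalar loop then cover the remainder.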



[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1558372829

   Here's a summary of where the perf sits for these various functions on my machines.
   
   It only takes 5 minutes to run a pass for a vector size of 1024 dimensions, just to get an idea:
   ```
   $ git clone https://github.com/rmuir/vectorbench.git
   $ cd vectorbench
   $ mvn verify
   $ java -jar target/vectorbench.jar -p size=1024
   ```
   
   intel skylake (256-bit vectors):
   ```
   Benchmark                                (size)   Mode  Cnt   Score   Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5   3.626 ± 0.045  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5   0.790 ± 0.086  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5   7.122 ± 0.329  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5   1.835 ± 0.039  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5   6.392 ± 0.057  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5   1.545 ± 0.247  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5   6.074 ± 0.089  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5   0.631 ± 0.006  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5  12.108 ± 0.152  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5   2.014 ± 0.020  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5   9.504 ± 0.171  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5   1.412 ± 0.028  ops/us
   ```
   
   mac m1 arm (128-bit vectors):
   ```
   Benchmark                                (size)   Mode  Cnt   Score    Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5   2.259 ±  0.011  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5   1.046 ±  0.002  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5   6.142 ±  0.002  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5   3.107 ±  0.002  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5   6.142 ±  0.002  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5   3.100 ±  0.016  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5   7.840 ±  0.006  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5   1.076 ±  0.001  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5  12.467 ±  0.005  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5   3.823 ±  0.001  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5  14.329 ±  0.061  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5   3.185 ±  0.002  ops/us
   ```
   


[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560766494

   > So please lets stop here and not implement hardware detection for opt-in only preview images!
   
   Sorry, I strongly disagree here. We should not merge this feature when it gives such horrible performance. Look at my numbers. It is insanely bad. We are already dealing with a very trappy Lucene feature (poor users), and I don't want to make matters worse with a very trappy OpenJDK feature.
   
   OpenJDK made the wrong design decision for the whole Vector API in having such slow fallback code at all. That's the root of the problem: it is a design flaw. They should throw an exception instead.
   
   Separately, there's a bug for x86 in the JDK: when the AVX feature isn't enabled, it's doing the wrong thing. That is bug #2.
   
   




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560977330

   > To be clear, I'm not sure it's an issue of `SPECIES_PREFERRED`. As you can see, 128-bit floating point works perfectly with `-XX:UseAVX=0`. I assume it is using SSE etc. :)
   > 
   > But the integer math does not. The conversion instructions etc. that are needed should all be there (sse4.1 is available): https://www.felixcloutier.com/x86/pmovsx . Maybe intrinsics are missing for the integer side.
   
   




[GitHub] [lucene] msokolov commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "msokolov (via GitHub)" <gi...@apache.org>.
msokolov commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1563461446

   That's a helpful insight. Yes, I think we're really only doing a fairly quick run each time through, a handful of seconds. I'll try increasing that.
   




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1564310637

   Hi @alessandrobenedetti, the code shown here is indeed crazy to read, but this is more a problem of the APIs in general. The Java Vector API is very low level and you have to know exactly how lanes, species and so on work. The code written by Robert follows the javadocs guidelines 100%.
   A lot of low-level code is written like that (for example in codecs like BlockTermsReader). The MMapDirectory IndexInput implementations look like that, too. They are not beautiful, but they are optimized for performance. To me the variable names are perfectly fine for vector code (`ab`, `a1b`,...). This is typical in that area. It won't get better with other names.
   
   The code in the official JDK docs looks identical: https://docs.oracle.com/en/java/javase/20/docs/api/jdk.incubator.vector/jdk/incubator/vector/package-summary.html
   
   The arbitrary-looking if/else constructs are a consequence of the underlying hardware infrastructure. The code is NOT autogenerated, but follows low-level hardware specs, so there are arbitrary-looking constants and if/else in it. This can be improved by moving numbers like 128 into constants; feel free to open PRs! For performance reasons you should NOT split that up into too many different methods, as the code relies on escape analysis of the VM. We may split it later, but that's more of a cleanup.
   
   An additional problem in the whole code is that it is Java version specific, so there will be multiple versions of the same code living in different directories (java20, java21,...). Same for MMapDirectory.
   
   The extraction code for the Java APIs is special and a hack, but it is not part of Lucene's public library; it is a build tool only. Sorry for it; there's a new version now because a rewrite was needed to allow backporting and fix incomplete extraction: #12329 (this version looks much better and is also technically better organized).
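
For readers who have not seen this style of code before, here is a rough plain-Java sketch of the usual shape: a main loop over full blocks of lanes, a reduction across lanes, and a scalar tail. It does not call `jdk.incubator.vector` at all and is not the code from this PR; `LaneLoopSketch` and the `LANES = 8` constant are made up for illustration, standing in for whatever length a real species would report.

```java
// Plain-Java illustration only: no jdk.incubator.vector calls are used.
final class LaneLoopSketch {
  // Assumed lane count, standing in for what a real species would report.
  static final int LANES = 8;

  /** Dot product with a block-wise main loop, a lane reduction, and a scalar tail. */
  static float dotProduct(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException(
          "vector dimensions differ: " + a.length + " != " + b.length);
    }
    float[] acc = new float[LANES]; // one partial sum per lane
    int bound = a.length - (a.length % LANES); // largest multiple of LANES <= length
    int i = 0;
    for (; i < bound; i += LANES) {
      for (int lane = 0; lane < LANES; lane++) {
        acc[lane] += a[i + lane] * b[i + lane];
      }
    }
    float sum = 0f;
    for (float partial : acc) { // reduce the per-lane partial sums
      sum += partial;
    }
    for (; i < a.length; i++) { // scalar tail for the leftover elements
      sum += a[i] * b[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    float[] a = new float[207];
    float[] b = new float[207];
    for (int i = 0; i < a.length; i++) {
      a[i] = i % 7;
      b[i] = i % 5;
    }
    System.out.println(dotProduct(a, b));
  }
}
```

Naming the lane count as a constant, as above, is the kind of small readability change suggested for numbers like 128. Note that the per-lane accumulation changes the order of floating-point additions compared with a simple loop, so results can differ in the last bits for general inputs.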




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559834404

   I tested this by changing the if statement to 1280 :-) The message looks fine.




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557870684

   thanks for benchmarking! I will merge it into the branch




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557693789

   @ChrisHegarty I pushed a commit, stealing some of your ideas there, to generalize the 256-bit algo to also (in theory) work with avx512.
   
   It causes no regression on my avx-256, I am curious if it does ok on your avx-512: https://github.com/rmuir/vectorbench/blob/2c29b865460229cac01f2a5315d9b9502e20ef2b/src/main/java/testing/BinaryDotProductBenchmark.java#L49-L59




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557642297

   @ChrisHegarty See what I mean around correctness? This is what will happen on a machine with only 64-bit vectors.
   ```
   jshell> jdk.incubator.vector.VectorShape.forBitSize(32)
   |  Exception java.lang.IllegalArgumentException: Bad vector bit-size: 32
   |        at VectorShape.forBitSize (VectorShape.java:141)
   |        at (#1:1)
   ```
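
One way to avoid hitting that exception is to check up front whether the narrowed bit size is one of the fixed shapes. The following is only a sketch in plain Java under stated assumptions: `ShapeGuardSketch`, `canNarrow` and `requireNarrowable` are hypothetical names, not part of the JDK or of this PR, and only the fixed 64/128/256/512-bit shapes are modelled (the platform-dependent maximum shape is ignored).

```java
// Hypothetical helper, not JDK or Lucene API.
final class ShapeGuardSketch {
  /** Fixed shape sizes assumed by this sketch; the platform "max" shape is ignored. */
  static boolean isFixedShapeBitSize(int bits) {
    return bits == 64 || bits == 128 || bits == 256 || bits == 512;
  }

  /** True if preferredBits / divisor is itself one of the fixed shape sizes. */
  static boolean canNarrow(int preferredBits, int divisor) {
    return divisor > 0
        && preferredBits % divisor == 0
        && isFixedShapeBitSize(preferredBits / divisor);
  }

  /** Refuses early with an UnsupportedOperationException instead of failing later. */
  static void requireNarrowable(int preferredBits, int divisor) {
    if (!canNarrow(preferredBits, divisor)) {
      throw new UnsupportedOperationException(
          "Vector bit size " + preferredBits + " cannot be divided by " + divisor);
    }
  }

  public static void main(String[] args) {
    System.out.println(canNarrow(64, 2)); // half of 64 bits would be 32 bits
    System.out.println(canNarrow(256, 2));
  }
}
```

Throwing `UnsupportedOperationException` is chosen here because the provider lookup in this PR catches that exception type from the Panama provider's constructor and falls back to the default implementation.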




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560162059

   if you want to see what i mean, try running `java -XX:UseAVX=0 -jar target/vectorbench.jar -p size=1024`:
   ```
   Benchmark                                (size)   Mode  Cnt  Score    Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5  0.011 ±  0.001  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5  0.799 ±  0.017  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5  0.028 ±  0.003  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5  1.798 ±  0.294  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5  0.027 ±  0.002  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5  1.540 ±  0.109  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5  3.153 ±  0.401  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5  0.664 ±  0.141  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5  7.443 ±  0.315  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5  1.689 ±  0.220  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5  5.439 ±  0.275  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5  1.155 ±  0.293  ops/us
   ```
   
   How to detect such incredibly trappy performance and prevent it? `IntVector.SPECIES_PREFERRED` still says 128 in this case.
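
To put numbers on the table above: the small sketch below just divides the "New" score by the "Old" score for each pair. The class and method names are made up for illustration; the scores are copied from the table. With `-XX:UseAVX=0` the three byte[] kernels come out at roughly 1/57 to 1/73 of the scalar throughput, while the float kernels are still about 4.4x to 4.7x faster.

```java
// Illustration only: ratios computed from the benchmark scores quoted above.
final class SpeedupSketch {
  /** Throughput ratio new/old; values below 1.0 mean the new code is slower. */
  static double speedup(double newOpsPerUs, double oldOpsPerUs) {
    return newOpsPerUs / oldOpsPerUs;
  }

  public static void main(String[] args) {
    String[] names = {
      "byte cosine", "byte dotProduct", "byte squareDistance",
      "float cosine", "float dotProduct", "float square"
    };
    // {new, old} scores in ops/us at size=1024, taken from the table above.
    double[][] newOld = {
      {0.011, 0.799}, {0.028, 1.798}, {0.027, 1.540},
      {3.153, 0.664}, {7.443, 1.689}, {5.439, 1.155}
    };
    for (int i = 0; i < names.length; i++) {
      System.out.printf(
          java.util.Locale.ROOT,
          "%-20s new/old = %.3f%n",
          names[i],
          speedup(newOld[i][0], newOld[i][1]));
    }
  }
}
```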




[GitHub] [lucene] mikemccand commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "mikemccand (via GitHub)" <gi...@apache.org>.
mikemccand commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1574989612

   I ran @rmuir's vectorbench on a new Raptor Lake build (i9-13900K).
   
   Note that this CPU does NOT seem to support AVX-512:
   
   ```
   flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req hfi umip pku ospke waitpkg gfni vaes vpclmulqdq tme rdpid movdiri movdir64b fsrm md_clear serialize pconfig arch_lbr ibt flush_l1d arch_capabilities
   vmx flags       : vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid ple shadow_vmcs ept_mode_based_exec tsc_scaling usr_wait_pause
   ```
   
   Benchy results:
   
   ```
   Benchmark                                (size)   Mode  Cnt    Score    Error   Units
   BinaryCosineBenchmark.cosineDistanceNew       1  thrpt    5  128.894 ±  0.021  ops/us
   BinaryCosineBenchmark.cosineDistanceNew     128  thrpt    5   64.367 ±  0.062  ops/us
   BinaryCosineBenchmark.cosineDistanceNew     207  thrpt    5   37.563 ±  0.030  ops/us
   BinaryCosineBenchmark.cosineDistanceNew     256  thrpt    5   36.636 ±  0.004  ops/us
   BinaryCosineBenchmark.cosineDistanceNew     300  thrpt    5   31.305 ±  0.012  ops/us
   BinaryCosineBenchmark.cosineDistanceNew     512  thrpt    5   20.769 ±  0.008  ops/us
   BinaryCosineBenchmark.cosineDistanceNew     702  thrpt    5   13.757 ±  0.017  ops/us
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5   10.201 ±  0.008  ops/us
   BinaryCosineBenchmark.cosineDistanceOld       1  thrpt    5  128.889 ±  0.095  ops/us
   BinaryCosineBenchmark.cosineDistanceOld     128  thrpt    5   14.177 ±  0.064  ops/us
   BinaryCosineBenchmark.cosineDistanceOld     207  thrpt    5    8.967 ±  0.023  ops/us
   BinaryCosineBenchmark.cosineDistanceOld     256  thrpt    5    7.295 ±  0.014  ops/us
   BinaryCosineBenchmark.cosineDistanceOld     300  thrpt    5    6.215 ±  0.034  ops/us
   BinaryCosineBenchmark.cosineDistanceOld     512  thrpt    5    3.709 ±  0.003  ops/us
   BinaryCosineBenchmark.cosineDistanceOld     702  thrpt    5    2.714 ±  0.005  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5    1.872 ±  0.001  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5   23.562 ±  0.025  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5    3.885 ±  0.082  ops/us
   BinarySquareBenchmark.squareDistanceNew       1  thrpt    5  522.630 ±  4.081  ops/us
   BinarySquareBenchmark.squareDistanceNew     128  thrpt    5  101.951 ±  4.427  ops/us
   BinarySquareBenchmark.squareDistanceNew     207  thrpt    5   65.050 ±  0.254  ops/us
   BinarySquareBenchmark.squareDistanceNew     256  thrpt    5   60.495 ±  0.922  ops/us
   BinarySquareBenchmark.squareDistanceNew     300  thrpt    5   51.767 ±  0.042  ops/us
   BinarySquareBenchmark.squareDistanceNew     512  thrpt    5   32.832 ±  0.022  ops/us
   BinarySquareBenchmark.squareDistanceNew     702  thrpt    5   23.786 ±  0.018  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5   17.062 ±  0.145  ops/us
   BinarySquareBenchmark.squareDistanceOld       1  thrpt    5  529.882 ±  1.503  ops/us
   BinarySquareBenchmark.squareDistanceOld     128  thrpt    5   32.478 ±  0.037  ops/us
   BinarySquareBenchmark.squareDistanceOld     207  thrpt    5   20.901 ±  0.023  ops/us
   BinarySquareBenchmark.squareDistanceOld     256  thrpt    5   16.644 ±  0.070  ops/us
   BinarySquareBenchmark.squareDistanceOld     300  thrpt    5   14.502 ±  0.111  ops/us
   BinarySquareBenchmark.squareDistanceOld     512  thrpt    5    8.703 ±  0.050  ops/us
   BinarySquareBenchmark.squareDistanceOld     702  thrpt    5    6.473 ±  0.013  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5    4.454 ±  0.016  ops/us
   FloatCosineBenchmark.cosineNew                1  thrpt    5  395.222 ±  3.364  ops/us
   FloatCosineBenchmark.cosineNew                4  thrpt    5  275.572 ±  2.528  ops/us
   FloatCosineBenchmark.cosineNew                6  thrpt    5  217.377 ±  0.561  ops/us
   FloatCosineBenchmark.cosineNew                8  thrpt    5  192.311 ±  1.492  ops/us
   FloatCosineBenchmark.cosineNew               13  thrpt    5  143.959 ±  0.061  ops/us
   FloatCosineBenchmark.cosineNew               16  thrpt    5  127.340 ±  0.181  ops/us
   FloatCosineBenchmark.cosineNew               25  thrpt    5  116.471 ±  0.219  ops/us
   FloatCosineBenchmark.cosineNew               32  thrpt    5  117.458 ±  0.031  ops/us
   FloatCosineBenchmark.cosineNew               64  thrpt    5  100.845 ±  0.016  ops/us
   FloatCosineBenchmark.cosineNew              100  thrpt    5   73.392 ±  0.129  ops/us
   FloatCosineBenchmark.cosineNew              128  thrpt    5   79.363 ±  0.012  ops/us
   FloatCosineBenchmark.cosineNew              207  thrpt    5   48.170 ±  0.027  ops/us
   FloatCosineBenchmark.cosineNew              256  thrpt    5   43.298 ±  0.103  ops/us
   FloatCosineBenchmark.cosineNew              300  thrpt    5   36.302 ±  0.017  ops/us
   FloatCosineBenchmark.cosineNew              512  thrpt    5   26.113 ±  0.076  ops/us
   FloatCosineBenchmark.cosineNew              702  thrpt    5   19.034 ±  0.008  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5   16.945 ±  0.052  ops/us
   FloatCosineBenchmark.cosineOld                1  thrpt    5  398.987 ±  0.675  ops/us
   FloatCosineBenchmark.cosineOld                4  thrpt    5  279.282 ±  4.223  ops/us
   FloatCosineBenchmark.cosineOld                6  thrpt    5  220.884 ±  5.144  ops/us
   FloatCosineBenchmark.cosineOld                8  thrpt    5  196.722 ±  0.389  ops/us
   FloatCosineBenchmark.cosineOld               13  thrpt    5  146.701 ±  0.832  ops/us
   FloatCosineBenchmark.cosineOld               16  thrpt    5  130.186 ±  0.280  ops/us
   FloatCosineBenchmark.cosineOld               25  thrpt    5   87.526 ±  0.083  ops/us
   FloatCosineBenchmark.cosineOld               32  thrpt    5   70.398 ±  0.124  ops/us
   FloatCosineBenchmark.cosineOld               64  thrpt    5   35.020 ±  0.007  ops/us
   FloatCosineBenchmark.cosineOld              100  thrpt    5   21.121 ±  0.009  ops/us
   FloatCosineBenchmark.cosineOld              128  thrpt    5   16.276 ±  0.008  ops/us
   FloatCosineBenchmark.cosineOld              207  thrpt    5   10.017 ±  0.002  ops/us
   FloatCosineBenchmark.cosineOld              256  thrpt    5    8.085 ±  0.002  ops/us
   FloatCosineBenchmark.cosineOld              300  thrpt    5    6.882 ±  0.001  ops/us
   FloatCosineBenchmark.cosineOld              512  thrpt    5    3.981 ±  0.009  ops/us
   FloatCosineBenchmark.cosineOld              702  thrpt    5    2.900 ±  0.008  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5    1.990 ±  0.001  ops/us
   FloatDotProductBenchmark.dotProductNew        1  thrpt    5  482.634 ±  0.308  ops/us
   FloatDotProductBenchmark.dotProductNew        4  thrpt    5  358.350 ±  0.814  ops/us
   FloatDotProductBenchmark.dotProductNew        6  thrpt    5  299.456 ±  9.216  ops/us
   FloatDotProductBenchmark.dotProductNew        8  thrpt    5  282.228 ±  0.560  ops/us
   FloatDotProductBenchmark.dotProductNew       13  thrpt    5  237.520 ±  0.758  ops/us
   FloatDotProductBenchmark.dotProductNew       16  thrpt    5  226.653 ±  0.598  ops/us
   FloatDotProductBenchmark.dotProductNew       25  thrpt    5  203.128 ±  0.136  ops/us
   FloatDotProductBenchmark.dotProductNew       32  thrpt    5  234.430 ±  2.885  ops/us
   FloatDotProductBenchmark.dotProductNew       64  thrpt    5  199.576 ±  1.049  ops/us
   FloatDotProductBenchmark.dotProductNew      100  thrpt    5  143.859 ±  0.351  ops/us
   FloatDotProductBenchmark.dotProductNew      128  thrpt    5  163.681 ±  1.253  ops/us
   FloatDotProductBenchmark.dotProductNew      207  thrpt    5   98.173 ±  1.241  ops/us
   FloatDotProductBenchmark.dotProductNew      256  thrpt    5   95.487 ±  0.053  ops/us
   FloatDotProductBenchmark.dotProductNew      300  thrpt    5   64.907 ±  0.056  ops/us
   FloatDotProductBenchmark.dotProductNew      512  thrpt    5   62.076 ±  0.192  ops/us
   FloatDotProductBenchmark.dotProductNew      702  thrpt    5   33.659 ±  0.053  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5   29.892 ±  0.188  ops/us
   FloatDotProductBenchmark.dotProductOld        1  thrpt    5  558.982 ± 14.484  ops/us
   FloatDotProductBenchmark.dotProductOld        4  thrpt    5  425.068 ±  3.916  ops/us
   FloatDotProductBenchmark.dotProductOld        6  thrpt    5  398.557 ±  1.348  ops/us
   FloatDotProductBenchmark.dotProductOld        8  thrpt    5  359.623 ±  0.203  ops/us
   FloatDotProductBenchmark.dotProductOld       13  thrpt    5  262.966 ±  0.098  ops/us
   FloatDotProductBenchmark.dotProductOld       16  thrpt    5  229.867 ±  0.083  ops/us
   FloatDotProductBenchmark.dotProductOld       25  thrpt    5  165.441 ±  0.115  ops/us
   FloatDotProductBenchmark.dotProductOld       32  thrpt    5  152.221 ±  0.138  ops/us
   FloatDotProductBenchmark.dotProductOld       64  thrpt    5   85.443 ±  0.270  ops/us
   FloatDotProductBenchmark.dotProductOld      100  thrpt    5   53.636 ±  0.020  ops/us
   FloatDotProductBenchmark.dotProductOld      128  thrpt    5   42.828 ±  0.023  ops/us
   FloatDotProductBenchmark.dotProductOld      207  thrpt    5   26.981 ±  0.055  ops/us
   FloatDotProductBenchmark.dotProductOld      256  thrpt    5   21.944 ±  0.157  ops/us
   FloatDotProductBenchmark.dotProductOld      300  thrpt    5   18.856 ±  0.011  ops/us
   FloatDotProductBenchmark.dotProductOld      512  thrpt    5   11.330 ±  0.011  ops/us
   FloatDotProductBenchmark.dotProductOld      702  thrpt    5    8.150 ±  0.006  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5    5.814 ±  0.006  ops/us
   FloatSquareBenchmark.squareNew                1  thrpt    5  479.897 ±  0.414  ops/us
   FloatSquareBenchmark.squareNew                4  thrpt    5  347.548 ±  5.824  ops/us
   FloatSquareBenchmark.squareNew                6  thrpt    5  320.104 ±  3.070  ops/us
   FloatSquareBenchmark.squareNew                8  thrpt    5  272.376 ±  2.516  ops/us
   FloatSquareBenchmark.squareNew               13  thrpt    5  236.600 ±  1.357  ops/us
   FloatSquareBenchmark.squareNew               16  thrpt    5  225.289 ±  0.361  ops/us
   FloatSquareBenchmark.squareNew               25  thrpt    5  201.074 ±  0.363  ops/us
   FloatSquareBenchmark.squareNew               32  thrpt    5  222.044 ±  1.173  ops/us
   FloatSquareBenchmark.squareNew               64  thrpt    5  192.298 ±  2.776  ops/us
   FloatSquareBenchmark.squareNew              100  thrpt    5  131.676 ±  0.082  ops/us
   FloatSquareBenchmark.squareNew              128  thrpt    5  144.401 ±  1.032  ops/us
   FloatSquareBenchmark.squareNew              207  thrpt    5   85.532 ±  0.490  ops/us
   FloatSquareBenchmark.squareNew              256  thrpt    5   79.800 ±  0.023  ops/us
   FloatSquareBenchmark.squareNew              300  thrpt    5   67.323 ±  0.287  ops/us
   FloatSquareBenchmark.squareNew              512  thrpt    5   43.781 ±  0.029  ops/us
   FloatSquareBenchmark.squareNew              702  thrpt    5   27.687 ±  0.008  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5   20.829 ±  0.131  ops/us
   FloatSquareBenchmark.squareOld                1  thrpt    5  479.053 ±  0.635  ops/us
   FloatSquareBenchmark.squareOld                4  thrpt    5  345.476 ±  3.422  ops/us
   FloatSquareBenchmark.squareOld                6  thrpt    5  320.319 ±  0.476  ops/us
   FloatSquareBenchmark.squareOld                8  thrpt    5  347.901 ±  0.762  ops/us
   FloatSquareBenchmark.squareOld               13  thrpt    5  223.134 ±  0.950  ops/us
   FloatSquareBenchmark.squareOld               16  thrpt    5  213.299 ±  0.439  ops/us
   FloatSquareBenchmark.squareOld               25  thrpt    5  141.332 ±  1.297  ops/us
   FloatSquareBenchmark.squareOld               32  thrpt    5  117.027 ±  0.792  ops/us
   FloatSquareBenchmark.squareOld               64  thrpt    5   63.886 ±  0.034  ops/us
   FloatSquareBenchmark.squareOld              100  thrpt    5   42.088 ±  0.016  ops/us
   FloatSquareBenchmark.squareOld              128  thrpt    5   33.163 ±  0.007  ops/us
   FloatSquareBenchmark.squareOld              207  thrpt    5   19.855 ±  0.098  ops/us
   FloatSquareBenchmark.squareOld              256  thrpt    5   16.870 ±  0.043  ops/us
   FloatSquareBenchmark.squareOld              300  thrpt    5   14.326 ±  0.973  ops/us
   FloatSquareBenchmark.squareOld              512  thrpt    5    8.837 ±  0.047  ops/us
   FloatSquareBenchmark.squareOld              702  thrpt    5    6.299 ±  0.042  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5    4.351 ±  0.062  ops/us
   ```




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557616126

   interesting, let me try it on my 256. if it doesn't hurt the performance (much), then let's go with it. i would prefer to have a "generalized" version like this.




[GitHub] [lucene] rmuir commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201270494


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   By the way, the slowness was there from the start, just initially sporadic because we only implemented 1 out of 6 functions.
   
   It becomes more noticeable and reproducible now that more functions are implemented. For example, all 3 byte[] similarities are implemented, so if you just run `./gradlew check` it will never complete because `TestHnswByteVectorGraph` will always take an eternity.
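
For reference, the three byte[] similarities can be written as simple scalar loops. This is a minimal sketch using the same method signatures as the `VectorUtilProvider` interface in this PR; it is not the actual default implementation in Lucene, and the class name `ScalarByteSimilaritySketch` is made up for illustration.

```java
// Illustrative scalar versions; not Lucene's actual default provider.
final class ScalarByteSimilaritySketch {
  private static void checkSameLength(byte[] a, byte[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException(
          "vector dimensions differ: " + a.length + " != " + b.length);
    }
  }

  /** Dot product over signed bytes, accumulated in an int. */
  static int dotProduct(byte[] a, byte[] b) {
    checkSameLength(a, b);
    int total = 0;
    for (int i = 0; i < a.length; i++) {
      total += a[i] * b[i]; // bytes are promoted to int before multiplying
    }
    return total;
  }

  /** Sum of squared differences over signed bytes. */
  static int squareDistance(byte[] a, byte[] b) {
    checkSameLength(a, b);
    int total = 0;
    for (int i = 0; i < a.length; i++) {
      int diff = a[i] - b[i];
      total += diff * diff;
    }
    return total;
  }

  /** Cosine similarity; the result is NaN if either vector is all zeros. */
  static float cosine(byte[] a, byte[] b) {
    checkSameLength(a, b);
    int sum = 0;
    int norm1 = 0;
    int norm2 = 0;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
      norm1 += a[i] * a[i];
      norm2 += b[i] * b[i];
    }
    return (float) (sum / Math.sqrt((double) norm1 * (double) norm2));
  }

  public static void main(String[] args) {
    byte[] a = {1, 2, 3};
    byte[] b = {4, 5, 6};
    System.out.println(dotProduct(a, b));
    System.out.println(squareDistance(a, b));
    System.out.println(cosine(a, b));
  }
}
```

Loops like these are what the "Old" rows in the benchmarks stand for conceptually: plain scalar code that the vectorized path should never be slower than.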





[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1203597579


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java:
##########
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.security.AccessController;
+import java.security.PrivilegedAction;
+import java.util.Locale;
+import java.util.Objects;
+import java.util.logging.Logger;
+
+/** A provider of VectorUtil implementations. */
+interface VectorUtilProvider {
+
+  /** Calculates the dot product of the given float arrays. */
+  float dotProduct(float[] a, float[] b);
+
+  /** Returns the cosine similarity between the two vectors. */
+  float cosine(float[] v1, float[] v2);
+
+  /** Returns the sum of squared differences of the two vectors. */
+  float squareDistance(float[] a, float[] b);
+
+  /** Returns the dot product computed over signed bytes. */
+  int dotProduct(byte[] a, byte[] b);
+
+  /** Returns the cosine similarity between the two byte vectors. */
+  float cosine(byte[] a, byte[] b);
+
+  /** Returns the sum of squared differences of the two byte vectors. */
+  int squareDistance(byte[] a, byte[] b);
+
+  // -- provider lookup mechanism
+
+  static final Logger LOG = Logger.getLogger(VectorUtilProvider.class.getName());
+
+  static VectorUtilProvider lookup() {
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20) {
+      // is locale sane (only buggy in Java 20)
+      if (runtimeVersion <= 20 && !hasWorkingDefaultLocale()) {
+        LOG.warning(
+            "Java runtime is using a buggy default locale; Java vector incubator API can't be enabled: "
+                + Locale.getDefault());
+        return new VectorUtilDefaultProvider();
+      }
+      // is the incubator module present and readable (JVM providers may to exclude them or it is
+      // build with jlink)
+      if (!vectorModulePresentAndReadable()) {
+        LOG.warning(
+            "Java vector incubator module is not readable. For optimal vector performance, pass '--add-modules jdk.incubator.vector' to enable Vector API.");
+        return new VectorUtilDefaultProvider();
+      }
+      if (isClientVM()) {
+        LOG.warning("C2 compiler is disabled; Java vector incubator API can't be enabled");
+        return new VectorUtilDefaultProvider();
+      }
+      try {
+        // we use method handles with lookup, so we do not need to deal with setAccessible as we
+        // have private access through the lookup:
+        final var lookup = MethodHandles.lookup();
+        final var cls = lookup.findClass("org.apache.lucene.util.VectorUtilPanamaProvider");
+        final var constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
+        try {
+          return (VectorUtilProvider) constr.invoke();
+        } catch (UnsupportedOperationException uoe) {
+          // not supported because preferred vector size too small or similar
+          LOG.warning("Java vector incubator API was not enabled. " + uoe.getMessage());
+          return new VectorUtilDefaultProvider();
+        } catch (RuntimeException | Error e) {
+          throw e;
+        } catch (Throwable th) {
+          throw new AssertionError(th);
+        }
+      } catch (NoSuchMethodException | IllegalAccessException e) {
+        throw new LinkageError(
+            "VectorUtilPanamaProvider is missing correctly typed constructor", e);
+      } catch (ClassNotFoundException cnfe) {
+        throw new LinkageError("VectorUtilPanamaProvider is missing in Lucene JAR file", cnfe);
+      }
+    } else if (runtimeVersion >= 21) {
+      LOG.warning(
+          "You are running with Java 21 or later. To make full use of the Vector API, please update Apache Lucene.");
+    }
+    return new VectorUtilDefaultProvider();
+  }
+
+  private static boolean vectorModulePresentAndReadable() {
+    var opt =
+        ModuleLayer.boot().modules().stream()
+            .filter(m -> m.getName().equals("jdk.incubator.vector"))
+            .findFirst();
+    if (opt.isPresent()) {
+      VectorUtilProvider.class.getModule().addReads(opt.get());
+      return true;
+    }
+    return false;
+  }
+
+  // Workaround for JDK-8301190, avoids assertion when default locale is say tr.
+  private static boolean hasWorkingDefaultLocale() {
+    return Runtime.version().update() > 1

Review Comment:
   OK, it works from the logic and Java 21 would be whitelisted automatically in the checks above... Maybe make a separate method for the whole thing and remove the runtime version check above.
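
One way to read this suggestion, as a minimal sketch rather than the code that was eventually committed: fold the version test and the locale test into a single helper, so that `lookup()` no longer needs its own `runtimeVersion <= 20` guard. The names `LocaleCheckSketch` and `isAffectedByLocaleBug` are hypothetical, and the assumption that only Java 20 with an update version of 1 or lower is affected is inferred from the truncated `update() > 1` line above.

```java
import java.util.Locale;

/** Hypothetical helper; not part of the PR. */
final class LocaleCheckSketch {

  /**
   * Returns true if the Vector API should be avoided because of JDK-8301190.
   * Assumption: only Java 20 with update version <= 1 is affected.
   */
  static boolean isAffectedByLocaleBug(Runtime.Version version, Locale defaultLocale) {
    if (version.feature() != 20 || version.update() > 1) {
      return false; // other releases are treated as unaffected in this sketch
    }
    // a locale counts as "buggy" here if upper-casing "i" does not give ASCII "I"
    return !"I".equals("i".toUpperCase(defaultLocale));
  }
}
```

With a helper of this shape, the caller would pass `Runtime.version()` and `Locale.getDefault()`, and newer runtimes pass the check without any extra condition in `lookup()`.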





[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1203963839


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java:
##########
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.security.AccessController;
+import java.security.PrivilegedAction;
+import java.util.Locale;
+import java.util.Objects;
+import java.util.logging.Logger;
+
+/** A provider of VectorUtil implementations. */
+interface VectorUtilProvider {
+
+  /** Calculates the dot product of the given float arrays. */
+  float dotProduct(float[] a, float[] b);
+
+  /** Returns the cosine similarity between the two vectors. */
+  float cosine(float[] v1, float[] v2);
+
+  /** Returns the sum of squared differences of the two vectors. */
+  float squareDistance(float[] a, float[] b);
+
+  /** Returns the dot product computed over signed bytes. */
+  int dotProduct(byte[] a, byte[] b);
+
+  /** Returns the cosine similarity between the two byte vectors. */
+  float cosine(byte[] a, byte[] b);
+
+  /** Returns the sum of squared differences of the two byte vectors. */
+  int squareDistance(byte[] a, byte[] b);
+
+  // -- provider lookup mechanism
+
+  static final Logger LOG = Logger.getLogger(VectorUtilProvider.class.getName());
+
+  static VectorUtilProvider lookup() {
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20) {
+      // is locale sane (only buggy in Java 20)
+      if (runtimeVersion <= 20 && !hasWorkingDefaultLocale()) {
+        LOG.warning(
+            "Java runtime is using a buggy default locale; Java vector incubator API can't be enabled: "
+                + Locale.getDefault());
+        return new VectorUtilDefaultProvider();
+      }
+      // is the incubator module present and readable (JVM providers may to exclude them or it is
+      // build with jlink)
+      if (!vectorModulePresentAndReadable()) {
+        LOG.warning(
+            "Java vector incubator module is not readable. For optimal vector performance, pass '--add-modules jdk.incubator.vector' to enable Vector API.");
+        return new VectorUtilDefaultProvider();
+      }
+      if (isClientVM()) {
+        LOG.warning("C2 compiler is disabled; Java vector incubator API can't be enabled");
+        return new VectorUtilDefaultProvider();
+      }
+      try {
+        // we use method handles with lookup, so we do not need to deal with setAccessible as we
+        // have private access through the lookup:
+        final var lookup = MethodHandles.lookup();
+        final var cls = lookup.findClass("org.apache.lucene.util.VectorUtilPanamaProvider");
+        final var constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
+        try {
+          return (VectorUtilProvider) constr.invoke();
+        } catch (UnsupportedOperationException uoe) {
+          // not supported because preferred vector size too small or similar
+          LOG.warning("Java vector incubator API was not enabled. " + uoe.getMessage());
+          return new VectorUtilDefaultProvider();
+        } catch (RuntimeException | Error e) {
+          throw e;
+        } catch (Throwable th) {
+          throw new AssertionError(th);
+        }
+      } catch (NoSuchMethodException | IllegalAccessException e) {
+        throw new LinkageError(
+            "VectorUtilPanamaProvider is missing correctly typed constructor", e);
+      } catch (ClassNotFoundException cnfe) {
+        throw new LinkageError("VectorUtilPanamaProvider is missing in Lucene JAR file", cnfe);
+      }
+    } else if (runtimeVersion >= 21) {
+      LOG.warning(
+          "You are running with Java 21 or later. To make full use of the Vector API, please update Apache Lucene.");
+    }
+    return new VectorUtilDefaultProvider();
+  }
+
+  private static boolean vectorModulePresentAndReadable() {
+    var opt =
+        ModuleLayer.boot().modules().stream()
+            .filter(m -> m.getName().equals("jdk.incubator.vector"))
+            .findFirst();
+    if (opt.isPresent()) {
+      VectorUtilProvider.class.getModule().addReads(opt.get());
+      return true;
+    }
+    return false;
+  }
+
+  // Workaround for JDK-8301190, avoids assertion when default locale is say tr.
+  private static boolean hasWorkingDefaultLocale() {
+    return Runtime.version().update() > 1

Review Comment:
   LGTM. Thank you.





[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561223126

   I pushed changes to github.com/rmuir/vectorbench, too
   
   Results on my skylake with ` java -XX:UseAVX=0 -jar target/vectorbench.jar -p size=1024` look clean now. no crazy slowdowns for the integer math. consistent with what i see spinning up a QEMU, just easier to test yourself.
   
   ```
   Benchmark                                (size)   Mode  Cnt  Score   Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5  0.791 ± 0.014  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5  0.802 ± 0.014  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5  1.974 ± 0.028  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5  1.811 ± 0.440  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5  1.596 ± 0.043  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5  1.604 ± 0.022  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5  3.457 ± 0.028  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5  0.717 ± 0.009  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5  8.613 ± 0.388  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5  1.969 ± 0.324  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5  6.515 ± 0.147  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5  1.485 ± 0.015  ops/us
   ```




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1584900739

   See the PR: #12362
   
   We could also catch the SecurityException, log a warning, and throw UOE, so it falls back to Lucene's VectorUtilProvider.
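
A minimal sketch of that idea, assuming a hypothetical privileged check (`readSetting`) that may throw `SecurityException`. The names `FallbackSketch`, `Provider`, `panamaProvider` and `lookup` are illustrative and are not taken from #12362. The security failure is logged and converted into an `UnsupportedOperationException`, which the lookup treats as the signal to fall back to the scalar implementation.

```java
import java.util.function.Supplier;
import java.util.logging.Logger;

final class FallbackSketch {
  static final Logger LOG = Logger.getLogger(FallbackSketch.class.getName());

  /** Stand-in for a provider; only the name of the chosen implementation matters here. */
  interface Provider {
    String name();
  }

  /** Hypothetical vectorized provider whose setup needs a privileged read. */
  static Provider panamaProvider(Supplier<String> readSetting) {
    try {
      readSetting.get(); // may be rejected by a security manager
    } catch (SecurityException se) {
      LOG.warning("Cannot initialize vectorized provider: " + se.getMessage());
      throw new UnsupportedOperationException("privileged access denied", se);
    }
    return () -> "panama";
  }

  /** Tries the vectorized provider first and falls back to the scalar one on UOE. */
  static Provider lookup(Supplier<String> readSetting) {
    try {
      return panamaProvider(readSetting);
    } catch (UnsupportedOperationException uoe) {
      return () -> "default"; // scalar fallback
    }
  }
}
```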




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555926627

   I improved the extractor even more (it had a small bug). It now uses only one pass; it just delays writing out the class files to the apijar until all visible classes have been collected into a set.
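
The one-pass approach can be sketched roughly as follows. This is an illustration under assumptions, not the actual extractor: scanned classes are modelled as a map from class name to the class names they reference, `publicClasses` stands in for the real visibility test, and the returned list stands in for what would be written to the apijar. Everything is scanned once, output is buffered, and only classes that ended up in the visible set are emitted.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class OnePassExtractorSketch {

  /** Returns the class names that would be written to the jar, in scan order. */
  static List<String> extract(Map<String, List<String>> references, Set<String> publicClasses) {
    List<String> buffered = new ArrayList<>(); // processed, but not yet written
    Set<String> visible = new LinkedHashSet<>();
    for (var entry : references.entrySet()) { // the single pass over the input
      buffered.add(entry.getKey());
      if (publicClasses.contains(entry.getKey())) {
        visible.add(entry.getKey());
        visible.addAll(entry.getValue()); // referenced classes become visible too
      }
    }
    // deferred write: only now is the full visible set known
    List<String> written = new ArrayList<>();
    for (String name : buffered) {
      if (visible.contains(name)) {
        written.add(name);
      }
    }
    return written;
  }
}
```

Deferring the write matters when a non-public class is scanned before the public class that references it; with an immediate write it would have been skipped.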




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199625964


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java:
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.util.Locale;
+import java.util.Objects;
+import java.util.logging.Logger;
+
+/**
+ * A provider of VectorUtil implementations.
+ *
+ * @lucene.internal
+ */
+interface VectorUtilProvider {
+
+  /** Calculates the dot product of the given float arrays. */
+  float dotProduct(float[] a, float[] b);
+
+  // -- provider lookup mechanism
+
+  Logger LOG = Logger.getLogger(VectorUtilProvider.class.getName());
+
+  static VectorUtilProvider lookup() {
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20 && useVectorAPI() && vectorModulePresentAndReadable()) {
+      try {
+        final var lookup = MethodHandles.lookup();
+        final var cls = lookup.findClass("org.apache.lucene.util.VectorUtilPanamaProvider");
+        // we use method handles, so we do not need to deal with setAccessible as we have private
+        // access through the lookup:
+        final var constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
+        try {
+          return (VectorUtilProvider) constr.invoke();
+        } catch (RuntimeException | Error e) {
+          throw e;
+        } catch (Throwable th) {
+          throw new AssertionError(th);
+        }
+      } catch (NoSuchMethodException | IllegalAccessException e) {
+        throw new LinkageError(
+            "PanamaVectorUtilProvider is missing correctly typed constructor", e);
+      } catch (ClassNotFoundException cnfe) {
+        throw new LinkageError("PanamaVectorUtilProvider is missing in Lucene JAR file", cnfe);

Review Comment:
   classname changed





[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199630782


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java:
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.util.Locale;
+import java.util.Objects;
+import java.util.logging.Logger;
+
+/**
+ * A provider of VectorUtil implementations.
+ *
+ * @lucene.internal
+ */
+interface VectorUtilProvider {
+
+  /** Calculates the dot product of the given float arrays. */
+  float dotProduct(float[] a, float[] b);
+
+  // -- provider lookup mechanism
+
+  Logger LOG = Logger.getLogger(VectorUtilProvider.class.getName());
+
+  static VectorUtilProvider lookup() {
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20 && useVectorAPI() && vectorModulePresentAndReadable()) {

Review Comment:
   rewrote this





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561857395

   I renamed the files and renamed some properties in the scripts




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1562390969

   If you cherry pick, take both commits. I will now update the Java 21 branch for mmap.




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199602161


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,10 +119,10 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
-      jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
+      jvmArgs '--add-modules', 'jdk.unsupported,jdk.management,jdk.incubator.vector'

Review Comment:
   We should only enable jdk.incubator.vector if our tests are running with the exact Java version. Otherwise this would fail, e.g. on Java 11 (branch 9.x when backported) or with future versions once it has left incubation.
   
   Maybe we should have a global gradle variable to list which features are enabled for which Java version (like for panamaForeign).
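
To make the idea concrete, here is a small sketch in Java (the real build logic would live in Gradle; the table contents and the names `RuntimeFeaturesSketch`, `BASE_MODULES`, `OPTIONAL_MODULES` and `addModulesArg` are assumptions for illustration). Modules requested for every test run are always listed, while `jdk.incubator.vector` is added only for the exact feature version it is enabled for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class RuntimeFeaturesSketch {
  // always requested for tests, as in the existing jvmArgs line
  static final List<String> BASE_MODULES = List.of("jdk.unsupported", "jdk.management");

  // hypothetical table: Java feature version -> extra modules enabled for exactly that version
  static final Map<Integer, List<String>> OPTIONAL_MODULES =
      Map.of(20, List.of("jdk.incubator.vector"));

  /** Builds the value passed after --add-modules for the given runtime feature version. */
  static String addModulesArg(int featureVersion) {
    List<String> modules = new ArrayList<>(BASE_MODULES);
    modules.addAll(OPTIONAL_MODULES.getOrDefault(featureVersion, List.of()));
    return String.join(",", modules);
  }
}
```

Supporting another runtime would then mean adding one entry to the table instead of editing the argument string.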





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1554917991

   I fixed it and merged the main branch into this one. Proceeding with fixing API generator to exclude unreferenced, private classes




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199236929


##########
lucene/core/src/java/org/apache/lucene/internal/vector/VectorUtilProvider.java:
##########
@@ -76,4 +77,10 @@ static boolean vectorModulePresentAndReadable() {
     }
     return false;
   }
+
+  // Workaround for JDK-8301190, avoids assertion when default locale is say tr.
+  @SuppressForbidden(reason = "required to determine if non-workable locale")
+  static boolean useVectorAPI() {
+    return 'I' == int.class.getSimpleName().toUpperCase().charAt(0);

Review Comment:
   Yeah, sorry for this - I was a bit distracted!  I just committed your suggestion, thanks. [0779fcb](https://github.com/apache/lucene/pull/12311/commits/0779fcb801cc668a943bb5bbca44df5272321f59)
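
For readers unfamiliar with the underlying issue, the following self-contained snippet (class and method names are illustrative) shows why the check in `useVectorAPI()` detects a problematic locale: under a Turkish locale, upper-casing `int` starts with a dotted capital İ (U+0130) instead of ASCII `I`, so the first character no longer equals `'I'`.

```java
import java.util.Locale;

final class TurkishUpperCaseDemo {
  /** Same comparison as useVectorAPI(), but with the locale passed explicitly. */
  static boolean upperCasesToAsciiI(Locale locale) {
    return 'I' == int.class.getSimpleName().toUpperCase(locale).charAt(0);
  }

  public static void main(String[] args) {
    System.out.println(upperCasesToAsciiI(Locale.ROOT));
    System.out.println(upperCasesToAsciiI(Locale.forLanguageTag("tr")));
  }
}
```

Run as written, `main` should print `true` followed by `false`.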





[GitHub] [lucene] uschindler commented on pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1553497986

   Is the vector FMA also always slow (does it use BigDecimal, too)?




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1553583461

   > I'd prefer to have separate apijars, because the current code compiles with patching base module.
   > On the other hand: it just works! 😉
   
   Yeah, this is a bit of a hack! It would be better to separate these out, but then what would we do: still patch it into java.base, or build a slim module around it, or something else? That doesn't feel any better than just patching it all into java.base!




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198879166


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,10 +119,13 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
+
+      // Disable assertions to workaround JDK-8301190
+      jvmArgs '-da:jdk.incubator.vector.LaneType'

Review Comment:
   I'm now questioning how safe it is to just suppress this. :-( 





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557144955

   Hi,
   
   > I didn't get an anywhere with Luceneutil yet! :-( (I haven't been able to run it successfully, getting OOM errors )
   
   Did you get the OOMs only with our vector code? If it OOMs also with current code then you might need to tune Xmx for the large dataset.
   
   If it OOMs with the vector code, it is the same thing that I have seen with Panama Foreign in Java 16/17. The reason was that the default settings of Mike's tool passed something like `-Xbatch` and disabled tiered compilation. This caused escape analysis to be executed much later than expected. As the Panama Foreign code in the past created new MemorySegment instances (to produce shapes/slices) just to copy a few bytes, this produced millions of new instances. This was solved by Maurizio by adding System.arraycopy-like copy methods to MemorySegment.
   
   For vectors it looks like we really create a lot of objects. When you run the searches in parallel with many threads in the benchmark, it may also fill the heap faster than the GC can clean it up or before the optimizer kicks in.
   
   P.S.: This could also be the reason for the slowdown I mentioned above: in the test suite we also disable tiered compilation by default for performance reasons when just running unit tests. But it is bad for tests doing a lot of work.




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201706152


##########
lucene/core/src/java20/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##########
@@ -0,0 +1,455 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import jdk.incubator.vector.ByteVector;
+import jdk.incubator.vector.FloatVector;
+import jdk.incubator.vector.IntVector;
+import jdk.incubator.vector.ShortVector;
+import jdk.incubator.vector.Vector;
+import jdk.incubator.vector.VectorOperators;
+import jdk.incubator.vector.VectorShape;
+import jdk.incubator.vector.VectorSpecies;
+
+/** A VectorUtil provider implementation that leverages the Panama Vector API. */
+final class VectorUtilPanamaProvider implements VectorUtilProvider {
+
+  static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;
+
+  VectorUtilPanamaProvider() {}
+
+  @Override
+  public float dotProduct(float[] a, float[] b) {
+    int i = 0;
+    float res = 0;
+    // if the array size is large (> 2x platform vector size), it's worth the overhead to vectorize
+    if (a.length > 2 * SPECIES.length()) {
+      // vector loop is unrolled 4x (4 accumulators in parallel)
+      FloatVector acc1 = FloatVector.zero(SPECIES);
+      FloatVector acc2 = FloatVector.zero(SPECIES);
+      FloatVector acc3 = FloatVector.zero(SPECIES);
+      FloatVector acc4 = FloatVector.zero(SPECIES);
+      int upperBound = SPECIES.loopBound(a.length - 3 * SPECIES.length());
+      for (; i < upperBound; i += 4 * SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        acc1 = acc1.add(va.mul(vb));
+        FloatVector vc = FloatVector.fromArray(SPECIES, a, i + SPECIES.length());
+        FloatVector vd = FloatVector.fromArray(SPECIES, b, i + SPECIES.length());
+        acc2 = acc2.add(vc.mul(vd));
+        FloatVector ve = FloatVector.fromArray(SPECIES, a, i + 2 * SPECIES.length());
+        FloatVector vf = FloatVector.fromArray(SPECIES, b, i + 2 * SPECIES.length());
+        acc3 = acc3.add(ve.mul(vf));
+        FloatVector vg = FloatVector.fromArray(SPECIES, a, i + 3 * SPECIES.length());
+        FloatVector vh = FloatVector.fromArray(SPECIES, b, i + 3 * SPECIES.length());
+        acc4 = acc4.add(vg.mul(vh));
+      }
+      // vector tail: less scalar computations for unaligned sizes, esp with big vector sizes
+      upperBound = SPECIES.loopBound(a.length);
+      for (; i < upperBound; i += SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        acc1 = acc1.add(va.mul(vb));
+      }
+      // reduce
+      FloatVector res1 = acc1.add(acc2);
+      FloatVector res2 = acc3.add(acc4);
+      res += res1.add(res2).reduceLanes(VectorOperators.ADD);
+    }
+
+    for (; i < a.length; i++) {
+      res += b[i] * a[i];
+    }
+    return res;
+  }
+
+  @Override
+  public float cosine(float[] a, float[] b) {
+    int i = 0;
+    float sum = 0;
+    float norm1 = 0;
+    float norm2 = 0;
+    // if the array size is large (> 2x platform vector size), it's worth the overhead to vectorize
+    if (a.length > 2 * SPECIES.length()) {
+      // vector loop is unrolled 4x (4 accumulators in parallel)
+      FloatVector sum1 = FloatVector.zero(SPECIES);
+      FloatVector sum2 = FloatVector.zero(SPECIES);
+      FloatVector sum3 = FloatVector.zero(SPECIES);
+      FloatVector sum4 = FloatVector.zero(SPECIES);
+      FloatVector norm1_1 = FloatVector.zero(SPECIES);
+      FloatVector norm1_2 = FloatVector.zero(SPECIES);
+      FloatVector norm1_3 = FloatVector.zero(SPECIES);
+      FloatVector norm1_4 = FloatVector.zero(SPECIES);
+      FloatVector norm2_1 = FloatVector.zero(SPECIES);
+      FloatVector norm2_2 = FloatVector.zero(SPECIES);
+      FloatVector norm2_3 = FloatVector.zero(SPECIES);
+      FloatVector norm2_4 = FloatVector.zero(SPECIES);
+      int upperBound = SPECIES.loopBound(a.length - 3 * SPECIES.length());
+      for (; i < upperBound; i += 4 * SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        sum1 = sum1.add(va.mul(vb));
+        norm1_1 = norm1_1.add(va.mul(va));
+        norm2_1 = norm2_1.add(vb.mul(vb));
+        FloatVector vc = FloatVector.fromArray(SPECIES, a, i + SPECIES.length());
+        FloatVector vd = FloatVector.fromArray(SPECIES, b, i + SPECIES.length());
+        sum2 = sum2.add(vc.mul(vd));
+        norm1_2 = norm1_2.add(vc.mul(vc));
+        norm2_2 = norm2_2.add(vd.mul(vd));
+        FloatVector ve = FloatVector.fromArray(SPECIES, a, i + 2 * SPECIES.length());
+        FloatVector vf = FloatVector.fromArray(SPECIES, b, i + 2 * SPECIES.length());
+        sum3 = sum3.add(ve.mul(vf));
+        norm1_3 = norm1_3.add(ve.mul(ve));
+        norm2_3 = norm2_3.add(vf.mul(vf));
+        FloatVector vg = FloatVector.fromArray(SPECIES, a, i + 3 * SPECIES.length());
+        FloatVector vh = FloatVector.fromArray(SPECIES, b, i + 3 * SPECIES.length());
+        sum4 = sum4.add(vg.mul(vh));
+        norm1_4 = norm1_4.add(vg.mul(vg));
+        norm2_4 = norm2_4.add(vh.mul(vh));
+      }
+      // vector tail: less scalar computations for unaligned sizes, esp with big vector sizes
+      upperBound = SPECIES.loopBound(a.length);
+      for (; i < upperBound; i += SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        sum1 = sum1.add(va.mul(vb));
+        norm1_1 = norm1_1.add(va.mul(va));
+        norm2_1 = norm2_1.add(vb.mul(vb));
+      }
+      // reduce
+      FloatVector sumres1 = sum1.add(sum2);
+      FloatVector sumres2 = sum3.add(sum4);
+      FloatVector norm1res1 = norm1_1.add(norm1_2);
+      FloatVector norm1res2 = norm1_3.add(norm1_4);
+      FloatVector norm2res1 = norm2_1.add(norm2_2);
+      FloatVector norm2res2 = norm2_3.add(norm2_4);
+      sum += sumres1.add(sumres2).reduceLanes(VectorOperators.ADD);
+      norm1 += norm1res1.add(norm1res2).reduceLanes(VectorOperators.ADD);
+      norm2 += norm2res1.add(norm2res2).reduceLanes(VectorOperators.ADD);
+    }
+
+    for (; i < a.length; i++) {
+      float elem1 = a[i];
+      float elem2 = b[i];
+      sum += elem1 * elem2;
+      norm1 += elem1 * elem1;
+      norm2 += elem2 * elem2;
+    }
+    return (float) (sum / Math.sqrt(norm1 * norm2));
+  }
+
+  @Override
+  public float squareDistance(float[] a, float[] b) {
+    int i = 0;
+    float res = 0;
+    // if the array size is large (> 2x platform vector size), it's worth the overhead to vectorize
+    if (a.length > 2 * SPECIES.length()) {
+      // vector loop is unrolled 4x (4 accumulators in parallel)
+      FloatVector acc1 = FloatVector.zero(SPECIES);
+      FloatVector acc2 = FloatVector.zero(SPECIES);
+      FloatVector acc3 = FloatVector.zero(SPECIES);
+      FloatVector acc4 = FloatVector.zero(SPECIES);
+      int upperBound = SPECIES.loopBound(a.length - 3 * SPECIES.length());
+      for (; i < upperBound; i += 4 * SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        FloatVector diff1 = va.sub(vb);
+        acc1 = acc1.add(diff1.mul(diff1));
+        FloatVector vc = FloatVector.fromArray(SPECIES, a, i + SPECIES.length());
+        FloatVector vd = FloatVector.fromArray(SPECIES, b, i + SPECIES.length());
+        FloatVector diff2 = vc.sub(vd);
+        acc2 = acc2.add(diff2.mul(diff2));
+        FloatVector ve = FloatVector.fromArray(SPECIES, a, i + 2 * SPECIES.length());
+        FloatVector vf = FloatVector.fromArray(SPECIES, b, i + 2 * SPECIES.length());
+        FloatVector diff3 = ve.sub(vf);
+        acc3 = acc3.add(diff3.mul(diff3));
+        FloatVector vg = FloatVector.fromArray(SPECIES, a, i + 3 * SPECIES.length());
+        FloatVector vh = FloatVector.fromArray(SPECIES, b, i + 3 * SPECIES.length());
+        FloatVector diff4 = vg.sub(vh);
+        acc4 = acc4.add(diff4.mul(diff4));
+      }
+      // vector tail: less scalar computations for unaligned sizes, esp with big vector sizes
+      upperBound = SPECIES.loopBound(a.length);
+      for (; i < upperBound; i += SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        FloatVector diff = va.sub(vb);
+        acc1 = acc1.add(diff.mul(diff));
+      }
+      // reduce
+      FloatVector res1 = acc1.add(acc2);
+      FloatVector res2 = acc3.add(acc4);
+      res += res1.add(res2).reduceLanes(VectorOperators.ADD);
+    }
+
+    for (; i < a.length; i++) {
+      float diff = a[i] - b[i];
+      res += diff * diff;
+    }
+    return res;
+  }
+
+  // Binary functions, these all follow a general pattern like this:
+  //
+  //   short intermediate = a * b;
+  //   int accumulator = accumulator + intermediate;
+  //
+  // 256 or 512 bit vectors can process 64 or 128 bits at a time, respectively
+  // intermediate results use 128 or 256 bit vectors, respectively
+  // final accumulator uses 256 or 512 bit vectors, respectively
+  //
+  // We also support 128 bit vectors, using two 128 bit accumulators.
+  // This is slower but still faster than not vectorizing at all.
+
+  static final VectorSpecies<Byte> PREFERRED_BYTE_SPECIES;
+  static final VectorSpecies<Short> PREFERRED_SHORT_SPECIES;
+
+  static {
+    if (IntVector.SPECIES_PREFERRED.vectorBitSize() >= 256) {
+      PREFERRED_BYTE_SPECIES =
+          ByteVector.SPECIES_MAX.withShape(
+              VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 2));
+      PREFERRED_SHORT_SPECIES =
+          ShortVector.SPECIES_MAX.withShape(
+              VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 1));
+    } else {
+      PREFERRED_BYTE_SPECIES = null;
+      PREFERRED_SHORT_SPECIES = null;
+    }
+  }
+
+  static final int INT_SPECIES_PREFERRED_BIT_SIZE = IntVector.SPECIES_PREFERRED.vectorBitSize();
+
+  @Override
+  public int dotProduct(byte[] a, byte[] b) {
+    int i = 0;
+    int res = 0;
+    // only vectorize if we'll at least enter the loop a single time, and we have at least 128-bit
+    // vectors
+    if (a.length >= 16 && INT_SPECIES_PREFERRED_BIT_SIZE >= 128) {
+      // compute vectorized dot product consistent with VPDPBUSD instruction, acts like:
+      // int sum = 0;
+      // for (...) {
+      //   short product = (short) (x[i] * y[i]);
+      //   sum += product;
+      // }
+      if (INT_SPECIES_PREFERRED_BIT_SIZE >= 256) {
+        // optimized 256/512 bit implementation, processes 8/16 bytes at a time
+        int upperBound = PREFERRED_BYTE_SPECIES.loopBound(a.length);
+        IntVector acc = IntVector.zero(IntVector.SPECIES_PREFERRED);
+        for (; i < upperBound; i += PREFERRED_BYTE_SPECIES.length()) {
+          ByteVector va8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, a, i);
+          ByteVector vb8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, b, i);
+          Vector<Short> va16 = va8.convertShape(VectorOperators.B2S, PREFERRED_SHORT_SPECIES, 0);
+          Vector<Short> vb16 = vb8.convertShape(VectorOperators.B2S, PREFERRED_SHORT_SPECIES, 0);
+          Vector<Short> prod16 = va16.mul(vb16);
+          Vector<Integer> prod32 =
+              prod16.convertShape(VectorOperators.S2I, IntVector.SPECIES_PREFERRED, 0);
+          acc = acc.add(prod32);
+        }
+        // reduce
+        res += acc.reduceLanes(VectorOperators.ADD);
+      } else {
+        // 128-bit implementation, which must "split up" vectors due to widening conversions
+        int upperBound = ByteVector.SPECIES_64.loopBound(a.length);
+        IntVector acc1 = IntVector.zero(IntVector.SPECIES_128);
+        IntVector acc2 = IntVector.zero(IntVector.SPECIES_128);
+        for (; i < upperBound; i += ByteVector.SPECIES_64.length()) {
+          ByteVector va8 = ByteVector.fromArray(ByteVector.SPECIES_64, a, i);
+          ByteVector vb8 = ByteVector.fromArray(ByteVector.SPECIES_64, b, i);
+          // expand each byte vector into short vector and multiply
+          Vector<Short> va16 = va8.convertShape(VectorOperators.B2S, ShortVector.SPECIES_128, 0);
+          Vector<Short> vb16 = vb8.convertShape(VectorOperators.B2S, ShortVector.SPECIES_128, 0);
+          Vector<Short> prod16 = va16.mul(vb16);
+          // split each short vector into two int vectors and add
+          Vector<Integer> prod32_1 =
+              prod16.convertShape(VectorOperators.S2I, IntVector.SPECIES_128, 0);
+          Vector<Integer> prod32_2 =
+              prod16.convertShape(VectorOperators.S2I, IntVector.SPECIES_128, 1);
+          acc1 = acc1.add(prod32_1);
+          acc2 = acc2.add(prod32_2);
+        }
+        // reduce
+        res += acc1.add(acc2).reduceLanes(VectorOperators.ADD);
+      }
+    }
+
+    for (; i < a.length; i++) {
+      res += b[i] * a[i];
+    }
+    return res;
+  }
+
+  @Override
+  public float cosine(byte[] a, byte[] b) {
+    int i = 0;
+    int sum = 0;
+    int norm1 = 0;
+    int norm2 = 0;
+    // only vectorize if we'll at least enter the loop a single time, and we have at least 128-bit
+    // vectors
+    if (a.length >= 16 && INT_SPECIES_PREFERRED_BIT_SIZE >= 128) {
+      // acts like:
+      // int sum = 0;
+      // for (...) {
+      //   short difference = (short) (x[i] - y[i]);
+      //   sum += (int) difference * (int) difference;
+      // }

Review Comment:
   This comment block is incorrect. I was going to fix it, since I kinda like the outline of the scalar code before reading the vectorized version, but then remembered that the tail of each of these will be exactly this. I like these comments, but they are mostly superfluous.
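   
   For reference, a scalar outline of what the byte `cosine` computes might look like the sketch below. It follows the same shape as the scalar tail loop of the float `cosine` earlier in this file, with int accumulators as declared at the top of the byte method. The class name `ScalarByteCosineSketch` is hypothetical and this is not code from the PR.
   
   ```java
   final class ScalarByteCosineSketch {
     private ScalarByteCosineSketch() {}

     /** Cosine similarity of two signed-byte vectors, accumulated in ints. */
     static float cosine(byte[] a, byte[] b) {
       int sum = 0;
       int norm1 = 0;
       int norm2 = 0;
       for (int i = 0; i < a.length; i++) {
         // bytes are widened to int before multiplying, so products cannot overflow
         int elem1 = a[i];
         int elem2 = b[i];
         sum += elem1 * elem2;
         norm1 += elem1 * elem1;
         norm2 += elem2 * elem2;
       }
       // widen to double so the product of the two norms does not overflow int
       return (float) (sum / Math.sqrt((double) norm1 * (double) norm2));
     }
   }
   ```
   
   Unlike the commented outline in the diff, there is no subtraction here: the loop tracks one dot product and two squared norms.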





[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559929746

   > Sorry for again changing. I was not happy with the bitsize code in the lookup function. I used a very simple approach: The constructor of the provider throws UnsupportedOperationException with a message. The provider lookup catches this exception and falls back to default provider. This looks cleanest to me.
   
   Thanks! Do you think we should nuke related 128-bit checks in the binary methods of the file? Or replace them with asserts?
   
   Example:
   https://github.com/apache/lucene/blob/f9711e9a30f3e4be6cf237091b6642b222178325/lucene/core/src/java20/org/apache/lucene/util/VectorUtilPanamaProvider.java#L273
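   
   A minimal sketch of the pattern described in the quote (the constructor rejects unsupported hardware, the lookup catches that and falls back) could look like this. All names here (`ProviderFallbackSketch`, `Provider`, `ScalarProvider`, `WideVectorProvider`) are hypothetical, and the preferred bit size is passed in as a parameter instead of being read from `IntVector.SPECIES_PREFERRED`, so the sketch runs without the incubator module.
   
   ```java
   final class ProviderFallbackSketch {
     interface Provider {
       String name();
     }

     static final class ScalarProvider implements Provider {
       public String name() {
         return "scalar";
       }
     }

     static final class WideVectorProvider implements Provider {
       WideVectorProvider(int preferredBitSize) {
         // the single place where the minimum vector size is enforced
         if (preferredBitSize < 128) {
           throw new UnsupportedOperationException(
               "Vector bit size is less than 128: " + preferredBitSize);
         }
       }

       public String name() {
         return "vectorized";
       }
     }

     /** Tries the vectorized provider first and falls back to the scalar one. */
     static Provider lookup(int preferredBitSize) {
       try {
         return new WideVectorProvider(preferredBitSize);
       } catch (UnsupportedOperationException uoe) {
         return new ScalarProvider();
       }
     }
   }
   ```
   
   With the size check centralized in the constructor, a per-method `>= 128` test can never be false on a constructed provider, which is the argument for turning those checks into asserts.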




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1203651668


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java:
##########
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.security.AccessController;
+import java.security.PrivilegedAction;
+import java.util.Locale;
+import java.util.Objects;
+import java.util.logging.Logger;
+
+/** A provider of VectorUtil implementations. */
+interface VectorUtilProvider {
+
+  /** Calculates the dot product of the given float arrays. */
+  float dotProduct(float[] a, float[] b);
+
+  /** Returns the cosine similarity between the two vectors. */
+  float cosine(float[] v1, float[] v2);
+
+  /** Returns the sum of squared differences of the two vectors. */
+  float squareDistance(float[] a, float[] b);
+
+  /** Returns the dot product computed over signed bytes. */
+  int dotProduct(byte[] a, byte[] b);
+
+  /** Returns the cosine similarity between the two byte vectors. */
+  float cosine(byte[] a, byte[] b);
+
+  /** Returns the sum of squared differences of the two byte vectors. */
+  int squareDistance(byte[] a, byte[] b);
+
+  // -- provider lookup mechanism
+
+  static final Logger LOG = Logger.getLogger(VectorUtilProvider.class.getName());
+
+  static VectorUtilProvider lookup() {
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20) {
+      // is locale sane (only buggy in Java 20)
+      if (runtimeVersion <= 20 && !hasWorkingDefaultLocale()) {
+        LOG.warning(
+            "Java runtime is using a buggy default locale; Java vector incubator API can't be enabled: "
+                + Locale.getDefault());
+        return new VectorUtilDefaultProvider();
+      }
+      // is the incubator module present and readable (JVM providers may exclude it, or the
+      // runtime may be built with jlink)
+      if (!vectorModulePresentAndReadable()) {
+        LOG.warning(
+            "Java vector incubator module is not readable. For optimal vector performance, pass '--add-modules jdk.incubator.vector' to enable Vector API.");
+        return new VectorUtilDefaultProvider();
+      }
+      if (isClientVM()) {
+        LOG.warning("C2 compiler is disabled; Java vector incubator API can't be enabled");
+        return new VectorUtilDefaultProvider();
+      }
+      try {
+        // we use method handles with lookup, so we do not need to deal with setAccessible as we
+        // have private access through the lookup:
+        final var lookup = MethodHandles.lookup();
+        final var cls = lookup.findClass("org.apache.lucene.util.VectorUtilPanamaProvider");
+        final var constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
+        try {
+          return (VectorUtilProvider) constr.invoke();
+        } catch (UnsupportedOperationException uoe) {
+          // not supported because preferred vector size too small or similar
+          LOG.warning("Java vector incubator API was not enabled. " + uoe.getMessage());
+          return new VectorUtilDefaultProvider();
+        } catch (RuntimeException | Error e) {
+          throw e;
+        } catch (Throwable th) {
+          throw new AssertionError(th);
+        }
+      } catch (NoSuchMethodException | IllegalAccessException e) {
+        throw new LinkageError(
+            "VectorUtilPanamaProvider is missing correctly typed constructor", e);
+      } catch (ClassNotFoundException cnfe) {
+        throw new LinkageError("VectorUtilPanamaProvider is missing in Lucene JAR file", cnfe);
+      }
+    } else if (runtimeVersion >= 21) {
+      LOG.warning(
+          "You are running with Java 21 or later. To make full use of the Vector API, please update Apache Lucene.");
+    }
+    return new VectorUtilDefaultProvider();
+  }
+
+  private static boolean vectorModulePresentAndReadable() {
+    var opt =
+        ModuleLayer.boot().modules().stream()
+            .filter(m -> m.getName().equals("jdk.incubator.vector"))
+            .findFirst();
+    if (opt.isPresent()) {
+      VectorUtilProvider.class.getModule().addReads(opt.get());
+      return true;
+    }
+    return false;
+  }
+
+  // Workaround for JDK-8301190, avoids assertion when default locale is say tr.
+  private static boolean hasWorkingDefaultLocale() {
+    return Runtime.version().update() > 1

Review Comment:
   All fine. I have code here that uses `Version.parse("20.0.2")` and compares it to the actual runtime version. I moved all checks into a separate method, so they are all in the same place.
   
   I can commit this.
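   
   As an illustration of that comparison, a sketch using the standard `Runtime.Version` API might look like the following. The class and method names (`RuntimeVersionCheckSketch`, `isAtLeast`, `runtimeIsAtLeast`) are hypothetical and this is not the committed code.
   
   ```java
   final class RuntimeVersionCheckSketch {
     private RuntimeVersionCheckSketch() {}

     /** True if {@code actual} is the same as or newer than {@code minimum}, e.g. "20.0.2". */
     static boolean isAtLeast(Runtime.Version actual, String minimum) {
       // ignore the optional build information when comparing
       return actual.compareToIgnoreOptional(Runtime.Version.parse(minimum)) >= 0;
     }

     /** Convenience wrapper that checks the running JVM. */
     static boolean runtimeIsAtLeast(String minimum) {
       return isAtLeast(Runtime.version(), minimum);
     }
   }
   ```
   
   Comparing full version objects avoids reasoning about the feature and update components separately, as the `Runtime.version().update() > 1` expression in the diff has to.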





[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1203861128


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java:
##########
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.security.AccessController;
+import java.security.PrivilegedAction;
+import java.util.Locale;
+import java.util.Objects;
+import java.util.logging.Logger;
+
+/** A provider of VectorUtil implementations. */
+interface VectorUtilProvider {
+
+  /** Calculates the dot product of the given float arrays. */
+  float dotProduct(float[] a, float[] b);
+
+  /** Returns the cosine similarity between the two vectors. */
+  float cosine(float[] v1, float[] v2);
+
+  /** Returns the sum of squared differences of the two vectors. */
+  float squareDistance(float[] a, float[] b);
+
+  /** Returns the dot product computed over signed bytes. */
+  int dotProduct(byte[] a, byte[] b);
+
+  /** Returns the cosine similarity between the two byte vectors. */
+  float cosine(byte[] a, byte[] b);
+
+  /** Returns the sum of squared differences of the two byte vectors. */
+  int squareDistance(byte[] a, byte[] b);
+
+  // -- provider lookup mechanism
+
+  static final Logger LOG = Logger.getLogger(VectorUtilProvider.class.getName());
+
+  static VectorUtilProvider lookup() {
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20) {
+      // is locale sane (only buggy in Java 20)
+      if (runtimeVersion <= 20 && !hasWorkingDefaultLocale()) {
+        LOG.warning(
+            "Java runtime is using a buggy default locale; Java vector incubator API can't be enabled: "
+                + Locale.getDefault());
+        return new VectorUtilDefaultProvider();
+      }
+      // is the incubator module present and readable (JVM providers may exclude it, or the
+      // runtime may be built with jlink)
+      if (!vectorModulePresentAndReadable()) {
+        LOG.warning(
+            "Java vector incubator module is not readable. For optimal vector performance, pass '--add-modules jdk.incubator.vector' to enable Vector API.");
+        return new VectorUtilDefaultProvider();
+      }
+      if (isClientVM()) {
+        LOG.warning("C2 compiler is disabled; Java vector incubator API can't be enabled");
+        return new VectorUtilDefaultProvider();
+      }
+      try {
+        // we use method handles with lookup, so we do not need to deal with setAccessible as we
+        // have private access through the lookup:
+        final var lookup = MethodHandles.lookup();
+        final var cls = lookup.findClass("org.apache.lucene.util.VectorUtilPanamaProvider");
+        final var constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
+        try {
+          return (VectorUtilProvider) constr.invoke();
+        } catch (UnsupportedOperationException uoe) {
+          // not supported because preferred vector size too small or similar
+          LOG.warning("Java vector incubator API was not enabled. " + uoe.getMessage());
+          return new VectorUtilDefaultProvider();
+        } catch (RuntimeException | Error e) {
+          throw e;
+        } catch (Throwable th) {
+          throw new AssertionError(th);
+        }
+      } catch (NoSuchMethodException | IllegalAccessException e) {
+        throw new LinkageError(
+            "VectorUtilPanamaProvider is missing correctly typed constructor", e);
+      } catch (ClassNotFoundException cnfe) {
+        throw new LinkageError("VectorUtilPanamaProvider is missing in Lucene JAR file", cnfe);
+      }
+    } else if (runtimeVersion >= 21) {
+      LOG.warning(
+          "You are running with Java 21 or later. To make full use of the Vector API, please update Apache Lucene.");
+    }
+    return new VectorUtilDefaultProvider();
+  }
+
+  private static boolean vectorModulePresentAndReadable() {
+    var opt =
+        ModuleLayer.boot().modules().stream()
+            .filter(m -> m.getName().equals("jdk.incubator.vector"))
+            .findFirst();
+    if (opt.isPresent()) {
+      VectorUtilProvider.class.getModule().addReads(opt.get());
+      return true;
+    }
+    return false;
+  }
+
+  // Workaround for JDK-8301190, avoids assertion when default locale is say tr.
+  private static boolean hasWorkingDefaultLocale() {
+    return Runtime.version().update() > 1

Review Comment:
   Done in https://github.com/apache/lucene/pull/12311/commits/e3ea49fa8ef016d17efb61b863e607f25ab30d88





[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557985868

   and here's my m1 for this one:
   ```
   Benchmark                                (size)   Mode  Cnt    Score   Error   Units
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5    6.142 ± 0.019  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5    3.108 ± 0.003  ops/us
   ```




[GitHub] [lucene] rmuir commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201265341


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs two optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   I'm not sure we should add this to jvm args by default?
   
   Currently we run our tests with C2 disabled (like any sane person): massive speedup to the build. Otherwise lucene's test suite is basically just a test of C2.
   
   CI builds (like policeman jenkins) can run by overriding JVM args so they get C2 and different garbage collectors and all that. They could explicitly add this safely?
   
   Unfortunately the vector API is unusably slow without C2: like it will never finish. Really they should just throw an exception rather than fall back to insanely slow stuff, or do something different here. The "fallback" scalar implementation is slower than a Galapagos turtle.
   
   I guess to summarize, it isn't enough to just add `--add-modules jdk.incubator.vector`, you need to turn on C2 as well or the tests will never finish. So I think we shouldn't automagically do it?
   
   Or maybe on our side, we can actually detect that C2 is disabled in our provider mechanism (similar to the Locale check)? We should protect our users and developers from the horrorshow of the "fallback" performance of the vector api and just use scalar functions.
   
   As a workaround, I added the following to my `~/.gradle/gradle.properties`:
   ```
   # enables c2 (slow!) for vector stuff
   tests.jvmargs=-XX:+UseParallelGC -XX:ActiveProcessorCount=1
   ```
   
   It makes my tests run twice as slow for this branch but at least they finish before our sun goes supernova, so I can run them before pushing.
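   
   To make the "detect that C2 is disabled" idea concrete, here is a rough sketch. It assumes that HotSpot reports `emulated-client` in the `java.vm.info` system property when C2 is unavailable (for example under `-XX:TieredStopAtLevel=1`); other JVMs may report nothing useful, so treat it as a heuristic. The names `ClientVmCheckSketch` and `looksLikeClientVM` are hypothetical.
   
   ```java
   final class ClientVmCheckSketch {
     private ClientVmCheckSketch() {}

     /** Pure helper so the string test can be exercised without changing JVM flags. */
     static boolean looksLikeClientVM(String vmInfo) {
       // assumption: HotSpot adds "emulated-client" to java.vm.info when C2 is off
       return vmInfo != null && vmInfo.contains("emulated-client");
     }

     /** Applies the heuristic to the running JVM. */
     static boolean isClientVM() {
       return looksLikeClientVM(System.getProperty("java.vm.info", ""));
     }
   }
   ```
   
   A provider lookup could call `isClientVM()` before loading the vectorized implementation and return the scalar provider when it is true.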





[GitHub] [lucene] msokolov commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "msokolov (via GitHub)" <gi...@apache.org>.
msokolov commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561247988

   I ran luceneutil with GloVe 300-dim floating point (fp32) vectors over 1M wikipedia documents:
   
   ```
                            TaskQPS baseline      StdDevQPS candidate      StdDev                Pct diff p-value
                           PKLookup      196.01      (3.8%)      192.14      (3.8%)   -2.0% (  -9% -    5%) 0.099
                      LowTermVector      213.57      (7.2%)      252.31      (3.6%)   18.1% (   6% -   31%) 0.000
                   AndHighLowVector      185.28      (6.8%)      221.08      (3.5%)   19.3% (   8% -   31%) 0.000
                   AndHighMedVector      125.91      (5.7%)      152.52      (2.5%)   21.1% (  12% -   31%) 0.000
                     HighTermVector      171.95      (7.3%)      208.94      (3.3%)   21.5% (  10% -   34%) 0.000
                  AndHighHighVector      123.87      (5.0%)      151.81      (2.9%)   22.6% (  14% -   32%) 0.000
                      MedTermVector      119.07      (7.5%)      148.07      (2.8%)   24.4% (  13% -   37%) 0.000
   ```
   
   and with GloVe 100-dim 8-bit vectors
   
   ```
                            TaskQPS baseline      StdDevQPS candidate      StdDev                Pct diff p-value
                           PKLookup      190.59      (7.4%)      193.25      (5.1%)    1.4% ( -10% -   14%) 0.486
                      LowTermVector      291.71     (24.0%)      341.91     (14.3%)   17.2% ( -17% -   73%) 0.006
                   AndHighMedVector      230.40     (22.6%)      274.26     (13.0%)   19.0% ( -13% -   70%) 0.001
                      MedTermVector      245.36     (22.7%)      292.35     (11.9%)   19.2% ( -12% -   69%) 0.001
                     HighTermVector      296.45     (25.6%)      357.02      (9.8%)   20.4% ( -11% -   75%) 0.001
                   AndHighLowVector      252.70     (23.2%)      308.05     (13.7%)   21.9% ( -12% -   76%) 0.000
                  AndHighHighVector      150.54     (21.0%)      185.45     (13.4%)   23.2% (  -9% -   72%) 0.00
   ```
   
   I also tried getting some vectors using a different model that produces 384-dim fp32 vectors (`all-MiniLM-L6-v2` from https://www.sbert.net/docs/pretrained_models.html). The methodology here is a bit sus because we compute embedding vectors per-word and then sum them over larger docs, whereas these models are really designed to be computed on larger passages so they can make use of word context. Still I think the performance measurements will be valid.
   
   ```
                            TaskQPS baseline      StdDevQPS candidate      StdDev                Pct diff p-value
                           PKLookup      173.59      (8.5%)      176.41      (5.7%)    1.6% ( -11% -   17%) 0.477
                  AndHighHighVector      309.15     (26.1%)      346.54     (18.1%)   12.1% ( -25% -   76%) 0.089
                      LowTermVector      305.52     (26.4%)      343.83     (15.9%)   12.5% ( -23% -   74%) 0.069
                      MedTermVector      312.58     (26.6%)      352.51     (18.5%)   12.8% ( -25% -   78%) 0.078
                     HighTermVector      300.84     (30.4%)      345.35     (18.8%)   14.8% ( -26% -   92%) 0.064
                   AndHighMedVector      303.15     (27.8%)      349.09     (18.2%)   15.2% ( -24% -   84%) 0.041
                   AndHighLowVector      233.11     (21.9%)      285.00     (12.5%)   22.3% (  -9% -   72%) 0.000 
   ```
   
   I was surprised this showed less improvement than the smaller vectors but there is a lot of noise in these benchmarks. I see the results vary quite a bit from run to run (even averaging over 20 JVMs). I'm currently training up some 768-dim vectors using `all-mpnet-base-v` and I'll see if I can get measurements from KnnGraphTester that should be more focused. These tests were run with 609fc9b63f61954a7408faa1669e807a6bbf1da9 so maybe a few commits back.




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557853835

   > > btw, i think there's a real bug in the SPECIES_PREFERRED stuff that makes testing such degenerate cases _really difficult_. You should be able to just pass `-XX:MaxVectorSize=8` or `-XX:UseAVX=0` or similar to test it out: but this won't change SPECIES_PREFERRED, only make it dog slow. its like it gets the wrong information from the compiler.
   > > the only way i know right now is to spin up QEMU and take cpu flags away :)
   > 
   > Let's open a bug report. I can do it in the openjdk bug tracker.
   
   For testing, it would certainly be incredibly useful to have downward configurability over the PREFERRED species.




[GitHub] [lucene] mikemccand commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "mikemccand (via GitHub)" <gi...@apache.org>.
mikemccand commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559159143

   I ran this on the `luceneutil` nightly benchmark box (`beast3`):
   
   This is an AMD Ryzen Threadripper 3990X 64-Core Processor, with this long list of CPU flags AND bugs:
   
   ```
   flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd amd_ppin arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca sev sev_es
   bugs            : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass retbleed smt_rsb
   ```
   
   ```
   # Run complete. Total time: 00:04:50
   
   Benchmark                                (size)   Mode  Cnt   Score   Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5   5.286 ± 0.005  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5   0.691 ± 0.021  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5  13.896 ± 0.009  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5   1.980 ± 0.027  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5  11.880 ± 0.011  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5   1.724 ± 0.018  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5  12.201 ± 0.260  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5   1.163 ± 0.015  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5  17.963 ± 0.111  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5   3.775 ± 0.347  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5  15.641 ± 0.092  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5   2.684 ± 0.014  ops/us
   ```




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1556277802

   Here are the results of running @rmuir's benchmark (exactly from https://github.com/rmuir/vectorbench, no edits to the code)
   
   Linux-x64, Intel Rocket Lake:  11th Gen Intel(R) Core(TM) i5-11400 @ 2.60GHz
   jdk.incubator.vector.FloatVector.SPECIES_PREFERRED  ==> Species[float, 16, S_512_BIT]
   ```
   jdk-20.0.1/bin/java -jar target/vectorbench.jar -psize=1024
   ...
   Benchmark                             (size)   Mode  Cnt   Score   Error   Units
   DotProductBenchmark.dotProductNew       1024  thrpt    5  22.412 ± 0.059  ops/us
   DotProductBenchmark.dotProductNewNew    1024  thrpt    5  19.027 ± 0.439  ops/us
   DotProductBenchmark.dotProductOld       1024  thrpt    5   3.476 ± 0.038  ops/us
   ```
   ---
   
   Apple M1, macOS 13.3.1
   jdk.incubator.vector.FloatVector.SPECIES_PREFERRED ==> Species[float, 4, S_128_BIT]
   ```
   jdk-20.0.1.jdk/Contents/Home/bin/java -jar target/vectorbench.jar -psize=1024
   ...
   Benchmark                             (size)   Mode  Cnt   Score   Error   Units
   DotProductBenchmark.dotProductNew       1024  thrpt    5   9.328 ± 0.027  ops/us
   DotProductBenchmark.dotProductNewNew    1024  thrpt    5  14.640 ± 0.101  ops/us
   DotProductBenchmark.dotProductOld       1024  thrpt    5   3.740 ± 0.042  ops/us
   ```




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1556405150

   and here's the results on my aarch64 mac, which has only 128-bit vectors and gets that disappointing generic impl:
   ```
   Benchmark                                (size)   Mode  Cnt    Score   Error   Units
   BinaryDotProductBenchmark.dotProductNew       1  thrpt    5  334.839 ± 0.368  ops/us
   BinaryDotProductBenchmark.dotProductNew     128  thrpt    5   34.097 ± 0.026  ops/us
   BinaryDotProductBenchmark.dotProductNew     207  thrpt    5   22.045 ± 0.063  ops/us
   BinaryDotProductBenchmark.dotProductNew     256  thrpt    5   18.782 ± 0.603  ops/us
   BinaryDotProductBenchmark.dotProductNew     300  thrpt    5   15.932 ± 0.049  ops/us
   BinaryDotProductBenchmark.dotProductNew     512  thrpt    5    9.985 ± 0.010  ops/us
   BinaryDotProductBenchmark.dotProductNew     702  thrpt    5    7.322 ± 0.001  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5    5.146 ± 0.002  ops/us
   BinaryDotProductBenchmark.dotProductOld       1  thrpt    5  343.722 ± 0.640  ops/us
   BinaryDotProductBenchmark.dotProductOld     128  thrpt    5   24.981 ± 0.102  ops/us
   BinaryDotProductBenchmark.dotProductOld     207  thrpt    5   14.944 ± 0.128  ops/us
   BinaryDotProductBenchmark.dotProductOld     256  thrpt    5   12.541 ± 0.006  ops/us
   BinaryDotProductBenchmark.dotProductOld     300  thrpt    5   10.663 ± 0.005  ops/us
   BinaryDotProductBenchmark.dotProductOld     512  thrpt    5    6.198 ± 0.019  ops/us
   BinaryDotProductBenchmark.dotProductOld     702  thrpt    5    4.531 ± 0.062  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5    3.108 ± 0.003  ops/us
   ```
   
   I'd be curious how we could implement this with better performance, especially if we can just have a single generic impl like the float one. I feel like I must be doing it wrong :)




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1584844845

   Currently in ES we have two potential ways to work around this - they are both crummy. :-( 
    1. Trigger initialisation of VectorUtils, and do a small operation, before executing a painless script, or
    2. Add a couple of permissions to the painless environment.
    
   Not sure if Lucene could/should have a workaround hack put in place. It would look something like: do some dummy vector operation in the Panama VectorUtil clinit, while asserting permissions (in a doPriv block). That would force the JDK's vector implementation to initialise the class where the JDK bug lives, while asserting the property permission. This assumes that Lucene is granted such a permission - it's a property read, which is not all that interesting.
   
   The above is kinda similar to workaround no. 1 above, which we're considering for ES (and yes, we can do it in ES, but it could bite other Lucene consumers if not done in Lucene)?
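   
   A rough sketch of the shape such a hack could take, assuming a class whose static initialiser runs a throwaway operation inside `doPrivileged`. `EagerVectorInit` and `dummyOperation` are hypothetical names, and the scalar loop is only a stand-in for the real Panama-backed call that would trigger the JDK's vector class initialisation:
   
   ```java
   import java.security.AccessController;
   import java.security.PrivilegedAction;
   
   final class EagerVectorInit {
     // Hypothetical stand-in for "some dummy vector operation"; the real workaround would
     // touch the Panama-backed code here so the JDK's vector classes initialise right away.
     static float dummyOperation() {
       float[] a = {1f, 2f, 3f, 4f};
       float sum = 0f;
       for (float v : a) {
         sum += v * v;
       }
       return sum;
     }
   
     static final float WARMUP_RESULT;
   
     static {
       // Runs once, during class initialisation, with this class's own permissions asserted,
       // so less-privileged callers further up the stack do not matter.
       PrivilegedAction<Float> action = EagerVectorInit::dummyOperation;
       WARMUP_RESULT = doPrivileged(action);
     }
   
     @SuppressWarnings("removal") // AccessController is deprecated for removal since JDK 17
     private static <T> T doPrivileged(PrivilegedAction<T> action) {
       return AccessController.doPrivileged(action);
     }
   }
   ```
   
   The point of doing it in clinit is ordering: the privileged initialisation happens before any script code gets a chance to be the first caller.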




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1584892632

   I implemented a workaround for the initialization failure, PR coming in a moment.




[GitHub] [lucene] markrmiller commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "markrmiller (via GitHub)" <gi...@apache.org>.
markrmiller commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1575177176

   To be fair, the claim was because you can’t have it because the efficiency
   cores don’t have it. But that’s a little suspect when as soon as some bios’
   let you enable it without the efficiency cores, they quickly put an end to
   that in silicon. I think only the recent zen 4 processors have had it with
   AMD. My 7950X does. I don’t know about those efficiency cores - I like them
   in my MacBook, but I think I’ll continue to avoid them in the desktop
   




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557873260

   > > > btw, i think there's a real bug in the SPECIES_PREFERRED stuff that makes testing such degenerate cases _really difficult_. You should be able to just pass `-XX:MaxVectorSize=8` or `-XX:UseAVX=0` or similar to test it out: but this won't change SPECIES_PREFERRED, only make it dog slow. its like it gets the wrong information from the compiler.
   > > > the only way i know right now is to spin up QEMU and take cpu flags away :)
   > > 
   > > 
   > > Let's open a bug report. I can do it in the openjdk bug tracker.
   > 
   > For testing, it would certainly be incredibly useful to have downward configurability over the PREFERRED species.
   
   Not only for testing. It is a bug. If somebody, for example, sets `UseAVX=2` like Elasticsearch does/did by default, the preferred species is still configured as 512 bits. As Hotspot can't handle this, it will use the fallback code and that's slow like hell.
   
   Actually we checked the code. It looks like the preferred species are dynamically initialized, but the VM option is not taken into account. I think it calls the wrong macro when it gets the maximum lane size.




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1558073763

   I pushed a BinaryCosine benchmark as well, also similar stuff, just a more complex formula:
   
   mac m1:
   ```
   Benchmark                                (size)   Mode  Cnt  Score   Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5  2.260 ± 0.005  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5  1.046 ± 0.002  ops/us
   ```
   
   skylake:
   ```
   Benchmark                                (size)   Mode  Cnt  Score   Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5  3.658 ± 0.045  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5  0.799 ± 0.011  ops/us
   ```
   
   I'll push these functions to the branch just to have more completeness and because there were zero surprises or anything to think about, just rote work




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201852487


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   ++ Sounds good. I wonder if it's even possible to run the basic TestVectorUtil twice - with both implementations? But that would require some gradle foo or something ?





[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201948213


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   > We can easily compare the outputs of both implementations  ... for floats with an epsilon.
   
   Yes, this is what I was thinking.
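   
   For illustration, a self-contained sketch of such a comparison. The two methods below are hypothetical stand-ins for the two providers: a plain scalar loop, and a four-accumulator loop that adds the products in a different order, which is the kind of difference a vectorised implementation introduces. The tolerance (`1e-5f` times the dimension) is an assumption made for this sketch, not a value taken from the PR:
   
   ```java
   import java.util.Random;
   
   final class ProviderComparison {
     // Straightforward scalar loop, standing in for the default provider.
     static float dotProductScalar(float[] a, float[] b) {
       float sum = 0f;
       for (int i = 0; i < a.length; i++) {
         sum += a[i] * b[i];
       }
       return sum;
     }
   
     // Four independent accumulators: same products, different summation order.
     static float dotProductUnrolled(float[] a, float[] b) {
       float s0 = 0f, s1 = 0f, s2 = 0f, s3 = 0f;
       int i = 0;
       int bound = a.length & ~3; // largest multiple of 4 that is <= length
       for (; i < bound; i += 4) {
         s0 += a[i] * b[i];
         s1 += a[i + 1] * b[i + 1];
         s2 += a[i + 2] * b[i + 2];
         s3 += a[i + 3] * b[i + 3];
       }
       float sum = (s0 + s1) + (s2 + s3);
       for (; i < a.length; i++) { // scalar tail
         sum += a[i] * b[i];
       }
       return sum;
     }
   
     static float[] randomVector(Random random, int size) {
       float[] v = new float[size];
       for (int i = 0; i < size; i++) {
         v[i] = random.nextFloat() * 2f - 1f; // uniform in [-1, 1)
       }
       return v;
     }
   
     // Floating-point results are compared with an epsilon rather than for equality.
     static boolean agree(float x, float y, float delta) {
       return Math.abs(x - y) <= delta;
     }
   
     public static void main(String[] args) {
       Random random = new Random(42);
       float[] a = randomVector(random, 300);
       float[] b = randomVector(random, 300);
       float delta = 1e-5f * a.length; // assumed tolerance, scaled by dimension
       System.out.println(agree(dotProductScalar(a, b), dotProductUnrolled(a, b), delta));
     }
   }
   ```
   
   The byte functions work on integers and need no epsilon; their outputs can be compared for exact equality.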





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557146310

   @msokolov Can you assist us with how to run Mike's luceneutil bench to get best insights to vector code? The default query benchmark has no support for vectors.




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557635741

   if you want to fix the correctness issue for the no-vectors-supported case, just add a guard that supported vector size is at least 128 bits. It must be at least 128 so that you can divide it by 2 and still have a valid species (there is no SPECIES_32, root of my problems here lol)




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557660515

   i think another approach would be to generalize the 256 algorithm (no splitting into parts) to also work with 512? No need to have a separate `if` for 512 when it's the same algo i think?
   
   but right now, i can't see us avoiding two algorithms: one with splitting (128-bit vectors, since there isn't a `ByteVector.SPECIES_32`), and another one without splitting (256/512-bit vectors).
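   
   To make the size arithmetic concrete, here is a small sketch of how the choice between the two algorithms (plus the scalar guard for anything narrower than 128 bits) could be expressed. `ByteKernelChoice`, `Strategy` and both methods are hypothetical names, and the mapping is one reading of this thread rather than the code in the branch; it deliberately avoids the incubator module so it runs on any JDK:
   
   ```java
   final class ByteKernelChoice {
     enum Strategy { SCALAR, SPLIT, DIRECT }
   
     // preferredBits: bit size of the preferred vector shape, e.g. 128, 256 or 512.
     static Strategy choose(int preferredBits) {
       if (preferredBits < 128) {
         return Strategy.SCALAR; // guard: halving would need a species narrower than 64 bits
       }
       // An int lane is four times as wide as a byte lane, so matching lane counts needs a
       // byte vector of preferredBits / 4 bits; the narrowest shape the API offers is 64 bits.
       return preferredBits / 4 >= 64 ? Strategy.DIRECT : Strategy.SPLIT;
     }
   
     // Bit size of the byte vector each strategy would load per iteration (0 for scalar).
     static int byteVectorBits(int preferredBits) {
       switch (choose(preferredBits)) {
         case DIRECT:
           return preferredBits / 4;
         case SPLIT:
           return preferredBits / 2;
         default:
           return 0;
       }
     }
   }
   ```
   
   With this mapping 256 and 512 both land on the same non-splitting path, and only 128 needs the split variant.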
   




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557981841

   Working my way thru all the vector similarity functions, I pushed an initial stab at the binary euclidean distance to https://github.com/rmuir/vectorbench
   Run it with `java -jar target/vectorbench.jar Square`
   
   It uses the same structure as what we did for binary dotproduct and can share the constants with it, if we go with this approach. These integer ones are quick to iterate on as everything is exact and benchmark is also a randomized test case :)
   
   Here's my skylake:
   ```
   Benchmark                                (size)   Mode  Cnt    Score    Error   Units
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5    6.331 ±  0.121  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5    1.548 ±  0.186  ops/us
   ```
   




[GitHub] [lucene] rmuir commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201912353


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   Not for the FP functions.





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559067381

   I added a test comparing both providers (it executes if the VectorUtil's provider is different from Lucene's).




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198140102


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtil.java:
##########
@@ -270,4 +216,134 @@ public static float dotProductScore(byte[] a, byte[] b) {
     float denom = (float) (a.length * (1 << 15));
     return 0.5f + dotProduct(a, b) / denom;
   }
+
+  interface VectorUtilProvider {
+
+    // just dot product for now
+    float dotProduct(float[] a, float[] b);
+  }
+
+  private static VectorUtilProvider lookupProvider() {
+    // TODO: add a check
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20) { // TODO: do we want JDK 19?
+      try {
+        ensureReadability();
+        final var lookup = MethodHandles.lookup();
+        final var cls = lookup.findClass("org.apache.lucene.util.JDKVectorUtilProvider");
+        // we use method handles, so we do not need to deal with setAccessible as we have private
+        // access through the lookup:
+        final var constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
+        try {
+          return (VectorUtilProvider) constr.invoke();
+        } catch (RuntimeException | Error e) {
+          throw e;
+        } catch (Throwable th) {
+          throw new AssertionError(th);
+        }
+      } catch (NoSuchMethodException | IllegalAccessException e) {
+        throw new LinkageError("JDKVectorUtilProvider is missing correctly typed constructor", e);
+      } catch (ClassNotFoundException cnfe) {
+        throw new LinkageError("JDKVectorUtilProvider is missing in Lucene JAR file", cnfe);
+      }
+    } else if (runtimeVersion >= 21) {
+      LOG.warning(
+          "You are running with Java 21 or later. To make full use of the Vector API, please update Apache Lucene.");
+    }
+    return new LuceneVectorUtilProvider();
+  }
+
+  // Extracted to a method to be able to apply the SuppressForbidden annotation
+  @SuppressWarnings("removal")
+  @SuppressForbidden(reason = "security manager")
+  private static <T> T doPrivileged(PrivilegedAction<T> action) {
+    return AccessController.doPrivileged(action);
+  }
+
+  static void ensureReadability() {
+    ModuleLayer.boot().modules().stream()
+        .filter(m -> m.getName().equals("jdk.incubator.vector"))
+        .findFirst()
+        .ifPresentOrElse(
+            vecMod -> VectorUtilProvider.class.getModule().addReads(vecMod),
+            () -> LOG.warning("vector incubator module not present"));
+  }
+
+  static {
+    PROVIDER =

Review Comment:
   I think we do not need any AccessController here. In MMapDirectory it is only there for the legacy code using sun.misc.Unsafe.
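
A minimal sketch of a lookup that uses only method handles and no `AccessController`. All names here (`DotProvider`, `ScalarDotProvider`, `ProviderLookupSketch`) are hypothetical and not the classes in the PR. Unlike the diff above, which throws a `LinkageError` when the class is missing, this sketch simply falls back to the scalar implementation; that is an assumption made to keep the example self-contained.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

/** Hypothetical provider interface for this sketch; not the Lucene one. */
interface DotProvider {
  float dotProduct(float[] a, float[] b);
}

/** Scalar fallback implementation. */
final class ScalarDotProvider implements DotProvider {
  ScalarDotProvider() {}

  @Override
  public float dotProduct(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }
}

final class ProviderLookupSketch {

  /** Tries to instantiate the named provider; falls back to the scalar one if it is absent. */
  static DotProvider lookup(String className) {
    try {
      MethodHandles.Lookup lookup = MethodHandles.lookup();
      Class<?> cls = lookup.findClass(className);
      // The lookup has access to non-public constructors in its own package,
      // so no setAccessible (and no doPrivileged) is involved.
      MethodHandle constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
      return (DotProvider) constr.invoke();
    } catch (ClassNotFoundException | NoSuchMethodException | IllegalAccessException e) {
      return new ScalarDotProvider(); // assumption: fall back instead of failing hard
    } catch (RuntimeException | Error e) {
      throw e;
    } catch (Throwable t) {
      throw new AssertionError(t);
    }
  }
}
```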





[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1553586455

   > Is vector's FMA also always slow (does it use BigDecimal, too?).
   
   I dunno what it does - I haven't looked - but I doubt it falls back to BD. I'll take a look and do some experiments.




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199250652


##########
lucene/core/src/java/org/apache/lucene/internal/vector/DefaultVectorUtilProvider.java:
##########
@@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.internal.vector;
+
+/**
+ * The default VectorUtil provider implementation.
+ *
+ * @lucene.internal
+ */
+public final class DefaultVectorUtilProvider implements VectorUtilProvider {

Review Comment:
   I put things back, with a little renaming. All should be package-private. [d3fe388](https://github.com/apache/lucene/pull/12311/commits/d3fe38815b4d918577704e5a5df06d904001ce7f)





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555906079

   > There's something strange on my computer when running core tests with Java 20:
   > 
   > ```
   > :lucene:core:test (SUCCESS): 5730 test(s), 193 skipped
   > The slowest tests (exceeding 500 ms) during this run:
   >   236.22s TestHnswFloatVectorGraph.testRamUsageEstimate (:lucene:core)
   > ```
   > 
   > With Java 17:
   > 
   > ```
   > :lucene:core:test (SUCCESS): 16 test(s)
   > The slowest tests (exceeding 500 ms) during this run:
   >    2.51s TestHnswFloatVectorGraph.testRamUsageEstimate (:lucene:core)
   >    0.98s TestHnswFloatVectorGraph.testSortedAndUnsortedIndicesReturnSameResults (:lucene:core)
   > The slowest suites (exceeding 1s) during this run:
   >    5.25s TestHnswFloatVectorGraph (:lucene:core)
   > ```
   > 
   > Uwe
   
   This does not always seem to be the case. Sometimes it is damn slow, sometimes very fast. Maybe it has to do with randomization?




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555962105

   Hi,
   
   > 1. I'm in the process of preparing a luceneutil run - still downloading on my Linux box. Then I'll try to get some comparative numbers from the benchmark - I'm not quite sure exactly what to do here, but I've run this before a while back so I'll figure it out.
   
   Be sure to use the large wikimedium file and tasks, otherwise the benchmark is pointless (that's what I figured out). Tune it so each single run takes at least a minute. You can add the `--add-modules jdk.incubator.vector` option in the config file where the JVM command lines are specified.
   
   I will try to run a benchmark on Policeman Jenkins (which has an AMD Ryzen CPU) and also post the results. I have some prepared stuff available. I still have no idea how to run the vector benchmarks.




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199144041


##########
lucene/core/src/java/org/apache/lucene/internal/vector/DefaultVectorUtilProvider.java:
##########
@@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.internal.vector;
+
+/**
+ * The default VectorUtil provider implementation.
+ *
+ * @lucene.internal
+ */
+public final class DefaultVectorUtilProvider implements VectorUtilProvider {

Review Comment:
   I was on the fence about this too. I’ll revert it - put it back in the original package and depend upon package-private access.





[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560683064

   Trivially, the JDK Vector source (at least in the comments) seems to describe what we expect, but this is not the behaviour that we observe.
   
   VectorShape.java
   ```
       static int getMaxVectorBitSize(Class<?> etype) {
           // VectorSupport.getMaxLaneCount may return -1 if C2 is not enabled,
           // or a value smaller than the S_64_BIT.vectorBitSize / elementSizeInBits if MaxVectorSize < 16
           // If so default to S_64_BIT
           int maxLaneCount = VectorSupport.getMaxLaneCount(etype);
           int elementSizeInBits = LaneType.of(etype).elementSize;
           return Math.max(maxLaneCount * elementSizeInBits, S_64_BIT.vectorBitSize);
       }
   ```
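
To make the clamp in that snippet concrete, here is a plain-int restatement. It is only a sketch: the two parameters stand in for the values the JDK obtains internally from `VectorSupport.getMaxLaneCount` and `LaneType`, and the class name is made up.

```java
/** Plain-int restatement of the quoted JDK logic; parameters replace the internal JDK calls. */
final class MaxVectorBitSizeSketch {
  static final int S_64_BIT = 64; // bit size of the smallest vector shape

  static int maxVectorBitSize(int maxLaneCount, int elementSizeInBits) {
    // A lane count of -1 (C2 disabled) or a very small one is clamped up to 64 bits.
    return Math.max(maxLaneCount * elementSizeInBits, S_64_BIT);
  }
}
```

With 32-bit elements, a lane count of -1 or 1 yields 64, while 4 lanes yield 128 and 8 lanes yield 256. So going by this code alone, the reported size should never drop below 64 bits.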




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560143502

   my issues with maxvectorsize/preferred were a jshell luser issue. it really works, so it's easy for me to simulate 64/128 bit vectors on my 256-bit machine.
   
   i am still fine with disabling 64-bit vectors, but i am not sure we need to do it. We could add `-XX:MaxVectorSize=8, -XX:MaxVectorSize=16, -XX:MaxVectorSize=32, ...` to jenkins randomization if we want to ensure correctness?
   
   the 64-bit vectors (via `-XX:MaxVectorSize=8`) don't cause trappy behavior for me. For our integer functions, 64-bit only vectors are already skipped by the check mentioned above (we don't code up an algorithm for them), so there's no performance regression but also no gain. 64-bit only vectors still give some speedup (e.g. 2x) for floats on intel. And I'm not worried about correctness issues given we have an easy way to test.
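
A quick sketch of the arithmetic behind those flag values, since `-XX:MaxVectorSize` is given in bytes. The class and method names are made up for illustration.

```java
/** Converts a -XX:MaxVectorSize value (in bytes) to vector bits and lane counts. */
final class VectorFlagSketch {

  static int vectorBits(int maxVectorSizeBytes) {
    return maxVectorSizeBytes * Byte.SIZE;
  }

  static int lanes(int maxVectorSizeBytes, int elementBits) {
    return vectorBits(maxVectorSizeBytes) / elementBits;
  }
}
```

For example, `MaxVectorSize=8` gives 64-bit vectors, which hold 2 floats or ints but 8 bytes; `MaxVectorSize=32` gives 256-bit vectors with 8 float lanes.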
   
   The bigger trap is what happens when you disable AVX: `-XX:UseAVX=0`. Supplying this flag gives incredibly trappy performance. IntVector.SPECIES_PREFERRED is still set to 128-bits, but there is no real vectorization for the integer functions and they run incredibly slow. floating point functions are still much faster, and work fine. I'm not sure what all UseAVX=0 is doing so maybe I need to spin up a QEMU that isn't configured well, "hiding" the AVX cpu flags, to see if it behaves like `-XX:UseAVX=0`. But this is definitely the kind of situation i'd like to avoid, and we don't detect it.
   
   So we are currently guarding against the wrong thing.
   
   




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561106972

   See commit above. Hopefully it is ok?




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1562381675

   Thank you @rmuir and @uschindler, this has been fun! 🚀 




[GitHub] [lucene] msokolov commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "msokolov (via GitHub)" <gi...@apache.org>.
msokolov commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559582073

   I'm trying to run luceneutil benchmarks but there are confounding factors - we spend a *lot* of time parsing the vector dictionary to create query-vectors. I think we need to fix this before we have any hope of measuring anything with that benchmark :( I trust KnnGraphTester more - it's more focused on the vector use case; like a mini-benchmark (not micro), so maybe I'll start with that ... it will create an index and exercise the lucene HNSW search API




[GitHub] [lucene] HoustonPutman commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "HoustonPutman (via GitHub)" <gi...@apache.org>.
HoustonPutman commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561571022

   Just ran the benchmarks on an m2 max machine, to see if it improved from the m1 performance (it didn't):
   
   java -jar target/vectorbench.jar -p size=1024
   ```
   Benchmark                                (size)   Mode  Cnt   Score   Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5   2.442 ± 0.017  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5   1.016 ± 0.359  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5   6.495 ± 0.269  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5   3.150 ± 0.120  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5   6.544 ± 0.504  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5   3.102 ± 0.088  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5   8.113 ± 0.168  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5   1.147 ± 0.030  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5  16.719 ± 0.287  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5   3.814 ± 0.198  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5  14.941 ± 0.529  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5   3.321 ± 0.068  ops/us
   ```
   
   java -jar target/vectorbench.jar -p size=1024  -t max
   ```
   Benchmark                                (size)   Mode  Cnt    Score    Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5   20.388 ±  3.046  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5   10.483 ±  1.104  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5   56.619 ±  4.635  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5   30.796 ±  0.797  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5   55.987 ±  3.258  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5   28.751 ±  2.661  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5   65.090 ±  6.555  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5   11.113 ±  1.230  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5  135.249 ±  8.125  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5   33.419 ±  1.708  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5  110.014 ±  7.691  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5   26.892 ± 15.677  ops/us
   ```
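
To read the tables as speedups, divide each New score by the matching Old score. A small sketch using the single-threaded numbers from the first table; the class name is made up and the scores are copied from above.

```java
import java.util.Locale;

/** Computes New/Old throughput ratios from the single-threaded table above. */
final class SpeedupSketch {

  static double speedup(double newOpsPerUs, double oldOpsPerUs) {
    return newOpsPerUs / oldOpsPerUs;
  }

  public static void main(String[] args) {
    String[] names = {
      "BinaryCosine", "BinaryDotProduct", "BinarySquare",
      "FloatCosine", "FloatDotProduct", "FloatSquare"
    };
    // {New, Old} scores in ops/us for size=1024
    double[][] scores = {
      {2.442, 1.016}, {6.495, 3.150}, {6.544, 3.102},
      {8.113, 1.147}, {16.719, 3.814}, {14.941, 3.321}
    };
    for (int i = 0; i < names.length; i++) {
      System.out.printf(
          Locale.ROOT, "%-18s %.2fx%n", names[i], speedup(scores[i][0], scores[i][1]));
    }
  }
}
```

On these numbers the byte variants come out at a bit over 2x and the float variants at more than 4x.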
   
   It'd be interesting to see how this/java plays with the Graviton3, which is ARMv8 with 2x256-bit SVE.
   
   Thanks for the amazing work here y'all!
   
   




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557241117

   i sped up the binary dotproduct some for the 128-bit case by doing a similar thing, using ByteVector.SPECIES_64.




[GitHub] [lucene] msokolov commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "msokolov (via GitHub)" <gi...@apache.org>.
msokolov commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557343456

   In luceneutil there is a python script called `vector-test.py` that you can use to run performance tests for vector search. It was a little messed up; I just pushed a change to make it use the 1M index and removed some other faceting tasks that weren't needed, so please check out the latest version before trying that.




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1558205450

   I pushed a float euclidean benchmark (FloatSquareBenchmark). same shape as the dotproduct float, no surprises:
   
   skylake:
   ```
   Benchmark                       (size)   Mode  Cnt  Score   Error   Units
   FloatSquareBenchmark.squareNew    1024  thrpt    5  9.709 ± 0.233  ops/us
   FloatSquareBenchmark.squareOld    1024  thrpt    5  1.415 ± 0.035  ops/us
   ```
   
   m1:
   ```
   Benchmark                       (size)   Mode  Cnt   Score   Error   Units
   FloatSquareBenchmark.squareNew    1024  thrpt    5  14.339 ± 0.004  ops/us
   FloatSquareBenchmark.squareOld    1024  thrpt    5   3.180 ± 0.011  ops/us
   ```




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201916237


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   Yes, of course. What would the expected delta be for such?  ( it would only make sense if it is quite small, right? )





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559226152

   > I ran this on the `luceneutil` nightly benchmark box (`beast3`):
   
   Does anybody have a real vector benchmark from the whole luceneutil test suite?
   




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559867708

   The bit sizes in the lengthy flame war on the mailing list were mostly powers of two (except maybe 1536). But we could maybe split into different chunk sizes? So to multiply size 768 vectors, use 512 for the first chunk and 256 for the second?
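
A minimal sketch of that splitting idea, assuming the size is simply decomposed into descending powers of two. The helper name is hypothetical and nothing like it is part of the PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits a vector size into descending power-of-two chunks, e.g. 768 = 512 + 256. */
final class ChunkSplitSketch {

  static List<Integer> powerOfTwoChunks(int size) {
    if (size < 0) {
      throw new IllegalArgumentException("size must be non-negative: " + size);
    }
    List<Integer> chunks = new ArrayList<>();
    int remaining = size;
    while (remaining > 0) {
      int chunk = Integer.highestOneBit(remaining); // largest power of two <= remaining
      chunks.add(chunk);
      remaining -= chunk;
    }
    return chunks;
  }
}
```

For 768 this yields 512 and 256, and for 1536 it yields 1024 and 512; a size that is already a power of two stays in one chunk.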




[GitHub] [lucene] ChrisHegarty closed pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty closed pull request #12311: Integrate the Incubating Panama Vector API 
URL: https://github.com/apache/lucene/pull/12311




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561257123

   Hi, thanks for the data. Does not look bad. We see an improvement.
   
   > These tests were run with [609fc9b](https://github.com/apache/lucene/commit/609fc9b63f61954a7408faa1669e807a6bbf1da9) so maybe a few commits back.
   
   That's all fine, the changes were only cosmetic and to dodge around bad CPU / misconfigured environments.




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561601759

   I am not sure how implemented SVE is, but would love to try it out since we don't have the 128-bit vector restriction.
   
   on the ARM with this branch, the floating point algorithms are fast, but the integer algorithms are slowish (only 2x speedup). it is not because of the ARM, it is because of the vector API. I saw the same results with 128-bit vectors on intel too.
   
   processing vector in "parts" is very slow (e.g. split bytevector into two shortvectors). It is so slow that it is actually faster to just ignore the second "part", and only process half of the bytevector in each loop iteration, doing overlapping reads and re-reading the second part in the next! (No i don't want to change the code to do this).
   
   I still wanted to try just "casting" byte to short my own self without using the jdk type conversion support. just means applying a shuffle and AND'ing with a mask. ugly as hell but probably speeds up that 128-bit stuff on ARM. have not tried yet.




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1203593016


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java:
##########
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.security.AccessController;
+import java.security.PrivilegedAction;
+import java.util.Locale;
+import java.util.Objects;
+import java.util.logging.Logger;
+
+/** A provider of VectorUtil implementations. */
+interface VectorUtilProvider {
+
+  /** Calculates the dot product of the given float arrays. */
+  float dotProduct(float[] a, float[] b);
+
+  /** Returns the cosine similarity between the two vectors. */
+  float cosine(float[] v1, float[] v2);
+
+  /** Returns the sum of squared differences of the two vectors. */
+  float squareDistance(float[] a, float[] b);
+
+  /** Returns the dot product computed over signed bytes. */
+  int dotProduct(byte[] a, byte[] b);
+
+  /** Returns the cosine similarity between the two byte vectors. */
+  float cosine(byte[] a, byte[] b);
+
+  /** Returns the sum of squared differences of the two byte vectors. */
+  int squareDistance(byte[] a, byte[] b);
+
+  // -- provider lookup mechanism
+
+  static final Logger LOG = Logger.getLogger(VectorUtilProvider.class.getName());
+
+  static VectorUtilProvider lookup() {
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20) {
+      // is locale sane (only buggy in Java 20)
+      if (runtimeVersion <= 20 && !hasWorkingDefaultLocale()) {
+        LOG.warning(
+            "Java runtime is using a buggy default locale; Java vector incubator API can't be enabled: "
+                + Locale.getDefault());
+        return new VectorUtilDefaultProvider();
+      }
+      // is the incubator module present and readable (JVM providers may to exclude them or it is
+      // build with jlink)
+      if (!vectorModulePresentAndReadable()) {
+        LOG.warning(
+            "Java vector incubator module is not readable. For optimal vector performance, pass '--add-modules jdk.incubator.vector' to enable Vector API.");
+        return new VectorUtilDefaultProvider();
+      }
+      if (isClientVM()) {
+        LOG.warning("C2 compiler is disabled; Java vector incubator API can't be enabled");
+        return new VectorUtilDefaultProvider();
+      }
+      try {
+        // we use method handles with lookup, so we do not need to deal with setAccessible as we
+        // have private access through the lookup:
+        final var lookup = MethodHandles.lookup();
+        final var cls = lookup.findClass("org.apache.lucene.util.VectorUtilPanamaProvider");
+        final var constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
+        try {
+          return (VectorUtilProvider) constr.invoke();
+        } catch (UnsupportedOperationException uoe) {
+          // not supported because preferred vector size too small or similar
+          LOG.warning("Java vector incubator API was not enabled. " + uoe.getMessage());
+          return new VectorUtilDefaultProvider();
+        } catch (RuntimeException | Error e) {
+          throw e;
+        } catch (Throwable th) {
+          throw new AssertionError(th);
+        }
+      } catch (NoSuchMethodException | IllegalAccessException e) {
+        throw new LinkageError(
+            "VectorUtilPanamaProvider is missing correctly typed constructor", e);
+      } catch (ClassNotFoundException cnfe) {
+        throw new LinkageError("VectorUtilPanamaProvider is missing in Lucene JAR file", cnfe);
+      }
+    } else if (runtimeVersion >= 21) {
+      LOG.warning(
+          "You are running with Java 21 or later. To make full use of the Vector API, please update Apache Lucene.");
+    }
+    return new VectorUtilDefaultProvider();
+  }
+
+  private static boolean vectorModulePresentAndReadable() {
+    var opt =
+        ModuleLayer.boot().modules().stream()
+            .filter(m -> m.getName().equals("jdk.incubator.vector"))
+            .findFirst();
+    if (opt.isPresent()) {
+      VectorUtilProvider.class.getModule().addReads(opt.get());
+      return true;
+    }
+    return false;
+  }
+
+  // Workaround for JDK-8301190, avoids assertion when default locale is say tr.
+  private static boolean hasWorkingDefaultLocale() {
+    return Runtime.version().update() > 1

Review Comment:
   We also have a runtime version check above, at the place where this is called. This check should be moved there, especially to make it ready for 21.
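
One way to read this suggestion is to keep the version gate in a single place, separate from the locale test. The sketch below assumes, as stated elsewhere in this thread, that the bug exists only in Java 20 and is fixed in 20.0.2. The class and method names are hypothetical and not taken from the PR.

```java
final class LocaleBugGateSketch {

  /** True only for Java 20 builds older than 20.0.2, where JDK-8301190 is assumed present. */
  static boolean affectedByLocaleBug(Runtime.Version version) {
    return version.feature() == 20 && version.update() < 2;
  }

  public static void main(String[] args) {
    // The caller would combine this with its own locale check before falling back.
    System.out.println(affectedByLocaleBug(Runtime.version()));
  }
}
```

With the gate expressed this way, enabling a later feature release would not require touching the locale workaround at all.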
   





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560857595

   So we should then hold back merging this issue and instead open bugs and wait for OpenJDK to fix it. Sorry, I won't implement such spaghetti bullshit that is likely to break on alternative VMs like OpenJ9 or GraalVM.




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560997743

   Another way to dodge this bug would be to support only a minimum of 256-bit vectors for the 3 integer functions. That would be a little sad for the ARM chips, but the 128-bit implementation there only gives a 2x speedup anyway.




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1553618970

   I really wish Math.fma fell back to sane behavior such as */+ and only StrictMath.fma did the slow BigDecimal stuff! Not good decision-making here on these APIs.
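
As a rough illustration of the two behaviors being compared, the sketch below computes a dot product either with `Math.fma` (one rounding per step) or with a plain multiply followed by an add (two roundings per step). The `useFma` switch and the class name are hypothetical; the PR does not show such a flag.

```java
final class FmaFallbackSketch {

  /** Dot product that either fuses multiply and add, or uses plain multiply then add. */
  static float dotProduct(float[] a, float[] b, boolean useFma) {
    float acc = 0f;
    for (int i = 0; i < a.length; i++) {
      acc = useFma
          ? Math.fma(a[i], b[i], acc) // single rounding per step
          : a[i] * b[i] + acc;        // two roundings per step
    }
    return acc;
  }

  public static void main(String[] args) {
    float[] a = {1f, 2f, 3f};
    float[] b = {4f, 5f, 6f};
    System.out.println(dotProduct(a, b, true));  // prints 32.0
    System.out.println(dotProduct(a, b, false)); // prints 32.0
  }
}
```

For small exact inputs the two paths agree. They can differ in the last bits for general inputs, and the fused path is only cheap where the hardware supports it.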




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1553665376

   > > I'd prefer to have separate apijars, because the current code compiles with patching base module.
   > > On the other hand: it just works! 😉
   > 
   > Yeah, this is a bit of a hack! It would be better to separate these out, but then what would we do: still patch it into java.base, or build a slim module around it, or something? It doesn't feel any better than just patching it all into java.base!
   
   Let's keep it as is. The whole compile is a hack, modules do not matter.
   
   With a separate apijar we could also add it to the classpath, as the package names are not special at all.




[GitHub] [lucene] uschindler commented on pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1553373978

   On the other hand: it just works! 😉




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198252297


##########
lucene/core/src/java20/org/apache/lucene/util/JDKVectorUtilProvider.java:
##########
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import jdk.incubator.vector.FloatVector;
+import jdk.incubator.vector.VectorOperators;
+import jdk.incubator.vector.VectorSpecies;
+
+public final class JDKVectorUtilProvider implements VectorUtil.VectorUtilProvider {

Review Comment:
   Done.





[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198253248


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtil.java:
##########
@@ -270,4 +216,134 @@ public static float dotProductScore(byte[] a, byte[] b) {
     float denom = (float) (a.length * (1 << 15));
     return 0.5f + dotProduct(a, b) / denom;
   }
+
+  interface VectorUtilProvider {
+
+    // just dot product for now
+    float dotProduct(float[] a, float[] b);
+  }
+
+  private static VectorUtilProvider lookupProvider() {
+    // TODO: add a check
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20) { // TODO: do we want JDK 19?
+      try {
+        ensureReadability();

Review Comment:
   I refactored the code so that the presence of the module is enough to enable this. 





[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1554440332

   I refactored the provider and impls:
    1. To separate them out from VectorUtil. This should improve readability, etc., as we move beyond dotProduct.
    2. I also moved them into their own non-exported package.
   
   I'm less sure about no. 2. The general thought was that the code might be more reusable from there, but now that I think about it, it might be better as package-private where it was, since the "interface" is through VectorUtil, not directly to the impl. Thoughts?
     




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198869676


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,10 +119,13 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
+
+      // Disable assertions to workaround JDK-8301190
+      jvmArgs '-da:jdk.incubator.vector.LaneType'

Review Comment:
   Argh! We're running into a JDK bug, fixed in the not-yet-released JDK 20.0.2! :-( The JDK bug is https://bugs.openjdk.org/browse/JDK-8301190: an incorrect assertion that fires when the default locale is, say, tr.
   
   Reproduce with:
   `./gradlew :lucene:core:test --tests "org.apache.lucene.util.TestVectorUtil" -Dtests.locale=tr-TR`
   
   I've just disabled assertions in this one JDK class to work around the issue.
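
For readers unfamiliar with why a Turkish default locale matters: case mapping in Java is locale-sensitive, and in Turkish a lowercase `i` upper-cases to the dotted capital `İ` (U+0130) instead of ASCII `I`. The sketch below only demonstrates that behavior, which appears to be the kind of mismatch the JDK assertion trips over. The class and method names are hypothetical.

```java
import java.util.Locale;

final class TurkishLocaleSketch {

  /** True if the locale upper-cases ASCII 'i' to ASCII 'I', as most code silently assumes. */
  static boolean upperCasesAsciiAsExpected(Locale locale) {
    return "I".equals("i".toUpperCase(locale));
  }

  public static void main(String[] args) {
    System.out.println(upperCasesAsciiAsExpected(Locale.ROOT));                    // prints true
    System.out.println(upperCasesAsciiAsExpected(Locale.forLanguageTag("tr-TR"))); // prints false
  }
}
```

Code that calls `toUpperCase()` without an explicit locale inherits the default one, so the same program can behave differently under `-Dtests.locale=tr-TR`.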





[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1556288076

   >  we were being inefficient.
   
   If I understand this correctly, the inefficiency was too many reduceLanes calls, right? You replaced them with an addition of the accumulators before reducing. Sounds good.
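
The idea can be modeled without the incubator module. In the scalar sketch below, each `float[]` of length `LANES` stands in for one vector accumulator, and summing its elements stands in for a horizontal reduction. The two accumulators are added lane-wise first, so only one reduction is needed at the end. `LANES`, the class name, and the two-way unrolling are assumptions for illustration, not the PR's actual code.

```java
final class AccumulatorSketch {

  // Hypothetical lane count; a real vector species would determine this.
  static final int LANES = 4;

  static float dotProduct(float[] a, float[] b) {
    float[] acc1 = new float[LANES]; // stands in for one vector accumulator
    float[] acc2 = new float[LANES]; // second, independent accumulator
    int i = 0;
    int bound = a.length - (a.length % (2 * LANES));
    for (; i < bound; i += 2 * LANES) {
      for (int lane = 0; lane < LANES; lane++) {
        acc1[lane] += a[i + lane] * b[i + lane];
        acc2[lane] += a[i + LANES + lane] * b[i + LANES + lane];
      }
    }
    // Add the accumulators lane-wise first, then do one horizontal reduction.
    float sum = 0f;
    for (int lane = 0; lane < LANES; lane++) {
      sum += acc1[lane] + acc2[lane];
    }
    // Scalar tail for the elements that did not fill a whole unrolled step.
    for (; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }
}
```

Reducing each accumulator separately would give the same result for these inputs but would perform one horizontal reduction per accumulator instead of one in total.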




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1556287294

   Thanks, glad it fixes the problem. I am running it across all the sizes we test to see how it looks on both my machines.




[GitHub] [lucene] benwtrent commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "benwtrent (via GitHub)" <gi...@apache.org>.
benwtrent commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1584910093

   Thanks @uschindler for the work here! I am ignorant of the security manager and all its woes. Spent the last 2 days trying to get around this!




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198170915


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtil.java:
##########
@@ -270,4 +216,134 @@ public static float dotProductScore(byte[] a, byte[] b) {
     float denom = (float) (a.length * (1 << 15));
     return 0.5f + dotProduct(a, b) / denom;
   }
+
+  interface VectorUtilProvider {
+
+    // just dot product for now
+    float dotProduct(float[] a, float[] b);
+  }
+
+  private static VectorUtilProvider lookupProvider() {
+    // TODO: add a check
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20) { // TODO: do we want JDK 19?
+      try {
+        ensureReadability();
+        final var lookup = MethodHandles.lookup();
+        final var cls = lookup.findClass("org.apache.lucene.util.JDKVectorUtilProvider");
+        // we use method handles, so we do not need to deal with setAccessible as we have private
+        // access through the lookup:
+        final var constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
+        try {
+          return (VectorUtilProvider) constr.invoke();
+        } catch (RuntimeException | Error e) {
+          throw e;
+        } catch (Throwable th) {
+          throw new AssertionError(th);
+        }
+      } catch (NoSuchMethodException | IllegalAccessException e) {
+        throw new LinkageError("JDKVectorUtilProvider is missing correctly typed constructor", e);
+      } catch (ClassNotFoundException cnfe) {
+        throw new LinkageError("JDKVectorUtilProvider is missing in Lucene JAR file", cnfe);
+      }
+    } else if (runtimeVersion >= 21) {
+      LOG.warning(
+          "You are running with Java 21 or later. To make full use of the Vector API, please update Apache Lucene.");
+    }
+    return new LuceneVectorUtilProvider();
+  }
+
+  // Extracted to a method to be able to apply the SuppressForbidden annotation
+  @SuppressWarnings("removal")
+  @SuppressForbidden(reason = "security manager")
+  private static <T> T doPrivileged(PrivilegedAction<T> action) {
+    return AccessController.doPrivileged(action);
+  }
+
+  static void ensureReadability() {
+    ModuleLayer.boot().modules().stream()
+        .filter(m -> m.getName().equals("jdk.incubator.vector"))
+        .findFirst()
+        .ifPresentOrElse(
+            vecMod -> VectorUtilProvider.class.getModule().addReads(vecMod),
+            () -> LOG.warning("vector incubator module not present"));
+  }
+
+  static {
+    PROVIDER =

Review Comment:
   Ah, you try to make the module readable. Then we need to document the permissions and add a fallback by catching security/access exceptions.
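
A minimal sketch of such a fallback, under stated assumptions: the helper below looks the module up in the boot layer, adds a read edge from its own module, and reports `false` instead of propagating a failure. Which exception types can actually occur under a restrictive security policy is an assumption here, and the class and method names are hypothetical.

```java
final class ModuleLookupSketch {

  /** Returns true if the named boot-layer module exists and was made readable. */
  static boolean ensureReadable(String moduleName) {
    try {
      var module = ModuleLayer.boot().findModule(moduleName);
      if (module.isEmpty()) {
        return false; // module not resolved, e.g. --add-modules was not passed
      }
      // No-op when this class lives in the unnamed module (classpath).
      ModuleLookupSketch.class.getModule().addReads(module.get());
      return true;
    } catch (SecurityException e) {
      // Assumed failure mode: treat denied access as "not available".
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(ensureReadable("java.base"));
    System.out.println(ensureReadable("jdk.incubator.vector"));
  }
}
```

A caller would use the boolean result to choose between the vectorized provider and the default scalar one.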





[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199048935


##########
lucene/core/src/java/org/apache/lucene/internal/vector/DefaultVectorUtilProvider.java:
##########
@@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.internal.vector;
+
+/**
+ * The default VectorUtil provider implementation.
+ *
+ * @lucene.internal
+ */
+public final class DefaultVectorUtilProvider implements VectorUtilProvider {

Review Comment:
   IMHO, this one should be package-private like the Java 20 one.
   
   In general, I am not really happy with the additional package. I know you are a fan of the module system, but most users still don't use it, and our Javadocs also show all packages (including internal ones).
   
   Is it really necessary to have the vector stuff in a separate package? For MMapDirectory I made sure to hide everything, including the provider, as it is not public API.





[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199602226


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,10 +119,10 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
-      jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
+      jvmArgs '--add-modules', 'jdk.unsupported,jdk.management,jdk.incubator.vector'

Review Comment:
   Something like this:
   
   ```
   if (rootProject.runtimeJavaVersion == JavaVersion.VERSION_20) {
     jvmArgs '--add-modules', 'jdk.incubator.vector'
   }
   ```





[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199625873


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java:
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.util.Locale;
+import java.util.Objects;
+import java.util.logging.Logger;
+
+/**
+ * A provider of VectorUtil implementations.
+ *
+ * @lucene.internal
+ */
+interface VectorUtilProvider {
+
+  /** Calculates the dot product of the given float arrays. */
+  float dotProduct(float[] a, float[] b);
+
+  // -- provider lookup mechanism
+
+  Logger LOG = Logger.getLogger(VectorUtilProvider.class.getName());
+
+  static VectorUtilProvider lookup() {
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20 && useVectorAPI() && vectorModulePresentAndReadable()) {

Review Comment:
   If `vectorModulePresentAndReadable()` returns false, we should print a warning telling the user to add the `--add-modules` option. Initially this was there, but now it is missing.





[GitHub] [lucene] uschindler commented on pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1553366480

   I'd prefer to have separate apijars, because the current code compiles by patching the base module.
   
   I'd like to separate this. But as a start it is OK.




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199008182


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -122,7 +122,7 @@ allprojects {
       
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
-      jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
+      jvmArgs '--add-modules', 'jdk.unsupported,jdk.management,jdk.incubator.vector'

Review Comment:
   I don't think we need to explicitly test both. If the build runs with Java 17 and no runtime of 20 is given, it uses the legacy code.
   The same is true for mmapdir. There it is tested by Jenkins based on the JDK version used.





[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198188544


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtil.java:
##########
@@ -270,4 +216,134 @@ public static float dotProductScore(byte[] a, byte[] b) {
     float denom = (float) (a.length * (1 << 15));
     return 0.5f + dotProduct(a, b) / denom;
   }
+
+  interface VectorUtilProvider {
+
+    // just dot product for now
+    float dotProduct(float[] a, float[] b);
+  }
+
+  private static VectorUtilProvider lookupProvider() {
+    // TODO: add a check
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20) { // TODO: do we want JDK 19?
+      try {
+        ensureReadability();

Review Comment:
   Yeah, it’s probably better to condition all this on the presence of the module. Which will avoid needing a property or similar to enable/disable.





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1584866684

   I left a comment in ES.
   
   We could wrap the provider initialization with doPrivileged in Lucene, similar to MmapDirectory. Maybe that way we don't run into the logging issue from MmapDirectory.
   
   But actually, the logging problem is not under Lucene's control. ES should initialize all logging early on startup, including the jul logger wrapper. There are other places that log outside of doPrivileged.
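
A minimal sketch of what wrapping an initialization step in `doPrivileged` could look like. The class `PrivilegedInit`, its nested `Provider` interface, and the property that is read are hypothetical and not taken from the PR. `AccessController` has been deprecated for removal since JDK 17, hence the suppression.

```java
import java.security.AccessController;
import java.security.PrivilegedAction;

// Hypothetical example; not the Lucene or Elasticsearch code.
final class PrivilegedInit {
  interface Provider {
    String name();
  }

  /** Runs the initialization inside a privileged block so callers need no extra permissions. */
  @SuppressWarnings("removal") // AccessController is deprecated for removal since JDK 17
  static Provider initProvider() {
    return AccessController.doPrivileged(
        (PrivilegedAction<Provider>)
            () -> {
              // Reading system properties is the kind of call that needs a permission
              // when a SecurityManager is installed.
              String vendor = System.getProperty("java.vm.vendor", "unknown");
              return () -> "scalar (vm vendor: " + vendor + ")";
            });
  }

  public static void main(String[] args) {
    System.out.println(initProvider().name());
  }
}
```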




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1558674081

   Latest benchmark results.
   
   intel rocketlake (512-bit vectors):
   
   ```
   Benchmark                                (size)   Mode  Cnt   Score   Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5  10.447 ± 0.016  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5   1.059 ± 0.004  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5  20.998 ± 0.012  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5   3.216 ± 0.074  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5  16.426 ± 0.199  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5   2.384 ± 0.018  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5   8.802 ± 0.004  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5   0.848 ± 0.002  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5  25.904 ± 0.035  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5   3.479 ± 0.010  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5  18.968 ± 0.190  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5   2.575 ± 0.007  ops/us
   ```
   
   mac m1 arm (128-bit vectors):   ( for sanity with Robert's results )
   
   ```
   Benchmark                                (size)   Mode  Cnt   Score   Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5   2.188 ± 0.027  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5   1.035 ± 0.021  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5   6.086 ± 0.019  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5   3.041 ± 0.149  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5   5.912 ± 0.023  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5   3.061 ± 0.085  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5   7.420 ± 0.472  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5   1.052 ± 0.037  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5  15.489 ± 0.469  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5   3.643 ± 0.428  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5  13.714 ± 1.015  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5   3.074 ± 0.118  ops/us
   ```




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1556224675

   i made the benchmarks easier to run with something like this:
   ```
   git clone https://github.com/rmuir/vectorbench
   cd vectorbench
   mvn verify
   java -jar target/vectorbench.jar
   ```
   
   I can confirm everything works on aarch64 and i am experimenting with the unrolling. definitely if we remove the unrolling it gets way slower.
   
   so i tried to unroll again (4x instead of 2x), it is only a slight improvement in performance on my skylake:
   ```
   Benchmark                             (size)   Mode  Cnt   Score   Error   Units
   DotProductBenchmark.dotProductNew       1024  thrpt    5   9.997 ± 0.999  ops/us
   DotProductBenchmark.dotProductNewNew    1024  thrpt    5  11.285 ± 0.161  ops/us
   DotProductBenchmark.dotProductOld       1024  thrpt    5   2.024 ± 0.028  ops/us
   ```
   
   But on the aarch64 mac, additional unrolling basically doubles throughput again (7.785 -> 14.912). I am experimenting more with this.
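
A scalar illustration of the unrolling discussed here, assuming a plain `float[]` dot product. The benchmarked code works on Panama `FloatVector` lanes; this sketch only shows the technique of using several independent accumulators, and the class and method names are hypothetical.

```java
// Scalar sketch of loop unrolling; not the vectorized code from the PR or from vectorbench.
final class UnrolledDotProduct {

  /** Plain loop with one accumulator: every addition depends on the previous one. */
  static float simple(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /** 4x unrolled loop: four independent accumulators shorten the dependency chain. */
  static float unrolled4(float[] a, float[] b) {
    float acc0 = 0f, acc1 = 0f, acc2 = 0f, acc3 = 0f;
    int bound = a.length & ~3; // largest multiple of 4 not exceeding the length
    int i = 0;
    for (; i < bound; i += 4) {
      acc0 += a[i] * b[i];
      acc1 += a[i + 1] * b[i + 1];
      acc2 += a[i + 2] * b[i + 2];
      acc3 += a[i + 3] * b[i + 3];
    }
    float sum = (acc0 + acc1) + (acc2 + acc3);
    for (; i < a.length; i++) { // scalar tail for the remaining 0..3 elements
      sum += a[i] * b[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    float[] a = {1f, 2f, 3f, 4f, 5f, 6f, 7f};
    float[] b = {1f, 1f, 1f, 1f, 1f, 1f, 1f};
    System.out.println(simple(a, b) + " " + unrolled4(a, b));
  }
}
```

Because the unrolled version adds the terms in a different order, the two results can differ in the last bits for general float inputs.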




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1556278132

   I didn't get anywhere with luceneutil yet! :-(   (I haven't been able to run it successfully; I'm getting OOM errors.)




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557153092

   Hi,
   
   > With 256 bit vectors it is fast using ByteVector.SPECIES_64, ShortVector.SPECIES_128, and IntVector.SPECIES_256 But for ARM which only has 128-bit vectors, the generic code using only "SPECIES_PREFERRED" isn't as fast as it should be: almost 2x but not 4x like on avx-256.
   
   Maybe because it needs twice as many iterations, all creating tons of instances on the heap (until escape analysis kicks in?).
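
The "twice as many iterations" point can be made concrete with a little arithmetic: halving the vector width halves the lanes per vector and doubles the number of loop iterations. A minimal sketch, assuming 1024-element inputs as in the benchmarks; the class `LaneMath` and its methods are hypothetical.

```java
// Arithmetic sketch only; it does not use the Vector API.
final class LaneMath {

  /** Number of elements of the given bit size that fit in one vector register. */
  static int lanes(int vectorBits, int elementBits) {
    return vectorBits / elementBits;
  }

  /** Full vector iterations needed to cover dims elements (any tail is handled separately). */
  static int fullIterations(int dims, int vectorBits, int elementBits) {
    return dims / lanes(vectorBits, elementBits);
  }

  public static void main(String[] args) {
    int dims = 1024;
    for (int bits : new int[] {128, 256, 512}) {
      System.out.println(
          bits + "-bit: " + lanes(bits, Float.SIZE) + " float lanes, "
              + fullIterations(dims, bits, Float.SIZE) + " iterations");
    }
  }
}
```

For 1024 floats this gives 256 iterations at 128 bits, 128 at 256 bits, and 64 at 512 bits.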




[GitHub] [lucene] msokolov commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "msokolov (via GitHub)" <gi...@apache.org>.
msokolov commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1563317504

   I'm seeing very strange results after testing with 768-dim vectors. 
   
   ```
   Task               QPS baseline  StdDev  QPS candidate  StdDev  Pct diff                p-value
   HighTermVector           139.27  (6.7%)          57.21  (0.8%)  -58.9% ( -62% -  -55%)    0.000
   AndHighHighVector        139.15  (6.4%)          57.15  (0.7%)  -58.9% ( -62% -  -55%)    0.000
   LowTermVector            138.54  (6.2%)          57.00  (0.8%)  -58.9% ( -62% -  -55%)    0.000
   MedTermVector            138.67  (6.9%)          57.23  (0.9%)  -58.7% ( -62% -  -54%)    0.000
   AndHighMedVector         137.95  (6.1%)          57.10  (0.8%)  -58.6% ( -61% -  -55%)    0.000
   AndHighLowVector         137.86  (6.4%)          57.21  (0.8%)  -58.5% ( -61% -  -54%)    0.000
   PKLookup                 199.30  (2.3%)         198.44  (2.5%)   -0.4% (  -5% -    4%)    0.565
   ```
   I have double-checked that candidate has 609fc9b63f61954a7408faa1669e807a6bbf1da9 and baseline is c9c49bc5539d83979360e65b39a536c2d452ba2a. Both conditions are run using the same index, tasks, etc. The JFR output is sort of a head-scratcher too. The profiles clearly show that the JDK vector API is being called (I think?) but also show a lot of noisy things (`Objects.requireNonNull`). Is `jdk.incubator.vector.FloatVector#reduceLanesTemplate()` really what we want to see there? Another mystery is why `org.apache.lucene.util.hnsw.HnswGraphSearcher#searchLevel()` would have more samples in the candidate, given that it has nothing to do with the vector API. When looking at these, you should ignore things related to `VectorDictionary`: these load a word->vector dictionary that is used to look up query vectors; the loading takes a while but happens before the index is opened and the queries are executed.
   
   ## candidate
   
   ```
   PERCENT       CPU SAMPLES   STACK                                                                                                                                             
   9.00%         2721          jdk.incubator.vector.FloatVector#reduceLanesTemplate()                                                                                            
   5.44%         1645          org.apache.lucene.util.hnsw.HnswGraphSearcher#searchLevel()                                                                                       
   3.89%         1177          org.apache.lucene.util.packed.DirectReader$DirectPackedReader12#get()                                                                             
   3.03%         917           org.apache.lucene.util.packed.DirectReader$DirectPackedReader16#get()                                                                             
   2.99%         904           java.lang.invoke.VarHandleGuards#guard_LJ_I()                                                                                                     
   2.70%         816           jdk.internal.misc.Unsafe#copyMemoryChecks()                                                                                                       
   2.52%         761           perf.VectorDictionary#vectorDiv()                                                                                                                 
   2.45%         741           org.apache.lucene.store.DataInput#readVInt()                                                                                                      
   2.43%         734           org.apache.lucene.util.packed.DirectMonotonicReader#get()                                                                                         
   2.17%         655           java.util.Objects#requireNonNull()                                                                                                                
   1.92%         580           jdk.jfr.internal.JVM#emitEvent()                                                                                                                  
   1.70%         513           java.util.Arrays#binarySearch0()                                                                                                                  
   1.40%         423           org.apache.lucene.codecs.lucene95.Lucene95HnswVectorsReader$OffHeapHnswGraph#seek()                                                               
   1.35%         407           java.util.HashMap#resize()                                                                                                                        
   1.29%         391           org.apache.lucene.codecs.lucene95.OffHeapFloatVectorValues#vectorValue()                                                                          
   1.29%         389           sun.nio.ch.UnixFileDispatcherImpl#unmap0()                                                                                                        
   1.27%         385           org.apache.lucene.util.VectorUtilPanamaProvider#dotProduct()                                                                                      
   1.26%         380           jdk.internal.foreign.AbstractMemorySegmentImpl#checkBounds()                                                                                      
   1.23%         373           org.apache.lucene.util.LongHeap#downHeap()                                                                                                        
   1.22%         370           jdk.incubator.vector.FloatVector#fromArray0Template()                                                                                             
   1.19%         361           java.util.zip.Inflater#inflateBytesBytes()                                                                                                        
   1.15%         348           org.apache.lucene.util.hnsw.NeighborQueue#decodeScore()                                                                                           
   1.13%         341           org.apache.lucene.util.SparseFixedBitSet#getAndSet()                                                                                              
   1.12%         338           org.apache.lucene.util.SparseFixedBitSet#insertLong()                                                                                             
   1.02%         308           jdk.internal.misc.Unsafe#copyMemory()                                                                                                             
   0.96%         289           jdk.internal.foreign.AbstractMemorySegmentImpl#copy()                                                                                             
   0.95%         286           perf.VectorDictionary#<init>()                                                                                                                    
   0.94%         283           java.util.Objects#checkIndex()                                                                                                                    
   0.88%         267           jdk.internal.foreign.AbstractMemorySegmentImpl#getBaseAndScale()                                                                                  
   0.85%         257           org.apache.lucene.store.MemorySegmentIndexInput#readByte()
   ```
   
   ## baseline
   
   ```
   PERCENT       CPU SAMPLES   STACK
   24.69%        7386          perf.VectorDictionary#vectorDiv()
   20.49%        6127          org.apache.lucene.util.VectorUtil#dotProduct()
   3.42%         1022          java.nio.FloatBuffer#getArray()
   2.65%         792           org.apache.lucene.util.hnsw.HnswGraphSearcher#searchLevel()
   2.50%         748           perf.VectorDictionary#<init>()
   2.34%         699           jdk.internal.misc.Unsafe#checkPrimitivePointer()
   1.95%         583           jdk.jfr.internal.JVM#emitEvent()
   1.62%         485           java.util.HashMap#getNode()
   1.61%         481           org.apache.lucene.util.SparseFixedBitSet#insertLong()
   1.50%         450           java.nio.Buffer#position()
   1.32%         395           java.util.HashMap#resize()
   1.03%         309           java.io.BufferedReader#readLine()
   0.98%         292           jdk.internal.misc.Unsafe#checkOffset()
   0.91%         272           perf.PKLookupTask#go()
   0.90%         270           java.util.zip.Inflater#inflateBytesBytes()
   0.85%         253           java.util.HashMap#containsKey()
   0.79%         236           org.apache.lucene.util.LongHeap#downHeap()
   0.67%         200           org.apache.lucene.codecs.lucene90.blocktree.SegmentTermsEnum#seekExact()
   0.65%         194           sun.nio.ch.FileChannelImpl#unmap0()
   0.57%         170           org.apache.lucene.util.SparseFixedBitSet#getAndSet()
   0.53%         159           jdk.internal.util.ArraysSupport#mismatch()
   0.51%         154           org.apache.lucene.util.hnsw.HnswGraphSearcher#graphNextNeighbor()
   0.48%         145           sun.nio.cs.UTF_8$Decoder#decodeArrayLoop()
   0.48%         143           java.io.FileOutputStream#writeBytes()
   0.46%         137           org.apache.lucene.store.ByteBufferGuard#getByte()
   0.44%         131           org.apache.lucene.codecs.lucene95.Lucene95HnswVectorsReader$OffHeapHnswGraph#seek()
   0.41%         123           java.nio.Buffer#checkIndex()
   0.40%         120           org.apache.lucene.store.DataInput#readVInt()
   0.38%         113           perf.VectorDictionary#vectorNorm()
   0.35%         104           org.apache.lucene.util.LongHeap#upHeap()
   ```
   
   
   




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199608565


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,10 +119,10 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
-      jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
+      jvmArgs '--add-modules', 'jdk.unsupported,jdk.management,jdk.incubator.vector'

Review Comment:
   Fixed this.





[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555915665

   Thanks for re-benchmarking @ChrisHegarty ! It has been a few years and an older JDK version since this stuff was developed. I had in mind to do a couple more runs eventually:
   * try it on aarch64 mac, i think it has no SVE but at least has NEON. at least make sure perf isn't horrible or something.
   * see if the manual loop unrolling is still necessary. it is a bit ugly/surprising we have to do this for performance.
   
   it is great you have AVX-512, I don't have it. I was worried about using `SPECIES_PREFERRED` and clock throttling, but I think lucene doesn't need to do anything about this? A user can pass `-XX:UseAVX=2` to the JVM if they want to avoid AVX-512 for that reason... i think




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561414222

   Should we now start to prepare this for inclusion in main and 9.x?
   
   With Java 21 MMAP I could then also start to extract apijars (it needs a newer ASM version, but that's done already).
   
   We should maybe first think again about the API design:
   - For now we should keep the internals private (that's done here: the javadocs do not offer any new public classes; VectorUtil looks as before).
   - Are we open to extending this beyond VectorUtil? Of course this only works for code that can be implemented in isolation from the Lucene implementations (an isolated implementation of an algorithm). But would it be possible to also call into this code from postings, like packed reader instances or similar? If we can't do this, is it possible to have a common interface and let VectorUtilProvider return an instance (the conventional one vs. the Panama one) of PackedReader implementations? In that case we should move the vectorizable classes, hide them from users, and add factories in VectorUtil (like VectorUtil#getPackedXYReader) that allow requesting an implementation of an algorithm.
   
   At the moment I see no chance to vectorize stuff that relies on the common Lucene iterator pattern. But all stuff that works block-based could be moved.
   
   If we are fine with the current approach, we could merge this to main and 9.x and then proceed to move more stuff into VectorUtil.
   
   As all APIs are hidden at the moment, there's no risk in merging this. We can decide to change the VectorUtilProvider and rename/refactor it, as it is private.
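
A minimal sketch of the "hidden provider" design described in this comment: a public static entry point that delegates to a non-public interface, so implementations can be renamed or refactored freely. The names `VectorOps`, `Provider`, and `ScalarProvider` are hypothetical and do not match the PR's classes. The lookup here always returns the scalar implementation, whereas a real lookup would choose by runtime version and module presence.

```java
// Hypothetical design sketch; not Lucene's VectorUtil or VectorUtilProvider.
final class VectorOps {

  /** Hidden SPI: kept non-public so it can change without breaking users. */
  interface Provider {
    float dotProduct(float[] a, float[] b);
  }

  /** Scalar fallback used when no accelerated implementation is available. */
  static final class ScalarProvider implements Provider {
    @Override
    public float dotProduct(float[] a, float[] b) {
      float sum = 0f;
      for (int i = 0; i < a.length; i++) {
        sum += a[i] * b[i];
      }
      return sum;
    }
  }

  // Chosen once at class initialization; callers never see which provider is in use.
  private static final Provider PROVIDER = lookupProvider();

  private static Provider lookupProvider() {
    // Assumption: an accelerated provider would be selected here when the runtime
    // supports it. This sketch has none, so it always falls back to scalar code.
    return new ScalarProvider();
  }

  private VectorOps() {}

  /** Entry point with argument checking; delegates to the selected provider. */
  static float dotProduct(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException(
          "vector dimensions differ: " + a.length + " != " + b.length);
    }
    return PROVIDER.dotProduct(a, b);
  }

  public static void main(String[] args) {
    System.out.println(dotProduct(new float[] {1f, 2f, 3f}, new float[] {4f, 5f, 6f}));
  }
}
```

Further factories (for example for packed readers, as floated above) could hang off the same provider interface, though whether that fits block-based postings code is the open question in this comment.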




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561865838

   I marked it "ready for review". A changes entry is still missing.




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1562390348

   Hey, yeah, that was fun. I added one commit, 2 minutes too late. I sent it to the main branch. It was a change to the recursive extraction loop so that it does not add a set to itself.




[GitHub] [lucene] ChrisHegarty merged pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty merged PR #12311:
URL: https://github.com/apache/lucene/pull/12311




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201931143


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   We would need to spawn a separate VM from inside the main TestVectorUtil test (we have other tests that explicitly crash the JVM to check that everything recovers correctly).
   A native way to run a single test with different options via the randomized runner, without refactoring a major part of the build, isn't so easy. It is doable by adding another Gradle "testVectors" task which is hooked in as a dependency of "test", so it runs separately.
   
   We can easily compare the outputs of both implementations by instantiating the package-private Provider classes from a test and comparing the results. An easy way to do this is: get the autoloaded PROVIDER (make it package-private) and do `assumeFalse(VectorUtil.PROVIDER instanceof DefaultVectorProvider)`, so the test will not execute if the provider in use is the default provider. Otherwise, instantiate a DefaultVectorProvider and then compare results.
   
   For bytes this works easily; for floats we would need a large enough epsilon when comparing the resulting floats. Or am I missing something here? There are differences in the results, but Assert.assertEquals() is available for floats with an epsilon.
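
A minimal sketch of the float comparison idea: two dot-product implementations that add the terms in different orders, compared with a tolerance rather than for exact equality. It avoids JUnit to stay self-contained. The class name, the lane-wise stand-in for a vectorized provider, and the relative tolerance of `1e-3f` are all assumptions, not values from the PR.

```java
import java.util.Random;

// Hypothetical comparison harness; the lane-wise method only mimics SIMD summation order.
final class ProviderComparison {

  /** Reference implementation: adds the products strictly left to right. */
  static float sequential(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /** Accumulates into per-lane partial sums and reduces them at the end. */
  static float laneWise(float[] a, float[] b, int lanes) {
    float[] partial = new float[lanes];
    for (int i = 0; i < a.length; i++) {
      partial[i % lanes] += a[i] * b[i];
    }
    float sum = 0f;
    for (float p : partial) {
      sum += p;
    }
    return sum;
  }

  /** Tolerance check scaled by the magnitude of the expected value. */
  static boolean closeEnough(float expected, float actual, float relEps) {
    float tolerance = relEps * Math.max(1f, Math.abs(expected));
    return Math.abs(expected - actual) <= tolerance;
  }

  public static void main(String[] args) {
    Random random = new Random(42);
    float[] a = new float[768];
    float[] b = new float[768];
    for (int i = 0; i < a.length; i++) {
      a[i] = random.nextFloat() * 2f - 1f;
      b[i] = random.nextFloat() * 2f - 1f;
    }
    float expected = sequential(a, b);
    float actual = laneWise(a, b, 8);
    System.out.println(
        expected + " vs " + actual + " -> " + closeEnough(expected, actual, 1e-3f));
  }
}
```

In a JUnit test the same check would be `Assert.assertEquals(expected, actual, delta)`, with the delta chosen for the vector dimension and value range under test.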





[GitHub] [lucene] msokolov commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "msokolov (via GitHub)" <gi...@apache.org>.
msokolov commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559562832

   > you can see this stuff in benchmarks above where e.g. 128 dims is faster than 100, etc.
   
   Should we zero-pad before computing the dot products? It wouldn't affect the result, and it sounds like it would be faster.
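
The claim that zero-padding leaves the dot product unchanged can be checked with a small sketch. The class `ZeroPadSketch`, its methods, and the lane count of 8 are hypothetical; whether padding is actually faster is not shown here, since padding costs a copy and extra storage.

```java
import java.util.Arrays;

final class ZeroPadSketch {

  /** Sequential dot product over the full length of both arrays. */
  static float dot(float[] a, float[] b) {
    float res = 0f;
    for (int i = 0; i < a.length; i++) {
      res += a[i] * b[i];
    }
    return res;
  }

  /** Returns a copy of v extended with trailing zeros to the next multiple of lanes. */
  static float[] padToMultiple(float[] v, int lanes) {
    int paddedLength = ((v.length + lanes - 1) / lanes) * lanes;
    return Arrays.copyOf(v, paddedLength); // copyOf fills the extra slots with 0f
  }

  public static void main(String[] args) {
    float[] a = new float[100];
    float[] b = new float[100];
    for (int i = 0; i < 100; i++) {
      a[i] = i * 0.25f;
      b[i] = 1f - i * 0.01f;
    }
    float[] pa = padToMultiple(a, 8);
    float[] pb = padToMultiple(b, 8);
    System.out.println("padded length: " + pa.length);
    System.out.println("same result: " + (dot(a, b) == dot(pa, pb)));
  }
}
```

The padded terms are all `0f * 0f`, so they add nothing to the sum; with a different summation order (as in a vectorized loop) the result could still differ in the last bits from the scalar one, independent of padding.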




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559669255

   > @uschindler
   > 
   > > Cool thanks. I was about to do the same, my idea was a bit different: Add a normal virtual method "isSupported()" to the interface and implement it returning true for default provider but returning something depending on vector size for panama provider. This would spare two times doing the additional reflection using the lookup. The lookup function would only return the panama provider if it returns true.
   > > Another approach is to use Lookup#findStaticVarHandle() on `INT_SPECIES_PREF_BIT_SIZE` and read it. This spares catching `Throwable`, which is one of the things I hate about method handles.
   > 
   > I used a static method to avoid creating the instance if it's never going to be used, feel free to rewrite this. I don't have a strong opinion on how this is done - just that it is done. :-)
   > 
   > Also, not a warning, but should we log the Vector bit size? TRACE or INFO or something ?
   
   I changed the code to use a VarHandle to read the static field directly; this spares all the try/catch with Throwable. I also added a LOG.fine().
   
   In MMapDirectory we use LOG.info(), so maybe we should do the same here. If somebody enables the vector API, they should get some feedback about success.
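
Reading a static field through a VarHandle, as described above, might look like the following sketch. `VarHandleSketch`, `FakeProvider`, and `BIT_SIZE` are hypothetical stand-ins, not the names used in the PR; only `MethodHandles.Lookup#findStaticVarHandle` and `VarHandle#get` are real JDK calls.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

final class VarHandleSketch {

  /** Hypothetical stand-in for a provider class exposing a static bit-size field. */
  static final class FakeProvider {
    static final int BIT_SIZE = 256;
  }

  /** Looks up the static field reflectively and reads it through a VarHandle. */
  static int readBitSize() throws ReflectiveOperationException {
    MethodHandles.Lookup lookup = MethodHandles.lookup();
    // findStaticVarHandle throws NoSuchFieldException or IllegalAccessException,
    // both of which are ReflectiveOperationException subclasses
    VarHandle handle = lookup.findStaticVarHandle(FakeProvider.class, "BIT_SIZE", int.class);
    // VarHandle#get does not declare "throws Throwable", unlike MethodHandle#invoke
    return (int) handle.get();
  }

  public static void main(String[] args) throws ReflectiveOperationException {
    System.out.println("bit size: " + readBitSize());
  }
}
```

Only the lookup step needs exception handling; the read itself needs no try/catch, which matches the motivation given in the comment.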




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559712639

   Sorry for the heavy committing; I adapted the code to how MMapDirectory works.




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201940187


##########
lucene/core/src/java20/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##########
@@ -0,0 +1,455 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import jdk.incubator.vector.ByteVector;
+import jdk.incubator.vector.FloatVector;
+import jdk.incubator.vector.IntVector;
+import jdk.incubator.vector.ShortVector;
+import jdk.incubator.vector.Vector;
+import jdk.incubator.vector.VectorOperators;
+import jdk.incubator.vector.VectorShape;
+import jdk.incubator.vector.VectorSpecies;
+
+/** A VectorUtil provider implementation that leverages the Panama Vector API. */
+final class VectorUtilPanamaProvider implements VectorUtilProvider {
+
+  static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;
+
+  VectorUtilPanamaProvider() {}
+
+  @Override
+  public float dotProduct(float[] a, float[] b) {
+    int i = 0;
+    float res = 0;
+    // if the array size is large (> 2x platform vector size), its worth the overhead to vectorize
+    if (a.length > 2 * SPECIES.length()) {
+      // vector loop is unrolled 4x (4 accumulators in parallel)
+      FloatVector acc1 = FloatVector.zero(SPECIES);
+      FloatVector acc2 = FloatVector.zero(SPECIES);
+      FloatVector acc3 = FloatVector.zero(SPECIES);
+      FloatVector acc4 = FloatVector.zero(SPECIES);
+      int upperBound = SPECIES.loopBound(a.length - 3 * SPECIES.length());
+      for (; i < upperBound; i += 4 * SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        acc1 = acc1.add(va.mul(vb));
+        FloatVector vc = FloatVector.fromArray(SPECIES, a, i + SPECIES.length());
+        FloatVector vd = FloatVector.fromArray(SPECIES, b, i + SPECIES.length());
+        acc2 = acc2.add(vc.mul(vd));
+        FloatVector ve = FloatVector.fromArray(SPECIES, a, i + 2 * SPECIES.length());
+        FloatVector vf = FloatVector.fromArray(SPECIES, b, i + 2 * SPECIES.length());
+        acc3 = acc3.add(ve.mul(vf));
+        FloatVector vg = FloatVector.fromArray(SPECIES, a, i + 3 * SPECIES.length());
+        FloatVector vh = FloatVector.fromArray(SPECIES, b, i + 3 * SPECIES.length());
+        acc4 = acc4.add(vg.mul(vh));
+      }
+      // vector tail: less scalar computations for unaligned sizes, esp with big vector sizes
+      upperBound = SPECIES.loopBound(a.length);
+      for (; i < upperBound; i += SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        acc1 = acc1.add(va.mul(vb));
+      }
+      // reduce
+      FloatVector res1 = acc1.add(acc2);
+      FloatVector res2 = acc3.add(acc4);
+      res += res1.add(res2).reduceLanes(VectorOperators.ADD);
+    }
+
+    for (; i < a.length; i++) {
+      res += b[i] * a[i];
+    }
+    return res;
+  }
+
+  @Override
+  public float cosine(float[] a, float[] b) {
+    int i = 0;
+    float sum = 0;
+    float norm1 = 0;
+    float norm2 = 0;
+    // if the array size is large (> 2x platform vector size), its worth the overhead to vectorize
+    if (a.length > 2 * SPECIES.length()) {
+      // vector loop is unrolled 4x (4 accumulators in parallel)
+      FloatVector sum1 = FloatVector.zero(SPECIES);
+      FloatVector sum2 = FloatVector.zero(SPECIES);
+      FloatVector sum3 = FloatVector.zero(SPECIES);
+      FloatVector sum4 = FloatVector.zero(SPECIES);
+      FloatVector norm1_1 = FloatVector.zero(SPECIES);
+      FloatVector norm1_2 = FloatVector.zero(SPECIES);
+      FloatVector norm1_3 = FloatVector.zero(SPECIES);
+      FloatVector norm1_4 = FloatVector.zero(SPECIES);
+      FloatVector norm2_1 = FloatVector.zero(SPECIES);
+      FloatVector norm2_2 = FloatVector.zero(SPECIES);
+      FloatVector norm2_3 = FloatVector.zero(SPECIES);
+      FloatVector norm2_4 = FloatVector.zero(SPECIES);
+      int upperBound = SPECIES.loopBound(a.length - 3 * SPECIES.length());
+      for (; i < upperBound; i += 4 * SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        sum1 = sum1.add(va.mul(vb));
+        norm1_1 = norm1_1.add(va.mul(va));
+        norm2_1 = norm2_1.add(vb.mul(vb));
+        FloatVector vc = FloatVector.fromArray(SPECIES, a, i + SPECIES.length());
+        FloatVector vd = FloatVector.fromArray(SPECIES, b, i + SPECIES.length());
+        sum2 = sum2.add(vc.mul(vd));
+        norm1_2 = norm1_2.add(vc.mul(vc));
+        norm2_2 = norm2_2.add(vd.mul(vd));
+        FloatVector ve = FloatVector.fromArray(SPECIES, a, i + 2 * SPECIES.length());
+        FloatVector vf = FloatVector.fromArray(SPECIES, b, i + 2 * SPECIES.length());
+        sum3 = sum3.add(ve.mul(vf));
+        norm1_3 = norm1_3.add(ve.mul(ve));
+        norm2_3 = norm2_3.add(vf.mul(vf));
+        FloatVector vg = FloatVector.fromArray(SPECIES, a, i + 3 * SPECIES.length());
+        FloatVector vh = FloatVector.fromArray(SPECIES, b, i + 3 * SPECIES.length());
+        sum4 = sum4.add(vg.mul(vh));
+        norm1_4 = norm1_4.add(vg.mul(vg));
+        norm2_4 = norm2_4.add(vh.mul(vh));
+      }
+      // vector tail: less scalar computations for unaligned sizes, esp with big vector sizes
+      upperBound = SPECIES.loopBound(a.length);
+      for (; i < upperBound; i += SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        sum1 = sum1.add(va.mul(vb));
+        norm1_1 = norm1_1.add(va.mul(va));
+        norm2_1 = norm2_1.add(vb.mul(vb));
+      }
+      // reduce
+      FloatVector sumres1 = sum1.add(sum2);
+      FloatVector sumres2 = sum3.add(sum4);
+      FloatVector norm1res1 = norm1_1.add(norm1_2);
+      FloatVector norm1res2 = norm1_3.add(norm1_4);
+      FloatVector norm2res1 = norm2_1.add(norm2_2);
+      FloatVector norm2res2 = norm2_3.add(norm2_4);
+      sum += sumres1.add(sumres2).reduceLanes(VectorOperators.ADD);
+      norm1 += norm1res1.add(norm1res2).reduceLanes(VectorOperators.ADD);
+      norm2 += norm2res1.add(norm2res2).reduceLanes(VectorOperators.ADD);
+    }
+
+    for (; i < a.length; i++) {
+      float elem1 = a[i];
+      float elem2 = b[i];
+      sum += elem1 * elem2;
+      norm1 += elem1 * elem1;
+      norm2 += elem2 * elem2;
+    }
+    return (float) (sum / Math.sqrt(norm1 * norm2));
+  }
+
+  @Override
+  public float squareDistance(float[] a, float[] b) {
+    int i = 0;
+    float res = 0;
+    // if the array size is large (> 2x platform vector size), its worth the overhead to vectorize
+    if (a.length > 2 * SPECIES.length()) {
+      // vector loop is unrolled 4x (4 accumulators in parallel)
+      FloatVector acc1 = FloatVector.zero(SPECIES);
+      FloatVector acc2 = FloatVector.zero(SPECIES);
+      FloatVector acc3 = FloatVector.zero(SPECIES);
+      FloatVector acc4 = FloatVector.zero(SPECIES);
+      int upperBound = SPECIES.loopBound(a.length - 3 * SPECIES.length());
+      for (; i < upperBound; i += 4 * SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        FloatVector diff1 = va.sub(vb);
+        acc1 = acc1.add(diff1.mul(diff1));
+        FloatVector vc = FloatVector.fromArray(SPECIES, a, i + SPECIES.length());
+        FloatVector vd = FloatVector.fromArray(SPECIES, b, i + SPECIES.length());
+        FloatVector diff2 = vc.sub(vd);
+        acc2 = acc2.add(diff2.mul(diff2));
+        FloatVector ve = FloatVector.fromArray(SPECIES, a, i + 2 * SPECIES.length());
+        FloatVector vf = FloatVector.fromArray(SPECIES, b, i + 2 * SPECIES.length());
+        FloatVector diff3 = ve.sub(vf);
+        acc3 = acc3.add(diff3.mul(diff3));
+        FloatVector vg = FloatVector.fromArray(SPECIES, a, i + 3 * SPECIES.length());
+        FloatVector vh = FloatVector.fromArray(SPECIES, b, i + 3 * SPECIES.length());
+        FloatVector diff4 = vg.sub(vh);
+        acc4 = acc4.add(diff4.mul(diff4));
+      }
+      // vector tail: less scalar computations for unaligned sizes, esp with big vector sizes
+      upperBound = SPECIES.loopBound(a.length);
+      for (; i < upperBound; i += SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        FloatVector diff = va.sub(vb);
+        acc1 = acc1.add(diff.mul(diff));
+      }
+      // reduce
+      FloatVector res1 = acc1.add(acc2);
+      FloatVector res2 = acc3.add(acc4);
+      res += res1.add(res2).reduceLanes(VectorOperators.ADD);
+    }
+
+    for (; i < a.length; i++) {
+      float diff = a[i] - b[i];
+      res += diff * diff;
+    }
+    return res;
+  }
+
+  // Binary functions, these all follow a general pattern like this:
+  //
+  //   short intermediate = a * b;
+  //   int accumulator = accumulator + intermediate;
+  //
+  // 256 or 512 bit vectors can process 64 or 128 bits at a time, respectively
+  // intermediate results use 128 or 256 bit vectors, respectively
+  // final accumulator uses 256 or 512 bit vectors, respectively
+  //
+  // We also support 128 bit vectors, using two 128 bit accumulators.
+  // This is slower but still faster than not vectorizing at all.
+
+  static final VectorSpecies<Byte> PREFERRED_BYTE_SPECIES;
+  static final VectorSpecies<Short> PREFERRED_SHORT_SPECIES;
+
+  static {
+    if (IntVector.SPECIES_PREFERRED.vectorBitSize() >= 256) {
+      PREFERRED_BYTE_SPECIES =
+          ByteVector.SPECIES_MAX.withShape(
+              VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 2));
+      PREFERRED_SHORT_SPECIES =
+          ShortVector.SPECIES_MAX.withShape(
+              VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 1));
+    } else {
+      PREFERRED_BYTE_SPECIES = null;
+      PREFERRED_SHORT_SPECIES = null;
+    }
+  }
+
+  static final int INT_SPECIES_PREFERRED_BIT_SIZE = IntVector.SPECIES_PREFERRED.vectorBitSize();
+
+  @Override
+  public int dotProduct(byte[] a, byte[] b) {
+    int i = 0;
+    int res = 0;
+    // only vectorize if we'll at least enter the loop a single time, and we have at least 128-bit
+    // vectors
+    if (a.length >= 16 && INT_SPECIES_PREFERRED_BIT_SIZE >= 128) {
+      // compute vectorized dot product consistent with VPDPBUSD instruction, acts like:
+      // int sum = 0;
+      // for (...) {
+      //   short product = (short) (x[i] * y[i]);
+      //   sum += product;
+      // }
+      if (INT_SPECIES_PREFERRED_BIT_SIZE >= 256) {
+        // optimized 256/512 bit implementation, processes 8/16 bytes at a time
+        int upperBound = PREFERRED_BYTE_SPECIES.loopBound(a.length);
+        IntVector acc = IntVector.zero(IntVector.SPECIES_PREFERRED);
+        for (; i < upperBound; i += PREFERRED_BYTE_SPECIES.length()) {
+          ByteVector va8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, a, i);
+          ByteVector vb8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, b, i);
+          Vector<Short> va16 = va8.convertShape(VectorOperators.B2S, PREFERRED_SHORT_SPECIES, 0);
+          Vector<Short> vb16 = vb8.convertShape(VectorOperators.B2S, PREFERRED_SHORT_SPECIES, 0);
+          Vector<Short> prod16 = va16.mul(vb16);
+          Vector<Integer> prod32 =
+              prod16.convertShape(VectorOperators.S2I, IntVector.SPECIES_PREFERRED, 0);
+          acc = acc.add(prod32);
+        }
+        // reduce
+        res += acc.reduceLanes(VectorOperators.ADD);
+      } else {
+        // 128-bit implementation, which must "split up" vectors due to widening conversions
+        int upperBound = ByteVector.SPECIES_64.loopBound(a.length);
+        IntVector acc1 = IntVector.zero(IntVector.SPECIES_128);
+        IntVector acc2 = IntVector.zero(IntVector.SPECIES_128);
+        for (; i < upperBound; i += ByteVector.SPECIES_64.length()) {
+          ByteVector va8 = ByteVector.fromArray(ByteVector.SPECIES_64, a, i);
+          ByteVector vb8 = ByteVector.fromArray(ByteVector.SPECIES_64, b, i);
+          // expand each byte vector into short vector and multiply
+          Vector<Short> va16 = va8.convertShape(VectorOperators.B2S, ShortVector.SPECIES_128, 0);
+          Vector<Short> vb16 = vb8.convertShape(VectorOperators.B2S, ShortVector.SPECIES_128, 0);
+          Vector<Short> prod16 = va16.mul(vb16);
+          // split each short vector into two int vectors and add
+          Vector<Integer> prod32_1 =
+              prod16.convertShape(VectorOperators.S2I, IntVector.SPECIES_128, 0);
+          Vector<Integer> prod32_2 =
+              prod16.convertShape(VectorOperators.S2I, IntVector.SPECIES_128, 1);
+          acc1 = acc1.add(prod32_1);
+          acc2 = acc2.add(prod32_2);
+        }
+        // reduce
+        res += acc1.add(acc2).reduceLanes(VectorOperators.ADD);
+      }
+    }
+
+    for (; i < a.length; i++) {
+      res += b[i] * a[i];
+    }
+    return res;
+  }
+
+  @Override
+  public float cosine(byte[] a, byte[] b) {
+    int i = 0;
+    int sum = 0;
+    int norm1 = 0;
+    int norm2 = 0;
+    // only vectorize if we'll at least enter the loop a single time, and we have at least 128-bit
+    // vectors
+    if (a.length >= 16 && INT_SPECIES_PREFERRED_BIT_SIZE >= 128) {
+      // acts like:
+      // int sum = 0;
+      // for (...) {
+      //   short difference = (short) (x[i] - y[i]);
+      //   sum += (int) difference * (int) difference;
+      // }

Review Comment:
   Done.
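
The "short intermediate, int accumulator" pattern described in the comments of the quoted diff can be sketched in scalar form. This is an illustration only, not code from the PR; the class name `ByteDotSketch` is hypothetical.

```java
final class ByteDotSketch {

  /** Scalar model of the widening pattern: byte * byte -> short, accumulated in int. */
  static int dotProduct(byte[] a, byte[] b) {
    int acc = 0;
    for (int i = 0; i < a.length; i++) {
      // The largest magnitude is (-128) * (-128) = 16384, which fits in a short,
      // so the narrowing cast never loses information.
      short product = (short) (a[i] * b[i]);
      acc += product; // widened to int before accumulating
    }
    return acc;
  }

  public static void main(String[] args) {
    byte[] a = {-128, 127, 1};
    byte[] b = {-128, 127, 2};
    // 16384 + 16129 + 2
    System.out.println(dotProduct(a, b)); // prints 32515
  }
}
```

Because every product fits in 16 bits, only the accumulator needs 32-bit lanes, which is why the diff pairs byte, short, and int species of increasing bit size.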




Review Comment:
   Done. [11e6634](https://github.com/apache/lucene/pull/12311/commits/11e66348adc0304a5df996bfde8253acdbd15806)





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560166430

   You could use the RAMUsageEstimator code to look into VM flags.
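   
   A minimal sketch of that idea, assuming a HotSpot JVM with the `jdk.management` module available. The names `VmFlagProbe` and `readVmOption` are made up for illustration and are not part of Lucene; the jshell excerpt quoted elsewhere in this thread invokes the bean reflectively, whereas this sketch calls it directly for brevity:
   
   ```java
   import com.sun.management.HotSpotDiagnosticMXBean;
   import java.lang.management.ManagementFactory;
   import java.util.Optional;
   
   // Hypothetical helper, not Lucene code: reads a HotSpot VM flag by name.
   final class VmFlagProbe {
     private VmFlagProbe() {}
   
     /** Returns the flag's value, or empty if the flag or the bean is unavailable. */
     static Optional<String> readVmOption(String name) {
       try {
         HotSpotDiagnosticMXBean bean =
             ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
         if (bean == null) {
           return Optional.empty(); // not a HotSpot-compatible VM
         }
         // getVMOption throws IllegalArgumentException for unknown flags
         return Optional.of(bean.getVMOption(name).getValue());
       } catch (RuntimeException | LinkageError e) {
         return Optional.empty();
       }
     }
   
     public static void main(String[] args) {
       // UseAVX is an x86-only flag, so an empty result is expected on other CPUs
       System.out.println("UseAVX = " + readVmOption("UseAVX").orElse("<unavailable>"));
     }
   }
   ```
   
   Because the flag is architecture-specific, an empty result says nothing about whether vector code is accelerated; it only means the flag could not be read.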




[GitHub] [lucene] dweiss commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "dweiss (via GitHub)" <gi...@apache.org>.
dweiss commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1575681452

   It's the same situation with Math.fma. The inability to tell whether it's accelerated or not makes the thing unusable - the fallback is *so* bad that the risk is rarely worth taking...
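   
   For readers unfamiliar with `Math.fma`, here is a small sketch of what it computes: `a * b + c` with a single rounding step instead of two. The class and method names are illustrative only. The sketch shows the semantics, not the performance; whether the call maps to a hardware instruction or to a slow software fallback is not visible from Java code, which is the problem described above.
   
   ```java
   // Illustrative only: contrasts separately rounded a*b+c with fused multiply-add.
   final class FmaRounding {
     private FmaRounding() {}
   
     /** Two roundings: the product is rounded to float, then the sum is rounded. */
     static float separate(float a, float b, float c) {
       return a * b + c;
     }
   
     /** One rounding: the exact product plus c is rounded once. */
     static float fused(float a, float b, float c) {
       return Math.fma(a, b, c);
     }
   
     public static void main(String[] args) {
       float a = 0x1.001p0f; // 1 + 2^-12
       float c = -0x1.002p0f; // -(1 + 2^-11)
       // The exact value of a*a is 1 + 2^-11 + 2^-24; the last term is lost when
       // the product is rounded to float before the addition.
       System.out.println("separate = " + separate(a, a, c));
       System.out.println("fused    = " + fused(a, a, c));
     }
   }
   ```
   
   With these inputs the separately rounded expression yields `0.0f`, while `Math.fma` returns the exact residual `2^-24`.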




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1584835082

   @rmuir @uschindler I just want to draw your attention to a potential issue that may arise from this change - there is a property permission check leaking out from the JDK - https://bugs.openjdk.org/browse/JDK-8309727
   
   It may be that users of Lucene need to grant the necessary property read permission. Some more details in the Elasticsearch issue https://github.com/elastic/elasticsearch/pull/96715
   
   Note: ES has two potential issues, the aforementioned property read, and also a problem with logging. The logging is ES specific.




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561229656

   > See commit above. Hopefully it is ok?
   
   It's a low-impact change to the code that avoids landmines - seems reasonable. 👍 




[GitHub] [lucene] msokolov commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "msokolov (via GitHub)" <gi...@apache.org>.
msokolov commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1564492407

   Hm, I looked more closely at the test I ran and it seems I managed to create a file full of identical vectors -- so this is going to lead to crazy results. Will follow up once I've managed to fix the vector creation.




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560610819

   Hi,
   
   > As a potential workaround I tried Uwe's suggestion to look at RAMUsageEstimator code and ran it inside this VM, it works and seems to detect the situation?
   > 
   > ```
   > jshell> final Object vmOption = getVMOptionMethod.invoke(hotSpotBean, "UseAVX");
   > vmOption ==> VM option: UseAVX value: 0  origin: DEFAULT (read-only)
   > ```
   
   We *could* do this, but on the other hand I think this is going to turn into spaghetti code, especially as UseAVX is only valid for x86 CPUs; ARM doesn't have it.
   
   This is a problem in OpenJDK, and we should report what we have observed:
   - Very slow performance of the "default/fallback" impl, like 30x slower (do we have exact measurements?)
   - No way to detect whether vector code is correctly optimized
   
   We should make it clear that we need some way to figure out consistently at runtime whether to choose our own code or the vector code, based on simple information like "works/does not work". The PREFERRED constants in Species should always contain correct values, even when Hotspot is not enabled. This means for Integers it should return 32 (plain and simple), as this is the maximum vector bit size. It is a bug to return some arbitrary defaults for preferred sizes if AVX or C2 is disabled.
   
   About the current Lucene impl: The user has to opt in by passing `--add-modules jdk.incubator.vector`. Users who do this should be prepared that it might go wrong. We should not do too much detection logic. If somebody figures out it gets slow like hell, they can remove that option. That's what incubators are all about!
   
   When this is fixed in later versions (e.g., during preview) we should be able to detect the lane sizes correctly and all works fine. I disagree with working around bugs in incubation features of OpenJDK, sorry.
   
   The current hack with the system property would in reality also not be needed; it is mainly there to support people running tests, so I would keep it. But it is not our responsibility to detect hardware features in Lucene code; it should work out of the box.
   
   In the release notes with the PR merged we should clearly state that people have to opt in and that they should test and report back whether it works well for them. We should clearly say that on some ancient hardware combinations or in misconfigured VM environments enabling it may cause havoc (slowness).
   
   So please let's stop here and not implement hardware detection for opt-in only preview images!
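   
   As a rough sketch of what "opt-in only, no hardware detection" can look like: the only runtime question asked is whether the incubator module was resolved at startup. Everything here (`DotProductChooser`, the `DotProduct` interface, the method names) is hypothetical and is not the provider mechanism from this PR; the scalar loop stands in for the default implementation.
   
   ```java
   // Hypothetical illustration, not the PR's code: opt-in check with a scalar default.
   final class DotProductChooser {
     private DotProductChooser() {}
   
     /** Stand-in for a provider interface; the real Lucene type may differ. */
     interface DotProduct {
       float apply(float[] a, float[] b);
     }
   
     /** Plain scalar loop that is always available. */
     static final DotProduct SCALAR =
         (a, b) -> {
           float res = 0f;
           for (int i = 0; i < a.length; i++) {
             res += a[i] * b[i];
           }
           return res;
         };
   
     /** True only if the JVM resolved the module, e.g. via --add-modules jdk.incubator.vector. */
     static boolean vectorModuleEnabled() {
       return ModuleLayer.boot().findModule("jdk.incubator.vector").isPresent();
     }
   
     /** Names the implementation a caller would pick; no CPU or VM flag probing. */
     static String selectedImplementation() {
       return vectorModuleEnabled() ? "panama (user opted in)" : "scalar (default)";
     }
   
     public static void main(String[] args) {
       System.out.println("selected: " + selectedImplementation());
       System.out.println(SCALAR.apply(new float[] {1, 2, 3}, new float[] {4, 5, 6}));
     }
   }
   ```
   
   The check deliberately says nothing about whether the vector code will be fast; under this approach that remains the responsibility of the user who opted in.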




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560614963

   > This is a problem of OpenJDK and we should report this what we have observed:
   
   It seems very likely that this is a JDK bug. Lemme do a little digging.




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561030863

   at the end of the day, there's definitely a bigger design issue, unrelated to vector length, in that there are landmines everywhere with regards to the various functions. If you use the wrong one on the wrong cpu, performance can get like 50x slower (terrible trap). It would be better to just use your own scalar function.
   
   We try to only use simple operations here to avoid problems, but the general issue remains.
   
   And there's no way for you to cleanly check "is this supported" (e.g. like checking CPUID or similar). So I don't understand how libraries are supposed to use this thing :)




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561713446

   i am ok with whatever renames help us to maintain this




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561982837

   @ChrisHegarty Go ahead and merge / backport to 9.x.




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1204270565


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   I think this is solved. Everything is fast for developers, too.





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1562413715

   The merge for PR #12294 was quite simple; it was done mostly automatically. I only had to add the new version number to the extractor. I did not add the vector code yet; we can do it in this PR or separately (it just needs some coordination).
   
   If no code changes are needed for the vector code in 21, I will copy the 20 classes over to the 21 folder.




[GitHub] [lucene] alessandrobenedetti commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "alessandrobenedetti (via GitHub)" <gi...@apache.org>.
alessandrobenedetti commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1564190226

   Just out of curiosity, do we tolerate this sort of class in Lucene?
   Are some of them auto-generated? (for example lucene/core/src/java20/org/apache/lucene/util/VectorUtilPanamaProvider.java)
   What's the standard approach in these scenarios?
   
   Not a polemic, I am genuinely curious because they seem far from maintainable, but I guess they are useful as they bring low-level implementation goodies?
   




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1554866016

   ... I first had to find the reason why my computer produced a different APIJAR from the beginning.... I HATE DEFAULT TIMEZONE, DIE; DIE; DIE




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1553298071

   Thanks @rmuir - I just merged in your implementation. I think that it's a much, much better starting (if not final) place. This might be a reasonable minimal point to start from.
   
   Before digging too deeply into the performance and optimising the code, I guess I just want to understand if this is the right level to be plugging in at. 




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557149306

   > I don't have perf numbers any more - no idea whether this is better than what you have already - probably not, but it might be worth trying castShape?
   
   I'm using convertShape, which is the same thing, only it allows a little more flexibility: instead of casting by Class, you can use operators (e.g. zero extension). FYI, your code looks to have a correctness issue, as it accumulates into `short`, which is unsafe.
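   
   To illustrate the `short` accumulation concern with plain scalar code (a sketch with made-up names, not the code under review): a single byte-by-byte product always fits in a `short`, since its magnitude is at most 128 * 128 = 16384, but a running sum of such products overflows 16 bits after only a few elements.
   
   ```java
   // Illustrative scalar comparison of int vs short accumulation for byte dot products.
   final class ShortAccumulatorOverflow {
     private ShortAccumulatorOverflow() {}
   
     /** Accumulates into int: safe for any realistic vector length. */
     static int sumInInt(byte[] a, byte[] b) {
       int sum = 0;
       for (int i = 0; i < a.length; i++) {
         sum += a[i] * b[i];
       }
       return sum;
     }
   
     /** Accumulates into short: the compound assignment silently wraps at 16 bits. */
     static short sumInShort(byte[] a, byte[] b) {
       short sum = 0;
       for (int i = 0; i < a.length; i++) {
         sum += (short) (a[i] * b[i]);
       }
       return sum;
     }
   
     public static void main(String[] args) {
       byte[] v = {127, 127, 127, 127};
       System.out.println("int accumulator:   " + sumInInt(v, v));
       System.out.println("short accumulator: " + sumInShort(v, v));
     }
   }
   ```
   
   For four elements of value 127 the true sum is 64516, which exceeds `Short.MAX_VALUE`; the `short` accumulator wraps around to -1020.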
   
   See code: https://github.com/apache/lucene/pull/12311/commits/3a6cb81d092c240a7dc3938646186a9bfa021900
   Again my question is just how to do it generically with good performance, especially for machines with only 128 bit vectors :)
   
   With 256 bit vectors it is fast using ByteVector.SPECIES_64, ShortVector.SPECIES_128, and IntVector.SPECIES_256
   But for ARM which only has 128-bit vectors, the generic code using only "SPECIES_PREFERRED" isn't as fast as it should be: almost 2x but not 4x like on avx-256.
   




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557577587

   @rmuir for the byte[] case, it seems to me that we want to size things so as to optimise for the ShortVector preferred species, right? That is what you seem to have done for a number of specific sizes, which I think is good. You did ask if we could generalise this.
   
   Based on the structure of your latest commit, can we not just shape the stride of the ByteVector based on the preferred ShortVector, e.g.
   
   ```
   static final VectorSpecies<Byte> PREFERRED_BYTE_SPECIES = ByteVector.SPECIES_MAX.withShape(VectorShape.forBitSize(ShortVector.SPECIES_PREFERRED.vectorBitSize() >> 1));
   
     @Benchmark
     public int dotProductNewNew() {
       int i = 0;
       int res = 0;
       // only vectorize if we'll at least enter the loop a single time
       if (a.length >= ByteVector.SPECIES_64.length()) {
         // optimized 256 bit implementation, processes 8 bytes at a time
         int upperBound = PREFERRED_BYTE_SPECIES.loopBound(a.length);
         IntVector acc1 = IntVector.zero(IntVector.SPECIES_PREFERRED);
         IntVector acc2 = IntVector.zero(IntVector.SPECIES_PREFERRED);
         for (; i < upperBound; i += PREFERRED_BYTE_SPECIES.length()) {
             ByteVector va8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, a, i);
             ByteVector vb8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, b, i);
             Vector<Short> va16 = va8.convertShape(VectorOperators.B2S, ShortVector.SPECIES_PREFERRED, 0);
             Vector<Short> vb16 = vb8.convertShape(VectorOperators.B2S, ShortVector.SPECIES_PREFERRED, 0);
             Vector<Short> prod16 = va16.mul(vb16);
             Vector<Integer> prod32_1 = prod16.convertShape(VectorOperators.S2I, IntVector.SPECIES_PREFERRED, 0);
             Vector<Integer> prod32_2 = prod16.convertShape(VectorOperators.S2I, IntVector.SPECIES_PREFERRED, 1);
             acc1 = acc1.add(prod32_1);
             acc2 = acc2.add(prod32_2);
         }
         // reduce
         res += acc1.add(acc2).reduceLanes(VectorOperators.ADD);
       }
   
       for (; i < a.length; i++) {
         res += b[i] * a[i];
       }
       return res;
     }
   ```
   




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557631563

   Seems to take quite a hit on my 256. And I suspect if you tried to make a "512 version" of the existing code it might be much better too? ByteVector.SPECIES_128 -> ShortVector.SPECIES_256 -> IntVector.SPECIES_512. No need for splitting into "parts".
   
   I'm also concerned it will error out if the user e.g. has 64-bit vectors as the only possible size (e.g. avx disabled or vectorization not supported for the architecture).
   But maybe we can play with some of the idea more.
   
   ```
   Benchmark                                   (size)   Mode  Cnt  Score   Error   Units
   BinaryDotProductBenchmark.dotProductNew       1024  thrpt    5  7.174 ± 0.602  ops/us
   BinaryDotProductBenchmark.dotProductNewNew    1024  thrpt    5  5.559 ± 0.109  ops/us
   BinaryDotProductBenchmark.dotProductOld       1024  thrpt    5  1.868 ± 0.019  ops/us
   ```
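   
   The widening chain discussed in this comment comes down to simple bit-width arithmetic, sketched below with hypothetical names (`WideningChain` and its methods are not from the PR). To fill an int vector of a given bit size in one step, the byte vector feeding it needs a quarter of those bits and the intermediate short vector half of them. The sketch assumes 64 bits is the smallest vector shape the Vector API offers, which is why a 128-bit int vector cannot be fed by a single unsplit byte vector.
   
   ```java
   // Hypothetical helper: bit widths for a byte -> short -> int widening chain.
   final class WideningChain {
     private WideningChain() {}
   
     /** Assumed smallest vector shape, in bits. */
     static final int MIN_SHAPE_BITS = 64;
   
     /** Byte vector width whose lanes exactly fill an int vector of intBits. */
     static int byteBits(int intBits) {
       return intBits >> 2;
     }
   
     /** Intermediate short vector width for the same number of lanes. */
     static int shortBits(int intBits) {
       return intBits >> 1;
     }
   
     /** False when the byte stage would need a shape below the minimum. */
     static boolean fitsWithoutSplitting(int intBits) {
       return byteBits(intBits) >= MIN_SHAPE_BITS;
     }
   
     public static void main(String[] args) {
       for (int intBits : new int[] {128, 256, 512}) {
         System.out.println(
             intBits + "-bit int: byte=" + byteBits(intBits)
                 + " short=" + shortBits(intBits)
                 + " unsplit=" + fitsWithoutSplitting(intBits));
       }
     }
   }
   ```
   
   For 256 bits this gives the 64 -> 128 -> 256 chain, and for 512 bits the 128 -> 256 -> 512 chain named above. For 128 bits the byte stage would be 32 bits, so the short vector has to be split into two int parts instead.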




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201067163


##########
lucene/core/src/java20/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##########
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import jdk.incubator.vector.ByteVector;
+import jdk.incubator.vector.FloatVector;
+import jdk.incubator.vector.IntVector;
+import jdk.incubator.vector.ShortVector;
+import jdk.incubator.vector.Vector;
+import jdk.incubator.vector.VectorOperators;
+import jdk.incubator.vector.VectorShape;
+import jdk.incubator.vector.VectorSpecies;
+
+/** A VectorUtil provider implementation that leverages the Panama Vector API. */
+final class VectorUtilPanamaProvider implements VectorUtilProvider {
+
+  static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;
+
+  VectorUtilPanamaProvider() {}
+
+  @Override
+  public float dotProduct(float[] a, float[] b) {
+    int i = 0;
+    float res = 0;
+    // if the array size is large (> 2x platform vector size), its worth the overhead to vectorize
+    if (a.length > 2 * SPECIES.length()) {
+      // vector loop is unrolled 4x (4 accumulators in parallel)
+      FloatVector acc1 = FloatVector.zero(SPECIES);
+      FloatVector acc2 = FloatVector.zero(SPECIES);
+      FloatVector acc3 = FloatVector.zero(SPECIES);
+      FloatVector acc4 = FloatVector.zero(SPECIES);
+      int upperBound = SPECIES.loopBound(a.length - 3 * SPECIES.length());
+      for (; i < upperBound; i += 4 * SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        acc1 = acc1.add(va.mul(vb));
+        FloatVector vc = FloatVector.fromArray(SPECIES, a, i + SPECIES.length());
+        FloatVector vd = FloatVector.fromArray(SPECIES, b, i + SPECIES.length());
+        acc2 = acc2.add(vc.mul(vd));
+        FloatVector ve = FloatVector.fromArray(SPECIES, a, i + 2 * SPECIES.length());
+        FloatVector vf = FloatVector.fromArray(SPECIES, b, i + 2 * SPECIES.length());
+        acc3 = acc3.add(ve.mul(vf));
+        FloatVector vg = FloatVector.fromArray(SPECIES, a, i + 3 * SPECIES.length());
+        FloatVector vh = FloatVector.fromArray(SPECIES, b, i + 3 * SPECIES.length());
+        acc4 = acc4.add(vg.mul(vh));
+      }
+      // vector tail: less scalar computations for unaligned sizes, esp with big vector sizes
+      upperBound = SPECIES.loopBound(a.length);
+      for (; i < upperBound; i += SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        acc1 = acc1.add(va.mul(vb));
+      }
+      // reduce
+      FloatVector res1 = acc1.add(acc2);
+      FloatVector res2 = acc3.add(acc4);
+      res += res1.add(res2).reduceLanes(VectorOperators.ADD);
+    }
+
+    for (; i < a.length; i++) {
+      res += b[i] * a[i];
+    }
+    return res;
+  }
+
+  static final VectorSpecies<Byte> PREFERRED_BYTE_SPECIES;
+  static final VectorSpecies<Short> PREFERRED_SHORT_SPECIES;
+
+  static {
+    if (IntVector.SPECIES_PREFERRED.vectorBitSize() >= 256) {
+      PREFERRED_BYTE_SPECIES =
+          ByteVector.SPECIES_MAX.withShape(
+              VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 2));
+      PREFERRED_SHORT_SPECIES =
+          ShortVector.SPECIES_MAX.withShape(
+              VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 1));
+    } else {
+      PREFERRED_BYTE_SPECIES = null;
+      PREFERRED_SHORT_SPECIES = null;
+    }
+  }
+
+  @Override
+  public int dotProduct(byte[] a, byte[] b) {
+    int i = 0;
+    int res = 0;
+    final int vectorSize = IntVector.SPECIES_PREFERRED.vectorBitSize();
+    // only vectorize if we'll at least enter the loop a single time, and we have at least 128-bit
+    // vectors
+    if (a.length >= 16 && vectorSize >= 128) {
+      // compute vectorized dot product consistent with VPDPBUSD instruction, acts like:
+      // int sum = 0;
+      // for (...) {
+      //   short product = (short) (x[i] * y[i]);
+      //   sum += product;
+      // }
+      if (vectorSize >= 256) {

Review Comment:
   I would instead check `PREFERRED_BYTE_SPECIES != null`. I needed to read the code twice to figure out why it is valid for `PREFERRED_BYTE_SPECIES` to be initialized to null in the static initializer above.





[GitHub] [lucene] rmuir commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201114747


##########
lucene/core/src/java20/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##########
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import jdk.incubator.vector.ByteVector;
+import jdk.incubator.vector.FloatVector;
+import jdk.incubator.vector.IntVector;
+import jdk.incubator.vector.ShortVector;
+import jdk.incubator.vector.Vector;
+import jdk.incubator.vector.VectorOperators;
+import jdk.incubator.vector.VectorShape;
+import jdk.incubator.vector.VectorSpecies;
+
+/** A VectorUtil provider implementation that leverages the Panama Vector API. */
+final class VectorUtilPanamaProvider implements VectorUtilProvider {
+
+  static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;
+
+  VectorUtilPanamaProvider() {}
+
+  @Override
+  public float dotProduct(float[] a, float[] b) {
+    int i = 0;
+    float res = 0;
+    // if the array size is large (> 2x platform vector size), it's worth the overhead to vectorize
+    if (a.length > 2 * SPECIES.length()) {
+      // vector loop is unrolled 4x (4 accumulators in parallel)
+      FloatVector acc1 = FloatVector.zero(SPECIES);
+      FloatVector acc2 = FloatVector.zero(SPECIES);
+      FloatVector acc3 = FloatVector.zero(SPECIES);
+      FloatVector acc4 = FloatVector.zero(SPECIES);
+      int upperBound = SPECIES.loopBound(a.length - 3 * SPECIES.length());
+      for (; i < upperBound; i += 4 * SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        acc1 = acc1.add(va.mul(vb));
+        FloatVector vc = FloatVector.fromArray(SPECIES, a, i + SPECIES.length());
+        FloatVector vd = FloatVector.fromArray(SPECIES, b, i + SPECIES.length());
+        acc2 = acc2.add(vc.mul(vd));
+        FloatVector ve = FloatVector.fromArray(SPECIES, a, i + 2 * SPECIES.length());
+        FloatVector vf = FloatVector.fromArray(SPECIES, b, i + 2 * SPECIES.length());
+        acc3 = acc3.add(ve.mul(vf));
+        FloatVector vg = FloatVector.fromArray(SPECIES, a, i + 3 * SPECIES.length());
+        FloatVector vh = FloatVector.fromArray(SPECIES, b, i + 3 * SPECIES.length());
+        acc4 = acc4.add(vg.mul(vh));
+      }
+      // vector tail: less scalar computations for unaligned sizes, esp with big vector sizes
+      upperBound = SPECIES.loopBound(a.length);
+      for (; i < upperBound; i += SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        acc1 = acc1.add(va.mul(vb));
+      }
+      // reduce
+      FloatVector res1 = acc1.add(acc2);
+      FloatVector res2 = acc3.add(acc4);
+      res += res1.add(res2).reduceLanes(VectorOperators.ADD);
+    }
+
+    for (; i < a.length; i++) {
+      res += b[i] * a[i];
+    }
+    return res;
+  }
+
+  static final VectorSpecies<Byte> PREFERRED_BYTE_SPECIES;
+  static final VectorSpecies<Short> PREFERRED_SHORT_SPECIES;
+
+  static {
+    if (IntVector.SPECIES_PREFERRED.vectorBitSize() >= 256) {
+      PREFERRED_BYTE_SPECIES =
+          ByteVector.SPECIES_MAX.withShape(
+              VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 2));
+      PREFERRED_SHORT_SPECIES =
+          ShortVector.SPECIES_MAX.withShape(
+              VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 1));
+    } else {
+      PREFERRED_BYTE_SPECIES = null;
+      PREFERRED_SHORT_SPECIES = null;
+    }
+  }
+
+  @Override
+  public int dotProduct(byte[] a, byte[] b) {
+    int i = 0;
+    int res = 0;
+    final int vectorSize = IntVector.SPECIES_PREFERRED.vectorBitSize();
+    // only vectorize if we'll at least enter the loop a single time, and we have at least 128-bit
+    // vectors
+    if (a.length >= 16 && vectorSize >= 128) {
+      // compute vectorized dot product consistent with VPDPBUSD instruction, acts like:
+      // int sum = 0;
+      // for (...) {
+      //   short product = (short) (x[i] * y[i]);
+      //   sum += product;
+      // }
+      if (vectorSize >= 256) {

Review Comment:
   Yeah. We should also rename the constant, as well as many other variables in this messy code I am pushing. It needs heavy cleanup. Give me some time: I want to get all 6 of the necessary methods (3 float[], 3 byte[]) in here and then clean up everything.
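
The float[] kernel quoted above splits each array into three regions: a 4x unrolled vector loop, a single-vector loop, and a scalar tail. The sketch below reproduces only the index arithmetic in plain Java, without the Vector API. `LoopBoundSketch` and `partition` are hypothetical names, and `loopBound` is a stand-in that assumes the documented contract of `VectorSpecies.loopBound` (the largest multiple of the lane count that is less than or equal to the given length).

```java
final class LoopBoundSketch {
  /** Largest multiple of lanes that is <= length (assumed contract of VectorSpecies.loopBound). */
  static int loopBound(int length, int lanes) {
    return length - Math.floorMod(length, lanes);
  }

  /** Returns {elements in the 4x unrolled loop, elements in the single-vector loop, scalar tail}. */
  static int[] partition(int length, int lanes) {
    int i = 0;
    int unrolled = 0;
    int single = 0;
    // same threshold as the quoted kernel: only vectorize for more than 2 vectors of data
    if (length > 2 * lanes) {
      int upperBound = loopBound(length - 3 * lanes, lanes);
      for (; i < upperBound; i += 4 * lanes) {
        // the real kernel loads 4 vectors from each array here
      }
      unrolled = i;
      upperBound = loopBound(length, lanes);
      for (; i < upperBound; i += lanes) {
        // the real kernel loads 1 vector from each array here
      }
      single = i - unrolled;
    }
    return new int[] {unrolled, single, length - i};
  }
}
```

With 8 lanes, an array of length 100 puts 96 elements in the unrolled loop and 4 in the scalar tail, while length 25 skips the unrolled loop, handles 24 elements in the single-vector loop, and leaves 1 for the scalar tail. Subtracting `3 * lanes` before rounding keeps the last unrolled iteration from reading past the end of the array.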





[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198189573


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtil.java:
##########
@@ -270,4 +216,134 @@ public static float dotProductScore(byte[] a, byte[] b) {
     float denom = (float) (a.length * (1 << 15));
     return 0.5f + dotProduct(a, b) / denom;
   }
+
+  interface VectorUtilProvider {
+
+    // just dot product for now
+    float dotProduct(float[] a, float[] b);
+  }
+
+  private static VectorUtilProvider lookupProvider() {
+    // TODO: add a check
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20) { // TODO: do we want JDK 19?
+      try {
+        ensureReadability();
+        final var lookup = MethodHandles.lookup();
+        final var cls = lookup.findClass("org.apache.lucene.util.JDKVectorUtilProvider");
+        // we use method handles, so we do not need to deal with setAccessible as we have private
+        // access through the lookup:
+        final var constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
+        try {
+          return (VectorUtilProvider) constr.invoke();
+        } catch (RuntimeException | Error e) {
+          throw e;
+        } catch (Throwable th) {
+          throw new AssertionError(th);
+        }
+      } catch (NoSuchMethodException | IllegalAccessException e) {
+        throw new LinkageError("JDKVectorUtilProvider is missing correctly typed constructor", e);
+      } catch (ClassNotFoundException cnfe) {
+        throw new LinkageError("JDKVectorUtilProvider is missing in Lucene JAR file", cnfe);
+      }
+    } else if (runtimeVersion >= 21) {
+      LOG.warning(
+          "You are running with Java 21 or later. To make full use of the Vector API, please update Apache Lucene.");
+    }
+    return new LuceneVectorUtilProvider();
+  }
+
+  // Extracted to a method to be able to apply the SuppressForbidden annotation
+  @SuppressWarnings("removal")
+  @SuppressForbidden(reason = "security manager")
+  private static <T> T doPrivileged(PrivilegedAction<T> action) {
+    return AccessController.doPrivileged(action);
+  }
+
+  static void ensureReadability() {
+    ModuleLayer.boot().modules().stream()
+        .filter(m -> m.getName().equals("jdk.incubator.vector"))
+        .findFirst()
+        .ifPresentOrElse(
+            vecMod -> VectorUtilProvider.class.getModule().addReads(vecMod),
+            () -> LOG.warning("vector incubator module not present"));
+  }
+
+  static {
+    PROVIDER =

Review Comment:
   I’ll refactor to minimise the privileged block.





[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199626630


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java:
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.util.Locale;
+import java.util.Objects;
+import java.util.logging.Logger;
+
+/**
+ * A provider of VectorUtil implementations.
+ *
+ * @lucene.internal
+ */
+interface VectorUtilProvider {
+
+  /** Calculates the dot product of the given float arrays. */
+  float dotProduct(float[] a, float[] b);
+
+  // -- provider lookup mechanism
+
+  Logger LOG = Logger.getLogger(VectorUtilProvider.class.getName());
+
+  static VectorUtilProvider lookup() {
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20 && useVectorAPI() && vectorModulePresentAndReadable()) {
+      try {
+        final var lookup = MethodHandles.lookup();
+        final var cls = lookup.findClass("org.apache.lucene.util.VectorUtilPanamaProvider");
+        // we use method handles, so we do not need to deal with setAccessible as we have private
+        // access through the lookup:
+        final var constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
+        try {
+          return (VectorUtilProvider) constr.invoke();
+        } catch (RuntimeException | Error e) {
+          throw e;
+        } catch (Throwable th) {
+          throw new AssertionError(th);
+        }
+      } catch (NoSuchMethodException | IllegalAccessException e) {
+        throw new LinkageError(
+            "VectorUtilPanamaProvider is missing correctly typed constructor", e);
+      } catch (ClassNotFoundException cnfe) {
+        throw new LinkageError("VectorUtilPanamaProvider is missing in Lucene JAR file", cnfe);
+      }
+    } else if (runtimeVersion >= 21) {
+      LOG.warning(
+          "You are running with Java 21 or later. To make full use of the Vector API, please update Apache Lucene.");
+    }
+    return new VectorUtilDefaultProvider();
+  }
+
+  static boolean vectorModulePresentAndReadable() {
+    var opt =
+        ModuleLayer.boot().modules().stream()
+            .filter(m -> m.getName().equals("jdk.incubator.vector"))
+            .findFirst();
+    if (opt.isPresent()) {
+      VectorUtilProvider.class.getModule().addReads(opt.get());
+      return true;
+    }
+    return false;
+  }
+
+  // Workaround for JDK-8301190: avoids an assertion when the default locale is, say, tr.
+  static boolean useVectorAPI() {
+    return Objects.equals("I", "i".toUpperCase(Locale.getDefault()));

Review Comment:
   Maybe log a warning here, too, so the user can set a different default locale.
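
A minimal sketch of what such a warning could look like, in plain Java. The class name, method name and message text are hypothetical, and the locale is passed as a parameter only to make the behavior easy to try out; the PR code reads `Locale.getDefault()`. In a Turkish locale, `"i".toUpperCase(...)` yields the dotted capital İ (U+0130) rather than `I`, so the check fails there.

```java
import java.util.Locale;
import java.util.logging.Logger;

final class LocaleCheckSketch {
  private static final Logger LOG = Logger.getLogger(LocaleCheckSketch.class.getName());

  /** Returns true when upper-casing "i" in the given locale yields the ASCII "I". */
  static boolean useVectorApi(Locale locale) {
    boolean safe = "I".equals("i".toUpperCase(locale));
    if (!safe) {
      // hypothetical wording; tells the user what to change
      LOG.warning(
          "Vector API disabled: default locale '"
              + locale
              + "' changes the upper-casing of 'i'; consider setting a different default locale.");
    }
    return safe;
  }
}
```

Calling `LocaleCheckSketch.useVectorApi(Locale.getDefault())` once at startup would log the hint a single time and return the same boolean as the existing check.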





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555945898

   I fixed the warnings to give correct instructions if the incubator module is missing or the default locale causes problems.




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1556284488

   Thanks for sanity checking! I'm still working on the repo and making improvements. I would be super curious if you could 'git pull' and try -psize=1024 on your avx512 machine. Hopefully it looks better there now; we were being inefficient.




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1556297561

   skylake:
   ```
   Benchmark                             (size)   Mode  Cnt    Score   Error   Units
   DotProductBenchmark.dotProductNew          1  thrpt    5  153.702 ± 2.576  ops/us
   DotProductBenchmark.dotProductNew          4  thrpt    5   95.861 ± 1.403  ops/us
   DotProductBenchmark.dotProductNew          6  thrpt    5   93.582 ± 1.640  ops/us
   DotProductBenchmark.dotProductNew          8  thrpt    5   81.923 ± 1.045  ops/us
   DotProductBenchmark.dotProductNew         13  thrpt    5   66.178 ± 0.789  ops/us
   DotProductBenchmark.dotProductNew         16  thrpt    5   62.173 ± 1.191  ops/us
   DotProductBenchmark.dotProductNew         25  thrpt    5   40.726 ± 0.455  ops/us
   DotProductBenchmark.dotProductNew         32  thrpt    5   59.063 ± 6.797  ops/us
   DotProductBenchmark.dotProductNew         64  thrpt    5   51.108 ± 1.368  ops/us
   DotProductBenchmark.dotProductNew        100  thrpt    5   35.460 ± 0.310  ops/us
   DotProductBenchmark.dotProductNew        128  thrpt    5   39.522 ± 0.356  ops/us
   DotProductBenchmark.dotProductNew        207  thrpt    5   21.369 ± 0.182  ops/us
   DotProductBenchmark.dotProductNew        256  thrpt    5   26.010 ± 0.112  ops/us
   DotProductBenchmark.dotProductNew        300  thrpt    5   19.118 ± 0.389  ops/us
   DotProductBenchmark.dotProductNew        512  thrpt    5   17.368 ± 0.755  ops/us
   DotProductBenchmark.dotProductNew        702  thrpt    5   11.338 ± 0.143  ops/us
   DotProductBenchmark.dotProductNew       1024  thrpt    5   10.073 ± 0.113  ops/us
   DotProductBenchmark.dotProductNewNew       1  thrpt    5  152.223 ± 0.854  ops/us
   DotProductBenchmark.dotProductNewNew       4  thrpt    5  114.786 ± 1.555  ops/us
   DotProductBenchmark.dotProductNewNew       6  thrpt    5   91.451 ± 0.874  ops/us
   DotProductBenchmark.dotProductNewNew       8  thrpt    5   81.767 ± 0.345  ops/us
   DotProductBenchmark.dotProductNewNew      13  thrpt    5   67.915 ± 0.889  ops/us
   DotProductBenchmark.dotProductNewNew      16  thrpt    5   64.509 ± 1.064  ops/us
   DotProductBenchmark.dotProductNewNew      25  thrpt    5   53.764 ± 1.037  ops/us
   DotProductBenchmark.dotProductNewNew      32  thrpt    5   62.759 ± 0.942  ops/us
   DotProductBenchmark.dotProductNewNew      64  thrpt    5   55.151 ± 0.396  ops/us
   DotProductBenchmark.dotProductNewNew     100  thrpt    5   37.558 ± 0.996  ops/us
   DotProductBenchmark.dotProductNewNew     128  thrpt    5   46.005 ± 0.733  ops/us
   DotProductBenchmark.dotProductNewNew     207  thrpt    5   26.135 ± 0.780  ops/us
   DotProductBenchmark.dotProductNewNew     256  thrpt    5   30.208 ± 0.115  ops/us
   DotProductBenchmark.dotProductNewNew     300  thrpt    5   22.830 ± 1.903  ops/us
   DotProductBenchmark.dotProductNewNew     512  thrpt    5   17.916 ± 0.216  ops/us
   DotProductBenchmark.dotProductNewNew     702  thrpt    5   12.854 ± 1.727  ops/us
   DotProductBenchmark.dotProductNewNew    1024  thrpt    5   11.620 ± 0.291  ops/us
   DotProductBenchmark.dotProductOld          1  thrpt    5  162.477 ± 3.116  ops/us
   DotProductBenchmark.dotProductOld          4  thrpt    5  120.188 ± 2.748  ops/us
   DotProductBenchmark.dotProductOld          6  thrpt    5  120.427 ± 1.619  ops/us
   DotProductBenchmark.dotProductOld          8  thrpt    5   98.704 ± 2.279  ops/us
   DotProductBenchmark.dotProductOld         13  thrpt    5   76.331 ± 1.940  ops/us
   DotProductBenchmark.dotProductOld         16  thrpt    5   67.417 ± 1.456  ops/us
   DotProductBenchmark.dotProductOld         25  thrpt    5   47.443 ± 0.513  ops/us
   DotProductBenchmark.dotProductOld         32  thrpt    5   43.270 ± 4.112  ops/us
   DotProductBenchmark.dotProductOld         64  thrpt    5   26.506 ± 0.826  ops/us
   DotProductBenchmark.dotProductOld        100  thrpt    5   16.793 ± 0.163  ops/us
   DotProductBenchmark.dotProductOld        128  thrpt    5   14.332 ± 0.207  ops/us
   DotProductBenchmark.dotProductOld        207  thrpt    5    9.032 ± 0.155  ops/us
   DotProductBenchmark.dotProductOld        256  thrpt    5    7.853 ± 0.115  ops/us
   DotProductBenchmark.dotProductOld        300  thrpt    5    6.331 ± 0.025  ops/us
   DotProductBenchmark.dotProductOld        512  thrpt    5    4.027 ± 0.023  ops/us
   DotProductBenchmark.dotProductOld        702  thrpt    5    2.762 ± 0.041  ops/us
   DotProductBenchmark.dotProductOld       1024  thrpt    5    2.003 ± 0.020  ops/us
   ```
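
To read the table as relative speedups, divide each score by the score of the same size in another row group. The helper below is a hypothetical sketch, not part of the benchmark; treating `dotProductOld` as the baseline is an assumption based on its name. The sample values are copied from the table above.

```java
final class SpeedupSketch {
  /** Ratio of two throughput scores in ops/us; values above 1.0 mean the candidate is faster. */
  static double speedup(double candidateOpsPerUs, double baselineOpsPerUs) {
    return candidateOpsPerUs / baselineOpsPerUs;
  }

  public static void main(String[] args) {
    // size 1024: dotProductNewNew vs dotProductOld
    System.out.printf("size 1024: %.2fx%n", speedup(11.620, 2.003));
    // size 1: dotProductNewNew vs dotProductOld
    System.out.printf("size 1:    %.2fx%n", speedup(152.223, 162.477));
  }
}
```

At size 1024 the ratio is roughly 5.8x, while at size 1 it is slightly below 1, which matches the table: the vectorized variants only pull ahead once the arrays are large enough to enter the vector loops.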




[GitHub] [lucene] rmuir commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201914722


##########
lucene/core/src/java20/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##########
@@ -0,0 +1,455 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import jdk.incubator.vector.ByteVector;
+import jdk.incubator.vector.FloatVector;
+import jdk.incubator.vector.IntVector;
+import jdk.incubator.vector.ShortVector;
+import jdk.incubator.vector.Vector;
+import jdk.incubator.vector.VectorOperators;
+import jdk.incubator.vector.VectorShape;
+import jdk.incubator.vector.VectorSpecies;
+
+/** A VectorUtil provider implementation that leverages the Panama Vector API. */
+final class VectorUtilPanamaProvider implements VectorUtilProvider {
+
+  static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;
+
+  VectorUtilPanamaProvider() {}
+
+  @Override
+  public float dotProduct(float[] a, float[] b) {
+    int i = 0;
+    float res = 0;
+    // if the array size is large (> 2x platform vector size), it's worth the overhead to vectorize
+    if (a.length > 2 * SPECIES.length()) {
+      // vector loop is unrolled 4x (4 accumulators in parallel)
+      FloatVector acc1 = FloatVector.zero(SPECIES);
+      FloatVector acc2 = FloatVector.zero(SPECIES);
+      FloatVector acc3 = FloatVector.zero(SPECIES);
+      FloatVector acc4 = FloatVector.zero(SPECIES);
+      int upperBound = SPECIES.loopBound(a.length - 3 * SPECIES.length());
+      for (; i < upperBound; i += 4 * SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        acc1 = acc1.add(va.mul(vb));
+        FloatVector vc = FloatVector.fromArray(SPECIES, a, i + SPECIES.length());
+        FloatVector vd = FloatVector.fromArray(SPECIES, b, i + SPECIES.length());
+        acc2 = acc2.add(vc.mul(vd));
+        FloatVector ve = FloatVector.fromArray(SPECIES, a, i + 2 * SPECIES.length());
+        FloatVector vf = FloatVector.fromArray(SPECIES, b, i + 2 * SPECIES.length());
+        acc3 = acc3.add(ve.mul(vf));
+        FloatVector vg = FloatVector.fromArray(SPECIES, a, i + 3 * SPECIES.length());
+        FloatVector vh = FloatVector.fromArray(SPECIES, b, i + 3 * SPECIES.length());
+        acc4 = acc4.add(vg.mul(vh));
+      }
+      // vector tail: less scalar computations for unaligned sizes, esp with big vector sizes
+      upperBound = SPECIES.loopBound(a.length);
+      for (; i < upperBound; i += SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        acc1 = acc1.add(va.mul(vb));
+      }
+      // reduce
+      FloatVector res1 = acc1.add(acc2);
+      FloatVector res2 = acc3.add(acc4);
+      res += res1.add(res2).reduceLanes(VectorOperators.ADD);
+    }
+
+    for (; i < a.length; i++) {
+      res += b[i] * a[i];
+    }
+    return res;
+  }
+
+  @Override
+  public float cosine(float[] a, float[] b) {
+    int i = 0;
+    float sum = 0;
+    float norm1 = 0;
+    float norm2 = 0;
+    // if the array size is large (> 2x platform vector size), it's worth the overhead to vectorize
+    if (a.length > 2 * SPECIES.length()) {
+      // vector loop is unrolled 4x (4 accumulators in parallel)
+      FloatVector sum1 = FloatVector.zero(SPECIES);
+      FloatVector sum2 = FloatVector.zero(SPECIES);
+      FloatVector sum3 = FloatVector.zero(SPECIES);
+      FloatVector sum4 = FloatVector.zero(SPECIES);
+      FloatVector norm1_1 = FloatVector.zero(SPECIES);
+      FloatVector norm1_2 = FloatVector.zero(SPECIES);
+      FloatVector norm1_3 = FloatVector.zero(SPECIES);
+      FloatVector norm1_4 = FloatVector.zero(SPECIES);
+      FloatVector norm2_1 = FloatVector.zero(SPECIES);
+      FloatVector norm2_2 = FloatVector.zero(SPECIES);
+      FloatVector norm2_3 = FloatVector.zero(SPECIES);
+      FloatVector norm2_4 = FloatVector.zero(SPECIES);
+      int upperBound = SPECIES.loopBound(a.length - 3 * SPECIES.length());
+      for (; i < upperBound; i += 4 * SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        sum1 = sum1.add(va.mul(vb));
+        norm1_1 = norm1_1.add(va.mul(va));
+        norm2_1 = norm2_1.add(vb.mul(vb));
+        FloatVector vc = FloatVector.fromArray(SPECIES, a, i + SPECIES.length());
+        FloatVector vd = FloatVector.fromArray(SPECIES, b, i + SPECIES.length());
+        sum2 = sum2.add(vc.mul(vd));
+        norm1_2 = norm1_2.add(vc.mul(vc));
+        norm2_2 = norm2_2.add(vd.mul(vd));
+        FloatVector ve = FloatVector.fromArray(SPECIES, a, i + 2 * SPECIES.length());
+        FloatVector vf = FloatVector.fromArray(SPECIES, b, i + 2 * SPECIES.length());
+        sum3 = sum3.add(ve.mul(vf));
+        norm1_3 = norm1_3.add(ve.mul(ve));
+        norm2_3 = norm2_3.add(vf.mul(vf));
+        FloatVector vg = FloatVector.fromArray(SPECIES, a, i + 3 * SPECIES.length());
+        FloatVector vh = FloatVector.fromArray(SPECIES, b, i + 3 * SPECIES.length());
+        sum4 = sum4.add(vg.mul(vh));
+        norm1_4 = norm1_4.add(vg.mul(vg));
+        norm2_4 = norm2_4.add(vh.mul(vh));
+      }
+      // vector tail: less scalar computations for unaligned sizes, esp with big vector sizes
+      upperBound = SPECIES.loopBound(a.length);
+      for (; i < upperBound; i += SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        sum1 = sum1.add(va.mul(vb));
+        norm1_1 = norm1_1.add(va.mul(va));
+        norm2_1 = norm2_1.add(vb.mul(vb));
+      }
+      // reduce
+      FloatVector sumres1 = sum1.add(sum2);
+      FloatVector sumres2 = sum3.add(sum4);
+      FloatVector norm1res1 = norm1_1.add(norm1_2);
+      FloatVector norm1res2 = norm1_3.add(norm1_4);
+      FloatVector norm2res1 = norm2_1.add(norm2_2);
+      FloatVector norm2res2 = norm2_3.add(norm2_4);
+      sum += sumres1.add(sumres2).reduceLanes(VectorOperators.ADD);
+      norm1 += norm1res1.add(norm1res2).reduceLanes(VectorOperators.ADD);
+      norm2 += norm2res1.add(norm2res2).reduceLanes(VectorOperators.ADD);
+    }
+
+    for (; i < a.length; i++) {
+      float elem1 = a[i];
+      float elem2 = b[i];
+      sum += elem1 * elem2;
+      norm1 += elem1 * elem1;
+      norm2 += elem2 * elem2;
+    }
+    return (float) (sum / Math.sqrt(norm1 * norm2));
+  }
+
+  @Override
+  public float squareDistance(float[] a, float[] b) {
+    int i = 0;
+    float res = 0;
+    // if the array size is large (> 2x platform vector size), it's worth the overhead to vectorize
+    if (a.length > 2 * SPECIES.length()) {
+      // vector loop is unrolled 4x (4 accumulators in parallel)
+      FloatVector acc1 = FloatVector.zero(SPECIES);
+      FloatVector acc2 = FloatVector.zero(SPECIES);
+      FloatVector acc3 = FloatVector.zero(SPECIES);
+      FloatVector acc4 = FloatVector.zero(SPECIES);
+      int upperBound = SPECIES.loopBound(a.length - 3 * SPECIES.length());
+      for (; i < upperBound; i += 4 * SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        FloatVector diff1 = va.sub(vb);
+        acc1 = acc1.add(diff1.mul(diff1));
+        FloatVector vc = FloatVector.fromArray(SPECIES, a, i + SPECIES.length());
+        FloatVector vd = FloatVector.fromArray(SPECIES, b, i + SPECIES.length());
+        FloatVector diff2 = vc.sub(vd);
+        acc2 = acc2.add(diff2.mul(diff2));
+        FloatVector ve = FloatVector.fromArray(SPECIES, a, i + 2 * SPECIES.length());
+        FloatVector vf = FloatVector.fromArray(SPECIES, b, i + 2 * SPECIES.length());
+        FloatVector diff3 = ve.sub(vf);
+        acc3 = acc3.add(diff3.mul(diff3));
+        FloatVector vg = FloatVector.fromArray(SPECIES, a, i + 3 * SPECIES.length());
+        FloatVector vh = FloatVector.fromArray(SPECIES, b, i + 3 * SPECIES.length());
+        FloatVector diff4 = vg.sub(vh);
+        acc4 = acc4.add(diff4.mul(diff4));
+      }
+      // vector tail: less scalar computations for unaligned sizes, esp with big vector sizes
+      upperBound = SPECIES.loopBound(a.length);
+      for (; i < upperBound; i += SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        FloatVector diff = va.sub(vb);
+        acc1 = acc1.add(diff.mul(diff));
+      }
+      // reduce
+      FloatVector res1 = acc1.add(acc2);
+      FloatVector res2 = acc3.add(acc4);
+      res += res1.add(res2).reduceLanes(VectorOperators.ADD);
+    }
+
+    for (; i < a.length; i++) {
+      float diff = a[i] - b[i];
+      res += diff * diff;
+    }
+    return res;
+  }
+
+  // Binary functions, these all follow a general pattern like this:
+  //
+  //   short intermediate = a * b;
+  //   int accumulator = accumulator + intermediate;
+  //
+  // 256 or 512 bit vectors can process 64 or 128 bits at a time, respectively
+  // intermediate results use 128 or 256 bit vectors, respectively
+  // final accumulator uses 256 or 512 bit vectors, respectively
+  //
+  // We also support 128 bit vectors, using two 128 bit accumulators.
+  // This is slower but still faster than not vectorizing at all.
+
+  static final VectorSpecies<Byte> PREFERRED_BYTE_SPECIES;
+  static final VectorSpecies<Short> PREFERRED_SHORT_SPECIES;
+
+  static {
+    if (IntVector.SPECIES_PREFERRED.vectorBitSize() >= 256) {
+      PREFERRED_BYTE_SPECIES =
+          ByteVector.SPECIES_MAX.withShape(
+              VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 2));
+      PREFERRED_SHORT_SPECIES =
+          ShortVector.SPECIES_MAX.withShape(
+              VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 1));
+    } else {
+      PREFERRED_BYTE_SPECIES = null;
+      PREFERRED_SHORT_SPECIES = null;
+    }
+  }
+
+  static final int INT_SPECIES_PREFERRED_BIT_SIZE = IntVector.SPECIES_PREFERRED.vectorBitSize();
+
+  @Override
+  public int dotProduct(byte[] a, byte[] b) {
+    int i = 0;
+    int res = 0;
+    // only vectorize if we'll at least enter the loop a single time, and we have at least 128-bit
+    // vectors
+    if (a.length >= 16 && INT_SPECIES_PREFERRED_BIT_SIZE >= 128) {
+      // compute vectorized dot product consistent with VPDPBUSD instruction, acts like:
+      // int sum = 0;
+      // for (...) {
+      //   short product = (short) (x[i] * y[i]);
+      //   sum += product;
+      // }
+      if (INT_SPECIES_PREFERRED_BIT_SIZE >= 256) {
+        // optimized 256/512 bit implementation, processes 8/16 bytes at a time
+        int upperBound = PREFERRED_BYTE_SPECIES.loopBound(a.length);
+        IntVector acc = IntVector.zero(IntVector.SPECIES_PREFERRED);
+        for (; i < upperBound; i += PREFERRED_BYTE_SPECIES.length()) {
+          ByteVector va8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, a, i);
+          ByteVector vb8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, b, i);
+          Vector<Short> va16 = va8.convertShape(VectorOperators.B2S, PREFERRED_SHORT_SPECIES, 0);
+          Vector<Short> vb16 = vb8.convertShape(VectorOperators.B2S, PREFERRED_SHORT_SPECIES, 0);
+          Vector<Short> prod16 = va16.mul(vb16);
+          Vector<Integer> prod32 =
+              prod16.convertShape(VectorOperators.S2I, IntVector.SPECIES_PREFERRED, 0);
+          acc = acc.add(prod32);
+        }
+        // reduce
+        res += acc.reduceLanes(VectorOperators.ADD);
+      } else {
+        // 128-bit implementation, which must "split up" vectors due to widening conversions
+        int upperBound = ByteVector.SPECIES_64.loopBound(a.length);
+        IntVector acc1 = IntVector.zero(IntVector.SPECIES_128);
+        IntVector acc2 = IntVector.zero(IntVector.SPECIES_128);
+        for (; i < upperBound; i += ByteVector.SPECIES_64.length()) {
+          ByteVector va8 = ByteVector.fromArray(ByteVector.SPECIES_64, a, i);
+          ByteVector vb8 = ByteVector.fromArray(ByteVector.SPECIES_64, b, i);
+          // expand each byte vector into short vector and multiply
+          Vector<Short> va16 = va8.convertShape(VectorOperators.B2S, ShortVector.SPECIES_128, 0);
+          Vector<Short> vb16 = vb8.convertShape(VectorOperators.B2S, ShortVector.SPECIES_128, 0);
+          Vector<Short> prod16 = va16.mul(vb16);
+          // split each short vector into two int vectors and add
+          Vector<Integer> prod32_1 =
+              prod16.convertShape(VectorOperators.S2I, IntVector.SPECIES_128, 0);
+          Vector<Integer> prod32_2 =
+              prod16.convertShape(VectorOperators.S2I, IntVector.SPECIES_128, 1);
+          acc1 = acc1.add(prod32_1);
+          acc2 = acc2.add(prod32_2);
+        }
+        // reduce
+        res += acc1.add(acc2).reduceLanes(VectorOperators.ADD);
+      }
+    }
+
+    for (; i < a.length; i++) {
+      res += b[i] * a[i];
+    }
+    return res;
+  }
+
+  @Override
+  public float cosine(byte[] a, byte[] b) {
+    int i = 0;
+    int sum = 0;
+    int norm1 = 0;
+    int norm2 = 0;
+    // only vectorize if we'll at least enter the loop a single time, and we have at least 128-bit
+    // vectors
+    if (a.length >= 16 && INT_SPECIES_PREFERRED_BIT_SIZE >= 128) {
+      // acts like:
+      // int sum = 0;
+      // for (...) {
+      //   short difference = (short) (x[i] - y[i]);
+      //   sum += (int) difference * (int) difference;
+      // }

Review Comment:
   Let's nuke. Some of this was "thinking out loud"





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1558966617

   I will now write a test that compares the results of both providers, but only if the Panama provider is in use.




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559831627

   Sorry for changing this again. I was not happy with the bit-size code in the lookup function, so I used a very simple approach: the constructor of the provider throws UnsupportedOperationException with a message. The provider lookup catches this exception and falls back to the default provider. This looks cleanest to me.
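
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `SimilarityProvider`, `ScalarProvider`, `AcceleratedProvider`, and `ProviderLookup` are hypothetical names, and the 128-bit threshold is an assumption chosen for the example.

```java
// Hypothetical provider interface; the real one in the PR may differ.
interface SimilarityProvider {
  float dotProduct(float[] a, float[] b);
}

// Default scalar implementation that always works.
final class ScalarProvider implements SimilarityProvider {
  @Override
  public float dotProduct(float[] a, float[] b) {
    float res = 0f;
    for (int i = 0; i < a.length; i++) {
      res += a[i] * b[i];
    }
    return res;
  }
}

// Stand-in for the accelerated provider: the constructor rejects unsuitable hardware.
final class AcceleratedProvider implements SimilarityProvider {
  AcceleratedProvider(int vectorBitSize) {
    if (vectorBitSize < 128) { // assumed threshold, for illustration only
      throw new UnsupportedOperationException(
          "Vector bit size is less than 128: " + vectorBitSize);
    }
  }

  @Override
  public float dotProduct(float[] a, float[] b) {
    // A real implementation would use vector instructions; this stand-in delegates.
    return new ScalarProvider().dotProduct(a, b);
  }
}

final class ProviderLookup {
  // Try the accelerated provider first; fall back to the scalar one if it refuses.
  static SimilarityProvider lookup(int vectorBitSize) {
    try {
      return new AcceleratedProvider(vectorBitSize);
    } catch (UnsupportedOperationException e) {
      System.err.println("Falling back to default provider: " + e.getMessage());
      return new ScalarProvider();
    }
  }

  public static void main(String[] args) {
    System.out.println(lookup(64).getClass().getSimpleName());
    System.out.println(lookup(256).getClass().getSimpleName());
  }
}
```

Keeping the hardware check inside the constructor means the lookup code needs no knowledge of why a provider is unsuitable; it only has to handle one exception type.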




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559863229

   > > should we zero-pad before computing the dot-products? It wouldn't affect the result and sounds like it would be faster
   > 
   > I don't think so. the code wouldn't even know what size to pad it to. for float functions the logic currently happens to be "4 x preferred vector size", which would be a multiple of 16 on my m1 mac, multiple of 32 on my dual-core laptop, multiple of 64 on @ChrisHegarty's rocketlake. and on a machine not supporting panama vectorization, padding would just make things slower.
   
   I would not do that in Lucene code. My idea would be to just round up to the next power of 2 (like chunksize in MmapDirectory) in the implementation. The problem is that the original array is too small, and we can't easily extend it without reallocating.
   
   So I agree: don't enforce it on a higher level. I'd just change luceneutil to use a vector size other than 100. 😉
   
   > I think instead these data scientists should have some sympathy for the hardware and use power of two sizes.
   
   👍
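
To make the size arithmetic in this exchange concrete, here is a small sketch of how a float array of a given length would split between an unrolled loop, a single-vector tail loop, and a scalar tail. It assumes the unrolled loop covers the largest multiple of "4 x lanes" elements, as the quoted comment suggests; the class and method names are hypothetical and the exact loop bounds in the PR may differ.

```java
import java.util.Arrays;

final class VectorLoopSplit {
  // Returns {unrolledCount, vectorTailCount, scalarTailCount} for a float array.
  static int[] split(int length, int vectorBits) {
    int lanes = vectorBits / Float.SIZE;          // floats per vector
    int unrolledStride = 4 * lanes;               // assumed: 4 vectors per unrolled iteration
    int unrolledEnd = length - (length % unrolledStride);
    int vectorEnd = length - (length % lanes);    // end of the single-vector tail loop
    return new int[] {unrolledEnd, vectorEnd - unrolledEnd, length - vectorEnd};
  }

  // Smallest power of two that is >= n (for n >= 1).
  static int nextPowerOfTwo(int n) {
    return n <= 1 ? 1 : Integer.highestOneBit(n - 1) << 1;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(split(100, 128))); // [96, 4, 0]
    System.out.println(Arrays.toString(split(100, 256))); // [96, 0, 4]
    System.out.println(Arrays.toString(split(100, 512))); // [64, 32, 4]
    System.out.println(nextPowerOfTwo(100));              // 128
  }
}
```

Under these assumptions, a 100-dimensional vector leaves a different remainder on each hardware width, which is why no single padding size fits all machines, while a power-of-two size such as 128 leaves no remainder on any of the three.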




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560217999

   And my concern is not "oh but nehalem or others without AVX are ancient ancient hardware". It is that if you run qemu with all its defaults and don't specify the `-cpu` flag, you get a VM without AVX.
   
   Such VMs without AVX passed through are very common in virtual computing environments for reasons like this. We shouldn't have incredibly trappy performance.
   
   ```
   processor	: 3
   vendor_id	: GenuineIntel
   cpu family	: 15
   model		: 107
   model name	: QEMU Virtual CPU version 2.5+
   stepping	: 1
   microcode	: 0x1
   cpu MHz		: 2496.002
   cache size	: 16384 KB
   physical id	: 0
   siblings	: 4
   core id		: 1
   cpu cores	: 2
   apicid		: 3
   initial apicid	: 3
   fpu		: yes
   fpu_exception	: yes
   cpuid level	: 13
   wp		: yes
   flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc nopl xtopology cpuid tsc_known_freq pni cx16 x2apic hypervisor lahf_lm cpuid_fault
   bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
   bogomips	: 4994.00
   clflush size	: 64
   cache_alignment	: 128
   address sizes	: 40 bits physical, 48 bits virtual
   power management:
   ```




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560213214

   As a potential workaround I tried Uwe's suggestion to look at RAMUsageEstimator code and ran it inside this VM, it works and seems to detect the situation?
   ```
   jshell> final Object vmOption = getVMOptionMethod.invoke(hotSpotBean, "UseAVX");
   vmOption ==> VM option: UseAVX value: 0  origin: DEFAULT (read-only)
   ```
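
For readers who want to reproduce this check outside jshell, a minimal sketch follows. The snippet above goes through reflection (as RAMUsageEstimator does); this version instead calls `com.sun.management.HotSpotDiagnosticMXBean` directly, which assumes a HotSpot-based JVM with the `jdk.management` module available. `VmOptionProbe` is a hypothetical name, and on non-x86 platforms the `UseAVX` option may simply not exist.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.Optional;

final class VmOptionProbe {
  // Returns the value of a HotSpot VM option, or empty if it cannot be read.
  static Optional<String> vmOption(String name) {
    try {
      HotSpotDiagnosticMXBean bean =
          ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
      if (bean == null) {
        return Optional.empty();
      }
      // getVMOption throws IllegalArgumentException if the option does not exist.
      return Optional.of(bean.getVMOption(name).getValue());
    } catch (RuntimeException | LinkageError e) {
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    // On the QEMU guest described above this would be expected to report 0.
    System.out.println("UseAVX = " + vmOption("UseAVX").orElse("<unavailable>"));
  }
}
```

Returning an empty `Optional` rather than throwing lets a caller treat "option unknown" the same way as "not a HotSpot JVM" and fall back to the scalar code path.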




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Include the Panama Vector API stubs in the generated 19/20 api jars

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198138249


##########
lucene/core/src/java20/org/apache/lucene/util/JDKVectorUtilProvider.java:
##########
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import jdk.incubator.vector.FloatVector;
+import jdk.incubator.vector.VectorOperators;
+import jdk.incubator.vector.VectorSpecies;
+
+public final class JDKVectorUtilProvider implements VectorUtil.VectorUtilProvider {

Review Comment:
   Don't make it public. We use MethodHandles to allow package private access.
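
A rough sketch of the pattern this comment refers to: a package-private implementation class is looked up by name and instantiated through `MethodHandles`, so it never has to be public. `Greeter`, `HiddenGreeter`, and `GreeterLoader` are hypothetical stand-ins, not classes from the PR, and the actual lookup code in Lucene may be organized differently.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

interface Greeter {
  String greet();
}

// Package-private class with a package-private constructor.
final class HiddenGreeter implements Greeter {
  HiddenGreeter() {}

  @Override
  public String greet() {
    return "hello";
  }
}

final class GreeterLoader {
  static Greeter load() {
    try {
      // The lookup object carries the access rights of GreeterLoader, which
      // include package-private members of classes in the same package.
      MethodHandles.Lookup lookup = MethodHandles.lookup();
      String pkg = GreeterLoader.class.getPackageName();
      String name = pkg.isEmpty() ? "HiddenGreeter" : pkg + ".HiddenGreeter";
      Class<?> cls = lookup.findClass(name);
      MethodHandle ctor = lookup.findConstructor(cls, MethodType.methodType(void.class));
      return (Greeter) ctor.invoke();
    } catch (Throwable t) {
      throw new IllegalStateException("Could not load implementation", t);
    }
  }

  public static void main(String[] args) {
    System.out.println(load().greet());
  }
}
```

Looking the class up by name, rather than referencing it statically, is what allows the implementation to be compiled separately (for example against a newer JDK) while the calling code still compiles without it.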





[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1556288827

   yes, i think we have to imagine it as a scalar operation that gets slower as vector size increases. i looked into it and read this answer and changed the code: https://stackoverflow.com/questions/6996764/fastest-way-to-do-horizontal-sse-vector-sum-or-other-reduction/35270026#35270026




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1557161657

   yeah dunno, i have to fix my hsdis (probably wrestle the openjdk makefile and recompile it) to really see what is happening. such an annoyance!
   
   for now since it gives 4x speedup on intel and 2x speedup on arm i will just move along. there are 4 more functions to vectorize in this file due to all the VectorSimilarity choices...




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561648820

   Hi,
   I also checked the whole benchmark on Policeman Jenkins with 1 thread (default), results are in line with the rest:
   
   ```
   vendor_id       : AuthenticAMD
   cpu family      : 23
   model           : 113
   model name      : AMD Ryzen 7 3700X 8-Core Processor
   stepping        : 0
   microcode       : 0x8701013
   cpu MHz         : 3592.636
   cache size      : 512 KB
   physical id     : 0
   siblings        : 16
   core id         : 7
   cpu cores       : 8
   apicid          : 15
   initial apicid  : 15
   fpu             : yes
   fpu_exception   : yes
   cpuid level     : 16
   wp              : yes
   flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca sev sev_es
   bugs            : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass retbleed smt_rsb
   bogomips        : 7186.96
   TLB size        : 3072 4K pages
   clflush size    : 64
   cache_alignment : 64
   address sizes   : 43 bits physical, 48 bits virtual
   power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]
   
   Benchmark                                (size)   Mode  Cnt    Score   Error   Units
   BinaryCosineBenchmark.cosineDistanceNew       1  thrpt    5  142.186 ± 0.240  ops/us
   BinaryCosineBenchmark.cosineDistanceNew     128  thrpt    5   29.025 ± 0.364  ops/us
   BinaryCosineBenchmark.cosineDistanceNew     207  thrpt    5   19.972 ± 0.124  ops/us
   BinaryCosineBenchmark.cosineDistanceNew     256  thrpt    5   18.873 ± 0.113  ops/us
   BinaryCosineBenchmark.cosineDistanceNew     300  thrpt    5   16.772 ± 0.188  ops/us
   BinaryCosineBenchmark.cosineDistanceNew     512  thrpt    5   11.109 ± 0.566  ops/us
   BinaryCosineBenchmark.cosineDistanceNew     702  thrpt    5    7.065 ± 0.253  ops/us
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5    5.402 ± 0.013  ops/us
   BinaryCosineBenchmark.cosineDistanceOld       1  thrpt    5  139.518 ± 0.139  ops/us
   BinaryCosineBenchmark.cosineDistanceOld     128  thrpt    5    5.772 ± 0.841  ops/us
   BinaryCosineBenchmark.cosineDistanceOld     207  thrpt    5    4.008 ± 0.005  ops/us
   BinaryCosineBenchmark.cosineDistanceOld     256  thrpt    5    3.118 ± 0.026  ops/us
   BinaryCosineBenchmark.cosineDistanceOld     300  thrpt    5    2.677 ± 0.081  ops/us
   BinaryCosineBenchmark.cosineDistanceOld     512  thrpt    5    1.547 ± 0.028  ops/us
   BinaryCosineBenchmark.cosineDistanceOld     702  thrpt    5    1.075 ± 0.023  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5    0.731 ± 0.061  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5   13.747 ± 0.030  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5    1.972 ± 0.030  ops/us
   BinarySquareBenchmark.squareDistanceNew       1  thrpt    5  201.408 ± 0.818  ops/us
   BinarySquareBenchmark.squareDistanceNew     128  thrpt    5   62.571 ± 0.482  ops/us
   BinarySquareBenchmark.squareDistanceNew     207  thrpt    5   36.904 ± 0.053  ops/us
   BinarySquareBenchmark.squareDistanceNew     256  thrpt    5   37.545 ± 0.059  ops/us
   BinarySquareBenchmark.squareDistanceNew     300  thrpt    5   32.564 ± 0.522  ops/us
   BinarySquareBenchmark.squareDistanceNew     512  thrpt    5   22.078 ± 0.365  ops/us
   BinarySquareBenchmark.squareDistanceNew     702  thrpt    5   15.582 ± 0.208  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5   11.591 ± 0.021  ops/us
   BinarySquareBenchmark.squareDistanceOld       1  thrpt    5  202.172 ± 0.352  ops/us
   BinarySquareBenchmark.squareDistanceOld     128  thrpt    5   12.769 ± 0.040  ops/us
   BinarySquareBenchmark.squareDistanceOld     207  thrpt    5    8.054 ± 0.023  ops/us
   BinarySquareBenchmark.squareDistanceOld     256  thrpt    5    6.628 ± 0.009  ops/us
   BinarySquareBenchmark.squareDistanceOld     300  thrpt    5    5.627 ± 0.058  ops/us
   BinarySquareBenchmark.squareDistanceOld     512  thrpt    5    3.352 ± 0.019  ops/us
   BinarySquareBenchmark.squareDistanceOld     702  thrpt    5    2.462 ± 0.013  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5    1.725 ± 0.009  ops/us
   FloatCosineBenchmark.cosineNew                1  thrpt    5  159.087 ± 1.009  ops/us
   FloatCosineBenchmark.cosineNew                4  thrpt    5  124.505 ± 1.317  ops/us
   FloatCosineBenchmark.cosineNew                6  thrpt    5  103.466 ± 0.186  ops/us
   FloatCosineBenchmark.cosineNew                8  thrpt    5   83.348 ± 0.438  ops/us
   FloatCosineBenchmark.cosineNew               13  thrpt    5   67.049 ± 1.102  ops/us
   FloatCosineBenchmark.cosineNew               16  thrpt    5   59.020 ± 0.061  ops/us
   FloatCosineBenchmark.cosineNew               25  thrpt    5   52.781 ± 0.065  ops/us
   FloatCosineBenchmark.cosineNew               32  thrpt    5   58.864 ± 1.126  ops/us
   FloatCosineBenchmark.cosineNew               64  thrpt    5   51.921 ± 0.573  ops/us
   FloatCosineBenchmark.cosineNew              100  thrpt    5   37.997 ± 0.137  ops/us
   FloatCosineBenchmark.cosineNew              128  thrpt    5   38.488 ± 0.305  ops/us
   FloatCosineBenchmark.cosineNew              207  thrpt    5   24.386 ± 0.418  ops/us
   FloatCosineBenchmark.cosineNew              256  thrpt    5   26.842 ± 0.145  ops/us
   FloatCosineBenchmark.cosineNew              300  thrpt    5   22.089 ± 0.236  ops/us
   FloatCosineBenchmark.cosineNew              512  thrpt    5   18.735 ± 0.145  ops/us
   FloatCosineBenchmark.cosineNew              702  thrpt    5   12.019 ± 0.062  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5   11.958 ± 0.057  ops/us
   FloatCosineBenchmark.cosineOld                1  thrpt    5  161.689 ± 0.378  ops/us
   FloatCosineBenchmark.cosineOld                4  thrpt    5  120.518 ± 1.275  ops/us
   FloatCosineBenchmark.cosineOld                6  thrpt    5  104.473 ± 0.252  ops/us
   FloatCosineBenchmark.cosineOld                8  thrpt    5   84.385 ± 0.268  ops/us
   FloatCosineBenchmark.cosineOld               13  thrpt    5   66.922 ± 0.691  ops/us
   FloatCosineBenchmark.cosineOld               16  thrpt    5   61.007 ± 0.069  ops/us
   FloatCosineBenchmark.cosineOld               25  thrpt    5   44.361 ± 0.345  ops/us
   FloatCosineBenchmark.cosineOld               32  thrpt    5   35.632 ± 0.614  ops/us
   FloatCosineBenchmark.cosineOld               64  thrpt    5   20.043 ± 0.099  ops/us
   FloatCosineBenchmark.cosineOld              100  thrpt    5   12.676 ± 0.042  ops/us
   FloatCosineBenchmark.cosineOld              128  thrpt    5   10.094 ± 0.023  ops/us
   FloatCosineBenchmark.cosineOld              207  thrpt    5    6.527 ± 0.006  ops/us
   FloatCosineBenchmark.cosineOld              256  thrpt    5    5.338 ± 0.006  ops/us
   FloatCosineBenchmark.cosineOld              300  thrpt    5    4.535 ± 0.008  ops/us
   FloatCosineBenchmark.cosineOld              512  thrpt    5    2.316 ± 0.005  ops/us
   FloatCosineBenchmark.cosineOld              702  thrpt    5    1.696 ± 0.005  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5    1.166 ± 0.004  ops/us
   FloatDotProductBenchmark.dotProductNew        1  thrpt    5  200.648 ± 1.795  ops/us
   FloatDotProductBenchmark.dotProductNew        4  thrpt    5  175.033 ± 1.564  ops/us
   FloatDotProductBenchmark.dotProductNew        6  thrpt    5  164.028 ± 1.119  ops/us
   FloatDotProductBenchmark.dotProductNew        8  thrpt    5  151.063 ± 3.535  ops/us
   FloatDotProductBenchmark.dotProductNew       13  thrpt    5  125.853 ± 0.891  ops/us
   FloatDotProductBenchmark.dotProductNew       16  thrpt    5  114.644 ± 0.717  ops/us
   FloatDotProductBenchmark.dotProductNew       25  thrpt    5   92.095 ± 3.944  ops/us
   FloatDotProductBenchmark.dotProductNew       32  thrpt    5  117.148 ± 2.873  ops/us
   FloatDotProductBenchmark.dotProductNew       64  thrpt    5   95.768 ± 1.708  ops/us
   FloatDotProductBenchmark.dotProductNew      100  thrpt    5   72.348 ± 0.212  ops/us
   FloatDotProductBenchmark.dotProductNew      128  thrpt    5   90.350 ± 1.914  ops/us
   FloatDotProductBenchmark.dotProductNew      207  thrpt    5   50.132 ± 0.275  ops/us
   FloatDotProductBenchmark.dotProductNew      256  thrpt    5   52.553 ± 1.672  ops/us
   FloatDotProductBenchmark.dotProductNew      300  thrpt    5   41.686 ± 0.490  ops/us
   FloatDotProductBenchmark.dotProductNew      512  thrpt    5   39.344 ± 0.830  ops/us
   FloatDotProductBenchmark.dotProductNew      702  thrpt    5   21.167 ± 0.281  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5   20.075 ± 0.385  ops/us
   FloatDotProductBenchmark.dotProductOld        1  thrpt    5  234.130 ± 0.134  ops/us
   FloatDotProductBenchmark.dotProductOld        4  thrpt    5  206.338 ± 1.254  ops/us
   FloatDotProductBenchmark.dotProductOld        6  thrpt    5  198.209 ± 1.156  ops/us
   FloatDotProductBenchmark.dotProductOld        8  thrpt    5  168.075 ± 1.006  ops/us
   FloatDotProductBenchmark.dotProductOld       13  thrpt    5  146.058 ± 2.204  ops/us
   FloatDotProductBenchmark.dotProductOld       16  thrpt    5  123.440 ± 0.953  ops/us
   FloatDotProductBenchmark.dotProductOld       25  thrpt    5   96.502 ± 1.815  ops/us
   FloatDotProductBenchmark.dotProductOld       32  thrpt    5   75.524 ± 0.357  ops/us
   FloatDotProductBenchmark.dotProductOld       64  thrpt    5   40.948 ± 0.120  ops/us
   FloatDotProductBenchmark.dotProductOld      100  thrpt    5   32.870 ± 0.172  ops/us
   FloatDotProductBenchmark.dotProductOld      128  thrpt    5   26.751 ± 0.066  ops/us
   FloatDotProductBenchmark.dotProductOld      207  thrpt    5   17.463 ± 0.048  ops/us
   FloatDotProductBenchmark.dotProductOld      256  thrpt    5   14.387 ± 0.105  ops/us
   FloatDotProductBenchmark.dotProductOld      300  thrpt    5   12.236 ± 0.144  ops/us
   FloatDotProductBenchmark.dotProductOld      512  thrpt    5    7.346 ± 0.003  ops/us
   FloatDotProductBenchmark.dotProductOld      702  thrpt    5    5.434 ± 0.026  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5    3.694 ± 0.004  ops/us
   FloatSquareBenchmark.squareNew                1  thrpt    5  199.365 ± 0.069  ops/us
   FloatSquareBenchmark.squareNew                4  thrpt    5  165.456 ± 0.770  ops/us
   FloatSquareBenchmark.squareNew                6  thrpt    5  152.908 ± 0.636  ops/us
   FloatSquareBenchmark.squareNew                8  thrpt    5  149.466 ± 0.089  ops/us
   FloatSquareBenchmark.squareNew               13  thrpt    5  117.959 ± 0.802  ops/us
   FloatSquareBenchmark.squareNew               16  thrpt    5  105.319 ± 0.152  ops/us
   FloatSquareBenchmark.squareNew               25  thrpt    5   89.464 ± 2.202  ops/us
   FloatSquareBenchmark.squareNew               32  thrpt    5  108.002 ± 5.605  ops/us
   FloatSquareBenchmark.squareNew               64  thrpt    5   92.649 ± 2.593  ops/us
   FloatSquareBenchmark.squareNew              100  thrpt    5   67.311 ± 1.944  ops/us
   FloatSquareBenchmark.squareNew              128  thrpt    5   77.030 ± 0.283  ops/us
   FloatSquareBenchmark.squareNew              207  thrpt    5   41.088 ± 0.375  ops/us
   FloatSquareBenchmark.squareNew              256  thrpt    5   43.918 ± 0.160  ops/us
   FloatSquareBenchmark.squareNew              300  thrpt    5   34.270 ± 0.149  ops/us
   FloatSquareBenchmark.squareNew              512  thrpt    5   29.635 ± 0.151  ops/us
   FloatSquareBenchmark.squareNew              702  thrpt    5   20.936 ± 0.076  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5   16.821 ± 0.047  ops/us
   FloatSquareBenchmark.squareOld                1  thrpt    5  200.522 ± 0.037  ops/us
   FloatSquareBenchmark.squareOld                4  thrpt    5  162.081 ± 0.349  ops/us
   FloatSquareBenchmark.squareOld                6  thrpt    5  154.646 ± 0.347  ops/us
   FloatSquareBenchmark.squareOld                8  thrpt    5  166.922 ± 9.571  ops/us
   FloatSquareBenchmark.squareOld               13  thrpt    5  110.565 ± 4.186  ops/us
   FloatSquareBenchmark.squareOld               16  thrpt    5  114.293 ± 0.599  ops/us
   FloatSquareBenchmark.squareOld               25  thrpt    5   78.714 ± 0.662  ops/us
   FloatSquareBenchmark.squareOld               32  thrpt    5   68.547 ± 0.223  ops/us
   FloatSquareBenchmark.squareOld               64  thrpt    5   37.406 ± 0.208  ops/us
   FloatSquareBenchmark.squareOld              100  thrpt    5   23.755 ± 0.015  ops/us
   FloatSquareBenchmark.squareOld              128  thrpt    5   19.949 ± 0.057  ops/us
   FloatSquareBenchmark.squareOld              207  thrpt    5   12.188 ± 0.083  ops/us
   FloatSquareBenchmark.squareOld              256  thrpt    5   10.329 ± 0.032  ops/us
   FloatSquareBenchmark.squareOld              300  thrpt    5    8.742 ± 0.075  ops/us
   FloatSquareBenchmark.squareOld              512  thrpt    5    5.379 ± 0.030  ops/us
   FloatSquareBenchmark.squareOld              702  thrpt    5    3.852 ± 0.042  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5    2.609 ± 0.044  ops/us
   ```




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1556403183

   I pushed a new benchmark to https://github.com/rmuir/vectorbench for the binary dot product. 
   
   Basically this has to act like:
   ```
   int sum = 0;
   for (...) {
     short product = (short) (a[i] * b[i]);
     sum += (int) product;
   }
   ```
   
   So it is tricky to do with a totally generic implementation (just using SPECIES_PREFERRED). for avx256, it means you read a byte vector of length 32 and then work each half as short (2 short vectors of length 16), and then the same thing again for each half as int (4 int vectors of length 8). This generic approach only gives me a 2x speedup which is a little disappointing.
   
   but this is a stupid approach if you have 256-bit vectors. You can just use ByteVector.SPECIES_64, ShortVector.SPECIES_128, and IntVector.SPECIES_256 and the whole thing is much faster. 
   
   on my skylake (has avx 256 and gets the optimized 256-bit impl)
   ```
   Benchmark                                (size)   Mode  Cnt    Score   Error   Units
   BinaryDotProductBenchmark.dotProductNew       1  thrpt    5  159.476 ± 8.177  ops/us
   BinaryDotProductBenchmark.dotProductNew     128  thrpt    5   41.759 ± 0.267  ops/us
   BinaryDotProductBenchmark.dotProductNew     207  thrpt    5   25.094 ± 0.107  ops/us
   BinaryDotProductBenchmark.dotProductNew     256  thrpt    5   24.841 ± 0.124  ops/us
   BinaryDotProductBenchmark.dotProductNew     300  thrpt    5   19.624 ± 0.891  ops/us
   BinaryDotProductBenchmark.dotProductNew     512  thrpt    5   13.763 ± 0.171  ops/us
   BinaryDotProductBenchmark.dotProductNew     702  thrpt    5    9.792 ± 0.388  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5    6.878 ± 0.834  ops/us
   BinaryDotProductBenchmark.dotProductOld       1  thrpt    5  160.423 ± 6.845  ops/us
   BinaryDotProductBenchmark.dotProductOld     128  thrpt    5   13.300 ± 0.159  ops/us
   BinaryDotProductBenchmark.dotProductOld     207  thrpt    5    8.678 ± 0.293  ops/us
   BinaryDotProductBenchmark.dotProductOld     256  thrpt    5    6.892 ± 0.331  ops/us
   BinaryDotProductBenchmark.dotProductOld     300  thrpt    5    6.008 ± 0.438  ops/us
   BinaryDotProductBenchmark.dotProductOld     512  thrpt    5    3.613 ± 0.192  ops/us
   BinaryDotProductBenchmark.dotProductOld     702  thrpt    5    2.710 ± 0.167  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5    1.825 ± 0.125  ops/us
   ```
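
A scalar sketch of why those three fixed-width species line up, assuming the widening goes byte to short to int one lane at a time: 64 bits of bytes, 128 bits of shorts, and 256 bits of ints all hold the same number of lanes, and a byte-by-byte product always fits in a short. The class and method names are illustrative only and do not come from the PR or the benchmark repository.

```java
final class WideningSketch {
  // Number of lanes in a vector of the given bit width and element size.
  static int lanes(int vectorBits, int elementBits) {
    return vectorBits / elementBits;
  }

  // Scalar reference: products are kept as 16-bit values, the sum as 32-bit.
  static int dotProduct(byte[] a, byte[] b) {
    int sum = 0;
    for (int i = 0; i < a.length; i++) {
      // |a[i] * b[i]| is at most 128 * 128 = 16384, so the short cast loses nothing
      short product = (short) (a[i] * b[i]);
      sum += product;
    }
    return sum;
  }

  public static void main(String[] args) {
    System.out.println("byte lanes in 64 bits:   " + lanes(64, Byte.SIZE));
    System.out.println("short lanes in 128 bits: " + lanes(128, Short.SIZE));
    System.out.println("int lanes in 256 bits:   " + lanes(256, Integer.SIZE));
    System.out.println(dotProduct(new byte[] {-128, 127}, new byte[] {-128, 127}));
  }
}
```

Because the lane counts match, each loaded byte vector can be widened as a whole at every step, with no need to split it into halves.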




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201931143


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   We would need to spawn a separate VM from inside the main TestVectorUtil test (we have other tests that explicitly crash the JVM to check that everything recovers correctly).
   A native way to run a single test with different options via the randomized runner, without refactoring a major part of the build, isn't so easy. It is doable by adding another Gradle "testVectors" task which is hooked in as a dependency of "test", so it runs separately.
   
   We can easily compare the outputs of both implementations by instantiating the package-private Provider classes from a test and comparing the results. An easy way to do this: get the autoloaded PROVIDER (make it package-private) and do `assumeFalse(VectorUtils.PROVIDER instanceof DefaultVectorProvider)`, so the test will not execute if the provider in use is the default provider. Otherwise, instantiate a `DefaultVectorProvider` using its package-private ctor from the test and compare its results with those of the autoloaded provider (the one that comes from the MR-JAR).
   
   For bytes this is easy; for floats we would need a large enough epsilon when comparing the resulting floats. Or am I missing something here? There are differences in the results, but Assert.assertEquals() is available for floats with an epsilon.
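
A minimal sketch of why the float comparison needs an epsilon, under the assumption that the difference comes mainly from summation order: an unrolled loop with several accumulators adds the same products in a different order than a plain left-to-right loop. All names here are hypothetical stand-ins for the two providers; the Vector API is deliberately not used so the sketch runs on any JDK.

```java
final class ProviderComparisonSketch {
  // Plain left-to-right accumulation, as a scalar implementation might do.
  static float dotProductSequential(float[] a, float[] b) {
    float res = 0;
    for (int i = 0; i < a.length; i++) {
      res += a[i] * b[i];
    }
    return res;
  }

  // Four independent accumulators combined at the end; this mimics the
  // summation order of a 4x unrolled loop, not its speed.
  static float dotProductFourAccumulators(float[] a, float[] b) {
    float acc1 = 0, acc2 = 0, acc3 = 0, acc4 = 0;
    int i = 0;
    int bound = a.length - (a.length % 4);
    for (; i < bound; i += 4) {
      acc1 += a[i] * b[i];
      acc2 += a[i + 1] * b[i + 1];
      acc3 += a[i + 2] * b[i + 2];
      acc4 += a[i + 3] * b[i + 3];
    }
    float res = (acc1 + acc2) + (acc3 + acc4);
    for (; i < a.length; i++) { // scalar tail
      res += a[i] * b[i];
    }
    return res;
  }

  // Tolerance that scales with the magnitude of the expected value.
  static boolean closeEnough(float expected, float actual, float relativeEpsilon) {
    float tolerance = relativeEpsilon * Math.max(1f, Math.abs(expected));
    return Math.abs(expected - actual) <= tolerance;
  }
}
```

In a JUnit test, `closeEnough` could be replaced by the three-argument `Assert.assertEquals(expected, actual, delta)` mentioned above; the right size for the delta is something that would have to be determined against the real providers.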





[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559022634

   > Apologies, I don't have much time early this week, but hope to be more active later in the week. I know that there are still outstanding questions around testing, luceneutil, etc, but I think that Robert's benchmark has been invaluable and laid solid foundations here, so I fully expect that we'll be able to work through the other outstanding issues. Overall the code is really coming together. Noice!
   
   Thank you for the help so far. I'm in the opposite boat and will be busy for the rest of the week. I do want to find time to revisit the Binary* functions on ARM eventually (I just hate doing basically anything on the Mac). Otherwise, a lot of this code is just messy. Sorry, I didn't really make any attempt for it to have any quality whatsoever. Last time I wrote code against this vector API it just sat on the shelf for years, so writing clean code seemed like a premature optimization, or at least that's my excuse. We can fix it up :)




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1563071137

   @mikemccand: This file is a relic in your checkout. Just clean your working copy. The file/folder was deleted when the build system was refactored to support mmap and vectors at the same time (files/dirs renamed).




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1556297297

   I pushed one more commit to improve things for "unaligned" vectors. The way to think about it: with unrolling, we do 64 at a time on AVX-512. 
   
   So it isn't good to do worst-case 63 scalar computations just because the user had 1023 dimensions or something like that. Better to be bounded to 15. It makes things more well-rounded and prevents slowdowns for sizes such as 702 in the test.
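
The arithmetic behind the 63 versus 15 bound can be sketched as follows, assuming 16 float lanes per 512-bit vector and a 4x unrolled main loop, so 64 floats per unrolled iteration. The helper names are made up for illustration.

```java
final class TailSketch {
  // Elements left for the scalar loop when only the unrolled vector loop runs.
  static int scalarTailUnrolledOnly(int length, int lanes, int unroll) {
    return length % (lanes * unroll);
  }

  // Elements left for the scalar loop when a one-vector-at-a-time loop
  // handles what the unrolled loop could not.
  static int scalarTailWithVectorTail(int length, int lanes) {
    return length % lanes;
  }

  public static void main(String[] args) {
    int lanes = 16; // floats in a 512-bit vector
    int unroll = 4; // vectors per unrolled iteration
    for (int length : new int[] {702, 1023, 1024}) {
      System.out.println(
          length
              + ": unrolled only = "
              + scalarTailUnrolledOnly(length, lanes, unroll)
              + ", with vector tail = "
              + scalarTailWithVectorTail(length, lanes));
    }
  }
}
```

For 1023 dimensions this gives 63 scalar steps without the vector tail and 15 with it; for 702 it gives 62 and 14.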




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559765890

   > should we zero-pad before computing the dot-products? It wouldn't affect the result and sounds like it would be faster
   
   I don't think so. The code wouldn't even know what size to pad to. For float functions the logic currently happens to be "4 x preferred vector size", which would be a multiple of 16 on my M1 Mac, a multiple of 32 on my dual-core laptop, and a multiple of 64 on @ChrisHegarty's Rocket Lake. And on a machine not supporting Panama vectorization, padding would just make things slower.
   
   I think instead these data scientists should have some sympathy for the hardware and use power-of-two sizes.
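
To make the "what size to pad to" problem concrete, here is a small sketch of the "4 x preferred vector size" rule for floats, assuming 128-, 256- and 512-bit preferred vectors for the three machines mentioned. The names are hypothetical; nothing like this exists in the PR.

```java
final class PaddingSketch {
  // Floats consumed per unrolled iteration: 4 vectors of (vectorBits / 32) lanes.
  static int blockSize(int vectorBits) {
    return 4 * (vectorBits / Float.SIZE);
  }

  // Smallest multiple of the block size that is >= dimensions.
  static int paddedLength(int dimensions, int vectorBits) {
    int block = blockSize(vectorBits);
    return ((dimensions + block - 1) / block) * block;
  }

  public static void main(String[] args) {
    int dimensions = 207; // one of the odd sizes from the benchmark
    for (int bits : new int[] {128, 256, 512}) {
      System.out.println(
          bits + "-bit: block = " + blockSize(bits)
              + ", padded length = " + paddedLength(dimensions, bits));
    }
  }
}
```

The same 207-dimension vector would need to be padded to 208, 224, or 256 depending on the machine, so no single padded size could be chosen ahead of time, for example when the vectors are written.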




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560972929

   To be clear, I'm not sure it's an issue of SPECIES_PREFERRED. As you can see, 128-bit floating point works perfectly with `-XX:UseAVX=0`. I assume it's using SSE etc. :)
   
   But the integer math does not. The conversion instructions etc. that are needed should all be there (SSE4.1 is available): https://www.felixcloutier.com/x86/pmovsx . Maybe intrinsics are missing for the integer side.




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1204255730


##########
gradle/java/memorysegment-mrjar.gradle:
##########
@@ -40,6 +40,7 @@ configure(project(":lucene:core")) {
           "-Xlint:-options",
           "--patch-module", "java.base=${apijar}",
           "--add-exports", "java.base/java.lang.foreign=ALL-UNNAMED",
+          "--add-exports", "java.base/jdk.incubator.vector=ALL-UNNAMED", // This is a hack, but does it matter

Review Comment:
   I agree, it is a hack, but maybe we should say "it's ok because it works for compilation and has no effect on runtime or classfile output".
   
   Renaming the apijars to be more general should be a separate PR, as it needs aligning and conflict resolution with the already open Java 21 one for mmap.





[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201860585


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   And/Or, if we have the Panama implementation available we could assert, in TestVectorUtil, that all test results are equivalent to the scalar / default impl?





[GitHub] [lucene] rmuir commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201991364


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   This is what I did in the benchmarks to try to prevent mistakes as well: https://github.com/rmuir/vectorbench/blob/main/src/main/java/testing/FloatCosineBenchmark.java#L40-L43





[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1558965655

   I renamed the constants a bit (e.g. SPECIES at top was inconsistent with the rest)




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201931143


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,11 +119,16 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
-      
+
       // Lucene needs to optional modules at runtime, which we want to enforce for testing
       // (if the runner JVM does not support them, it will fail tests):
       jvmArgs '--add-modules', 'jdk.unsupported,jdk.management'
 
+      // Enable the vector incubator module on supported Java versions:
+      if (rootProject.vectorIncubatorJavaVersions.contains(rootProject.runtimeJavaVersion)) {
+        jvmArgs '--add-modules', 'jdk.incubator.vector'
+      }
+

Review Comment:
   We would need to spawn a separate VM from inside the main TestVectorUtil test (we have other tests that explicitly crash the JVM to check that everything recovers correctly).
   A native way to run a single test with different options via the randomized runner, without refactoring a major part of the build, isn't so easy. It is doable by adding another Gradle "testVectors" task which is hooked in as a dependency of "test", so it runs separately.
   
   We can easily compare the outputs of both implementations by instantiating the package-private Provider classes from a test and comparing the results.
   
   For bytes this is easy; for floats we would need a large enough epsilon when comparing the resulting floats. Or am I missing something here? There are differences in the results, but Assert.assertEquals() is available for floats with an epsilon.





[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1561364771

   Another set of results from @markrmiller's AVX-512 beast, which he kindly gave me access to: 
   
   Model name:            11th Gen Intel(R) Core(TM) i9-11900F @ 2.50GHz
   
   java -jar target/vectorbench.jar -p size=1024
   ```
   Benchmark                                (size)   Mode  Cnt   Score   Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5  10.637 ± 0.068  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5   1.115 ± 0.008  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5  22.050 ± 0.007  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5   3.349 ± 0.041  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5  16.215 ± 0.129  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5   2.479 ± 0.032  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5   9.394 ± 0.048  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5   0.750 ± 0.002  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5  25.657 ± 2.105  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5   3.320 ± 0.079  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5  19.437 ± 0.122  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5   2.355 ± 0.003  ops/us
   ```
   
   I also tried maxing out the hardware by using all the threads to make sure it all still performs ok:
   java -jar target/vectorbench.jar -p size=1024 -t max
   ```
   Benchmark                                (size)   Mode  Cnt    Score   Error   Units
   BinaryCosineBenchmark.cosineDistanceNew    1024  thrpt    5   73.420 ± 1.642  ops/us
   BinaryCosineBenchmark.cosineDistanceOld    1024  thrpt    5    9.705 ± 0.111  ops/us
   BinaryDotProductBenchmark.dotProductNew    1024  thrpt    5  143.957 ± 0.496  ops/us
   BinaryDotProductBenchmark.dotProductOld    1024  thrpt    5   22.669 ± 0.334  ops/us
   BinarySquareBenchmark.squareDistanceNew    1024  thrpt    5  108.838 ± 0.708  ops/us
   BinarySquareBenchmark.squareDistanceOld    1024  thrpt    5   18.895 ± 0.305  ops/us
   FloatCosineBenchmark.cosineNew             1024  thrpt    5   62.930 ± 0.707  ops/us
   FloatCosineBenchmark.cosineOld             1024  thrpt    5   13.022 ± 0.735  ops/us
   FloatDotProductBenchmark.dotProductNew     1024  thrpt    5  173.105 ± 4.123  ops/us
   FloatDotProductBenchmark.dotProductOld     1024  thrpt    5   24.464 ± 0.322  ops/us
   FloatSquareBenchmark.squareNew             1024  thrpt    5  127.718 ± 1.173  ops/us
   FloatSquareBenchmark.squareOld             1024  thrpt    5   17.301 ± 0.365  ops/us
   ```




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559035945

   A couple of random TODOs I had to look into, which we can think about:
   * Disable the panama provider unless the user has at least 128-bit vectors? Nobody has benched the case where SPECIES_PREFERRED is 64 bits: that's the case where vectorization isn't enabled at all for some reason (VM, unsupported platform). I doubt it will do any good at all for these algorithms. We can try to test it with QEMU if we really care.
   * Detect C1-only ("client mode") and don't enable the panama provider then either. We already have this issue in our own test suite. This one is even worse because the Java side "thinks it has vectors" but the compiler does something excruciatingly slow.
   * Try to organize the benchmarks in a better way than keeping them at github.com/rmuir? I don't really like touching this code without verifying the performance. In my experience, it's very easy to change one small thing and make it all go very slow (e.g. stuff starts getting bounds-checked or something).
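
The first two TODOs amount to a gate in front of the Panama provider. A rough sketch of such a gate follows, with every name hypothetical. How the preferred vector width and the C1-only condition would actually be detected is left out and passed in as plain parameters. The only real JDK call used is `ModuleLayer.boot().findModule(...)`, which reports whether the incubator module was resolved at startup.

```java
final class PanamaGateSketch {
  // Assumed minimum useful width, taken from the 128-bit suggestion above.
  static final int MIN_VECTOR_BITS = 128;

  // True if jdk.incubator.vector is in the boot layer, e.g. because the JVM
  // was started with --add-modules jdk.incubator.vector.
  static boolean vectorModulePresent() {
    return ModuleLayer.boot().findModule("jdk.incubator.vector").isPresent();
  }

  // Pure decision function: enable only when every check passes.
  static boolean shouldEnable(boolean modulePresent, int preferredVectorBits, boolean c2Enabled) {
    return modulePresent && preferredVectorBits >= MIN_VECTOR_BITS && c2Enabled;
  }

  public static void main(String[] args) {
    // The width and compiler flag are placeholders; a real check would query the JVM.
    System.out.println("module present: " + vectorModulePresent());
    System.out.println("enable: " + shouldEnable(vectorModulePresent(), 256, true));
  }
}
```

Keeping the decision in one pure function makes it easy to unit-test the fallback cases (64-bit vectors, C1-only) without needing such a machine.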




[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560173828

   And yeah, given the perf traps here, let's keep the 128-bit check. I am afraid of the unknown (e.g. something other than x86/ARM), and I think it's better to err on the side of caution and only turn on the panama support when it has passed a gauntlet of checks that will avoid performance traps for users.




[GitHub] [lucene] msokolov commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "msokolov (via GitHub)" <gi...@apache.org>.
msokolov commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1563324390

   Just some extra data about the above testing: I was using jdk20 for the candidate and jdk17 for the baseline (by accident I guess). Both were using `--add-modules jdk.incubator.vector`. The machine I ran on is an Intel with `avx2` `sse4_1` `sse4_2` and lots of other flags - I can post them if someone thinks it might matter.




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1563406519

   Are you sure you ran with default JVM options, like tiered compilation, and not -Xbatch?
   
   Are the tests running long enough, with each run lasting at least a minute?




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1564518097

   Hi, I changed the CHANGES.txt entry in main and 9.x to correctly refer to ARM's chipset feature (NEON). @rmuir asked me to correct it. See: https://en.wikipedia.org/wiki/ARM_architecture_family#Advanced_SIMD_(Neon)




[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201701447


##########
lucene/core/src/java20/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##########
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import jdk.incubator.vector.ByteVector;
+import jdk.incubator.vector.FloatVector;
+import jdk.incubator.vector.IntVector;
+import jdk.incubator.vector.ShortVector;
+import jdk.incubator.vector.Vector;
+import jdk.incubator.vector.VectorOperators;
+import jdk.incubator.vector.VectorShape;
+import jdk.incubator.vector.VectorSpecies;
+
+/** A VectorUtil provider implementation that leverages the Panama Vector API. */
+final class VectorUtilPanamaProvider implements VectorUtilProvider {
+
+  static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;
+
+  VectorUtilPanamaProvider() {}
+
+  @Override
+  public float dotProduct(float[] a, float[] b) {
+    int i = 0;
+    float res = 0;
+    // if the array size is large (> 2x platform vector size), its worth the overhead to vectorize
+    if (a.length > 2 * SPECIES.length()) {
+      // vector loop is unrolled 4x (4 accumulators in parallel)
+      FloatVector acc1 = FloatVector.zero(SPECIES);
+      FloatVector acc2 = FloatVector.zero(SPECIES);
+      FloatVector acc3 = FloatVector.zero(SPECIES);
+      FloatVector acc4 = FloatVector.zero(SPECIES);
+      int upperBound = SPECIES.loopBound(a.length - 3 * SPECIES.length());
+      for (; i < upperBound; i += 4 * SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        acc1 = acc1.add(va.mul(vb));
+        FloatVector vc = FloatVector.fromArray(SPECIES, a, i + SPECIES.length());
+        FloatVector vd = FloatVector.fromArray(SPECIES, b, i + SPECIES.length());
+        acc2 = acc2.add(vc.mul(vd));
+        FloatVector ve = FloatVector.fromArray(SPECIES, a, i + 2 * SPECIES.length());
+        FloatVector vf = FloatVector.fromArray(SPECIES, b, i + 2 * SPECIES.length());
+        acc3 = acc3.add(ve.mul(vf));
+        FloatVector vg = FloatVector.fromArray(SPECIES, a, i + 3 * SPECIES.length());
+        FloatVector vh = FloatVector.fromArray(SPECIES, b, i + 3 * SPECIES.length());
+        acc4 = acc4.add(vg.mul(vh));
+      }
+      // vector tail: less scalar computations for unaligned sizes, esp with big vector sizes
+      upperBound = SPECIES.loopBound(a.length);
+      for (; i < upperBound; i += SPECIES.length()) {
+        FloatVector va = FloatVector.fromArray(SPECIES, a, i);
+        FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
+        acc1 = acc1.add(va.mul(vb));
+      }
+      // reduce
+      FloatVector res1 = acc1.add(acc2);
+      FloatVector res2 = acc3.add(acc4);
+      res += res1.add(res2).reduceLanes(VectorOperators.ADD);
+    }
+
+    for (; i < a.length; i++) {
+      res += b[i] * a[i];
+    }
+    return res;
+  }
+
+  static final VectorSpecies<Byte> PREFERRED_BYTE_SPECIES;
+  static final VectorSpecies<Short> PREFERRED_SHORT_SPECIES;
+
+  static {
+    if (IntVector.SPECIES_PREFERRED.vectorBitSize() >= 256) {
+      PREFERRED_BYTE_SPECIES =
+          ByteVector.SPECIES_MAX.withShape(
+              VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 2));
+      PREFERRED_SHORT_SPECIES =
+          ShortVector.SPECIES_MAX.withShape(
+              VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 1));
+    } else {
+      PREFERRED_BYTE_SPECIES = null;
+      PREFERRED_SHORT_SPECIES = null;
+    }
+  }
+
+  @Override
+  public int dotProduct(byte[] a, byte[] b) {
+    int i = 0;
+    int res = 0;
+    final int vectorSize = IntVector.SPECIES_PREFERRED.vectorBitSize();

Review Comment:
   ++ Done
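
For context on the static initializer quoted above: shifting the preferred int bit size right by 2 (bytes) and by 1 (shorts) gives byte, short and int species with the same number of lanes, presumably so values can be widened byte to short to int lane for lane. The following is only a sketch of that arithmetic in plain Java; `LaneMath` and `lanes` are hypothetical names, not part of the PR, and nothing here touches the Vector API.

```java
// Hypothetical helper, not part of the PR: it mirrors the shift arithmetic of the
// static initializer with plain ints, so it runs without jdk.incubator.vector.
final class LaneMath {

  /** Lanes in a vector of vectorBits bits whose elements are elementBits wide. */
  static int lanes(int vectorBits, int elementBits) {
    return vectorBits / elementBits;
  }

  public static void main(String[] args) {
    for (int intBits : new int[] {256, 512}) {
      int byteBits = intBits >> 2; // bit size chosen for PREFERRED_BYTE_SPECIES
      int shortBits = intBits >> 1; // bit size chosen for PREFERRED_SHORT_SPECIES
      System.out.println(
          intBits + "-bit ints: "
              + lanes(byteBits, Byte.SIZE) + " byte lanes, "
              + lanes(shortBits, Short.SIZE) + " short lanes, "
              + lanes(intBits, Integer.SIZE) + " int lanes");
    }
  }
}
```

With a 128-bit preferred size the byte species would need 32 bits, which is below the smallest `VectorShape` (`S_64_BIT`); that may be why the `else` branch leaves both species `null`.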





[GitHub] [lucene] mikemccand commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "mikemccand (via GitHub)" <gi...@apache.org>.
mikemccand commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1575171189

   > They killed AVX-512 on consumer hardware a while back.
   
   Boo.  Should have gone for an AMD Ryzen build ...




[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "ChrisHegarty (via GitHub)" <gi...@apache.org>.
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555874325

   Just dumping an initial round of benchmark results, etc, based on what is currently in this PR. 
   
   <details>
   
   <summary>Benchmark source (derived from Robert's)</summary>
   
   ```
   davekim$ cat src/main/java/testing/DotProductBenchmark.java 
   package testing;
   
   import org.openjdk.jmh.annotations.*;
   
   import java.util.concurrent.atomic.AtomicInteger;
   import java.util.concurrent.ThreadLocalRandom;
   import java.util.concurrent.TimeUnit;
   import java.util.stream.IntStream;
   
   import jdk.incubator.vector.FloatVector;
   import jdk.incubator.vector.VectorOperators;
   import jdk.incubator.vector.VectorSpecies;
   
   @BenchmarkMode(Mode.Throughput)
   @OutputTimeUnit(TimeUnit.MICROSECONDS)
   @State(Scope.Benchmark)
   @Warmup(iterations = 3, time = 3)
   @Measurement(iterations = 5, time = 3)
   @Fork(value = 1, jvmArgsPrepend = {"--add-modules=jdk.incubator.vector"})
   public class DotProductBenchmark {
   
     private float[] a;
     private float[] b;
   
     @Param({"1", "4", "6", "8", "13", "16", "25", "32", "64", "100", "128", "207", "256", "300", "512", "702", "1024"})
     //@Param({"1", "4", "6", "8", "13", "16", "25", "32", "64", "100" })
     //@Param({"1024"})
     int size;
   
     @Setup(Level.Trial)
     public void init() {
       a = new float[size];
       b = new float[size];
       for (int i = 0; i < size; ++i) {
         a[i] = ThreadLocalRandom.current().nextFloat();
         b[i] = ThreadLocalRandom.current().nextFloat();
       }
     }
   
     static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;
   
     @Benchmark
     public float dotProductNew() {
       if (a.length != b.length) {
         throw new IllegalArgumentException("vector dimensions differ: " + a.length + "!=" + b.length);
       }
       int i = 0;
       float res = 0;
       // if the array size is large (>= 2x platform vector size), it's worth the overhead to vectorize
       // vector loop is unrolled a single time (2 accumulators in parallel)
       if (a.length >= 2 * SPECIES.length()) {
         FloatVector acc1 = FloatVector.zero(SPECIES);
         FloatVector acc2 = FloatVector.zero(SPECIES);
         int upperBound = SPECIES.loopBound(a.length - SPECIES.length());
         for (; i < upperBound; i += 2 * SPECIES.length()) {
           FloatVector va = FloatVector.fromArray(SPECIES, a, i);
           FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
           acc1 = acc1.add(va.mul(vb));
           FloatVector vc = FloatVector.fromArray(SPECIES, a, i + SPECIES.length());
           FloatVector vd = FloatVector.fromArray(SPECIES, b, i + SPECIES.length());
           acc2 = acc2.add(vc.mul(vd));
         }
         res += acc1.reduceLanes(VectorOperators.ADD) + acc2.reduceLanes(VectorOperators.ADD);
       }
       for (; i < a.length; i++) {
         res += b[i] * a[i];
       }
       return res;
     }
   
     @Benchmark
     public float dotProductOld() {
       if (a.length != b.length) {
         throw new IllegalArgumentException("vector dimensions differ: " + a.length + "!=" + b.length);
       }
       float res = 0f;
       /*
        * If length of vector is larger than 8, we use unrolled dot product to accelerate the
        * calculation.
        */
       int i;
       for (i = 0; i < a.length % 8; i++) {
         res += b[i] * a[i];
       }
       if (a.length < 8) {
         return res;
       }
       for (; i + 31 < a.length; i += 32) {
         res +=
             b[i + 0] * a[i + 0]
                 + b[i + 1] * a[i + 1]
                 + b[i + 2] * a[i + 2]
                 + b[i + 3] * a[i + 3]
                 + b[i + 4] * a[i + 4]
                 + b[i + 5] * a[i + 5]
                 + b[i + 6] * a[i + 6]
                 + b[i + 7] * a[i + 7];
         res +=
             b[i + 8] * a[i + 8]
                 + b[i + 9] * a[i + 9]
                 + b[i + 10] * a[i + 10]
                 + b[i + 11] * a[i + 11]
                 + b[i + 12] * a[i + 12]
                 + b[i + 13] * a[i + 13]
                 + b[i + 14] * a[i + 14]
                 + b[i + 15] * a[i + 15];
         res +=
             b[i + 16] * a[i + 16]
                 + b[i + 17] * a[i + 17]
                 + b[i + 18] * a[i + 18]
                 + b[i + 19] * a[i + 19]
                 + b[i + 20] * a[i + 20]
                 + b[i + 21] * a[i + 21]
                 + b[i + 22] * a[i + 22]
                 + b[i + 23] * a[i + 23];
         res +=
             b[i + 24] * a[i + 24]
                 + b[i + 25] * a[i + 25]
                 + b[i + 26] * a[i + 26]
                 + b[i + 27] * a[i + 27]
                 + b[i + 28] * a[i + 28]
                 + b[i + 29] * a[i + 29]
                 + b[i + 30] * a[i + 30]
                 + b[i + 31] * a[i + 31];
       }
       for (; i + 7 < a.length; i += 8) {
         res +=
             b[i + 0] * a[i + 0]
                 + b[i + 1] * a[i + 1]
                 + b[i + 2] * a[i + 2]
                 + b[i + 3] * a[i + 3]
                 + b[i + 4] * a[i + 4]
                 + b[i + 5] * a[i + 5]
                 + b[i + 6] * a[i + 6]
                 + b[i + 7] * a[i + 7];
       }
       return res;
     }
   }
   
   
   ```
   
   </details>
   
   <details>
   
   <summary>Benchmark results</summary>
   
   ```
   Benchmark                          (size)   Mode  Cnt    Score   Error   Units
   DotProductBenchmark.dotProductNew       1  thrpt    5  486.822 ± 1.260  ops/us
   DotProductBenchmark.dotProductOld       1  thrpt    5  547.520 ± 1.362  ops/us
   
   DotProductBenchmark.dotProductNew       4  thrpt    5  276.907 ± 1.612  ops/us
   DotProductBenchmark.dotProductOld       4  thrpt    5  398.279 ± 1.195  ops/us
   
   DotProductBenchmark.dotProductNew       6  thrpt    5  273.141 ± 1.060  ops/us
   DotProductBenchmark.dotProductOld       6  thrpt    5  364.975 ± 1.939  ops/us
   
   DotProductBenchmark.dotProductNew       8  thrpt    5  219.088 ± 0.538  ops/us
   DotProductBenchmark.dotProductOld       8  thrpt    5  273.919 ± 0.897  ops/us
   
   DotProductBenchmark.dotProductNew      13  thrpt    5  186.654 ± 0.075  ops/us
   DotProductBenchmark.dotProductOld      13  thrpt    5  199.216 ± 0.476  ops/us
   
   DotProductBenchmark.dotProductNew      16  thrpt    5  160.680 ± 0.401  ops/us
   DotProductBenchmark.dotProductOld      16  thrpt    5  155.481 ± 0.382  ops/us
   
   DotProductBenchmark.dotProductNew      25  thrpt    5  103.595 ± 0.358  ops/us
   DotProductBenchmark.dotProductOld      25  thrpt    5   99.612 ± 0.262  ops/us
   
   DotProductBenchmark.dotProductNew      32  thrpt    5   84.886 ± 0.342  ops/us
   DotProductBenchmark.dotProductOld      32  thrpt    5  103.425 ± 0.364  ops/us
   
   DotProductBenchmark.dotProductNew      64  thrpt    5   78.525 ± 1.889  ops/us
   DotProductBenchmark.dotProductOld      64  thrpt    5   53.223 ± 0.108  ops/us
   
   DotProductBenchmark.dotProductNew     100  thrpt    5   60.173 ± 0.453  ops/us
   DotProductBenchmark.dotProductOld     100  thrpt    5   32.104 ± 0.027  ops/us
   
   DotProductBenchmark.dotProductNew     128  thrpt    5   64.356 ± 0.145  ops/us
   DotProductBenchmark.dotProductOld     128  thrpt    5   27.143 ± 0.083  ops/us
   
   DotProductBenchmark.dotProductNew     207  thrpt    5   35.962 ± 0.015  ops/us
   DotProductBenchmark.dotProductOld     207  thrpt    5   16.279 ± 0.253  ops/us
   
   DotProductBenchmark.dotProductNew     256  thrpt    5   49.528 ± 1.180  ops/us
   DotProductBenchmark.dotProductOld     256  thrpt    5   13.683 ± 0.137  ops/us
   
   DotProductBenchmark.dotProductNew     300  thrpt    5   36.517 ± 0.104  ops/us
   DotProductBenchmark.dotProductOld     300  thrpt    5   11.232 ± 0.007  ops/us
   
   DotProductBenchmark.dotProductNew     512  thrpt    5   36.253 ± 0.004  ops/us
   DotProductBenchmark.dotProductOld     512  thrpt    5    6.866 ± 0.078  ops/us
   
   DotProductBenchmark.dotProductNew     702  thrpt    5   17.555 ± 0.184  ops/us
   DotProductBenchmark.dotProductOld     702  thrpt    5    4.855 ± 0.020  ops/us
   
   DotProductBenchmark.dotProductNew    1024  thrpt    5   23.363 ± 0.037  ops/us
   DotProductBenchmark.dotProductOld    1024  thrpt    5    3.439 ± 0.067  ops/us
   
   ```
   
   </details>
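
To read the results as ratios, divide the `dotProductNew` throughput by the `dotProductOld` throughput for the same size. A small sketch using four rows copied from the table above; `Speedup` and `ratio` are hypothetical names used only for illustration.

```java
import java.util.Locale;

// Hypothetical helper: ratio of dotProductNew to dotProductOld throughput (ops/us),
// using rows copied from the benchmark results above.
final class Speedup {

  static double ratio(double newOpsPerUs, double oldOpsPerUs) {
    return newOpsPerUs / oldOpsPerUs;
  }

  public static void main(String[] args) {
    int[] sizes = {1, 16, 128, 1024};
    double[] newScore = {486.822, 160.680, 64.356, 23.363};
    double[] oldScore = {547.520, 155.481, 27.143, 3.439};
    for (int k = 0; k < sizes.length; k++) {
      // A ratio below 1.0 means the vectorized version is slower at that size.
      System.out.printf(
          Locale.ROOT, "size %4d: %.2fx%n", sizes[k], ratio(newScore[k], oldScore[k]));
    }
  }
}
```

On these numbers the vectorized version is slower at size 1 and roughly 6.8x faster at size 1024.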
   
   <details>
   
   <summary>Machine details (for reference)</summary>
   
   ```
   davekim$ /home/chegar/binaries/jdk-20.0.1/bin/jshell --add-modules jdk.incubator.vector
   |  Welcome to JShell -- Version 20.0.1
   |  For an introduction type: /help intro
   
   jshell> jdk.incubator.vector.FloatVector.SPECIES_PREFERRED
   $1 ==> Species[float, 16, S_512_BIT]
   
   davekim$ cat /etc/lsb-release 
   DISTRIB_ID=Ubuntu
   DISTRIB_RELEASE=23.04
   DISTRIB_CODENAME=lunar
   DISTRIB_DESCRIPTION="Ubuntu 23.04"
   
   davekim$ lscpu
   Architecture:            x86_64
     CPU op-mode(s):        32-bit, 64-bit
     Address sizes:         39 bits physical, 48 bits virtual
     Byte Order:            Little Endian
   CPU(s):                  12
     On-line CPU(s) list:   0-11
   Vendor ID:               GenuineIntel
     Model name:            11th Gen Intel(R) Core(TM) i5-11400 @ 2.60GHz
       CPU family:          6
       Model:               167
       Thread(s) per core:  2
       Core(s) per socket:  6
       Socket(s):           1
       Stepping:            1
       CPU(s) scaling MHz:  62%
       CPU max MHz:         4400.0000
       CPU min MHz:         800.0000
       BogoMIPS:            5184.00
       Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc 
                            art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 
                            xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb 
                            stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap avx512i
                            fma clflushopt intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req
                             avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm md_clear flush_l1d arch_capabilities
   Virtualization features: 
     Virtualization:        VT-x
   Caches (sum of all):     
     L1d:                   288 KiB (6 instances)
     L1i:                   192 KiB (6 instances)
     L2:                    3 MiB (6 instances)
     L3:                    12 MiB (1 instance)
   NUMA:                    
     NUMA node(s):          1
     NUMA node0 CPU(s):     0-11
   Vulnerabilities:         
     Itlb multihit:         Not affected
     L1tf:                  Not affected
     Mds:                   Not affected
     Meltdown:              Not affected
     Mmio stale data:       Mitigation; Clear CPU buffers; SMT vulnerable
     Retbleed:              Mitigation; Enhanced IBRS
     Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
     Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
     Spectre v2:            Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
     Srbds:                 Not affected
     Tsx async abort:       Not affected
   
   ```
   
   </details>




[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555892033

   Hi,
   I rewrote the API stub extractor to use two passes and made the set of exported modules configurable. It now extracts the vector API only for Java 20.
   The resulting Java 20 apijar file is now 56 kilobytes instead of ~130. The previous code also had a small bug: it hid all public methods of package-private classes (a workaround to make file sizes smaller), which was in fact incorrect. Now it includes, without any additional filtering, all classes that are public or referenced as a superclass/interface.




[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199630748


##########
lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java:
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.util.Locale;
+import java.util.Objects;
+import java.util.logging.Logger;
+
+/**
+ * A provider of VectorUtil implementations.
+ *
+ * @lucene.internal
+ */
+interface VectorUtilProvider {
+
+  /** Calculates the dot product of the given float arrays. */
+  float dotProduct(float[] a, float[] b);
+
+  // -- provider lookup mechanism
+
+  Logger LOG = Logger.getLogger(VectorUtilProvider.class.getName());
+
+  static VectorUtilProvider lookup() {
+    final int runtimeVersion = Runtime.version().feature();
+    if (runtimeVersion == 20 && useVectorAPI() && vectorModulePresentAndReadable()) {
+      try {
+        final var lookup = MethodHandles.lookup();
+        final var cls = lookup.findClass("org.apache.lucene.util.VectorUtilPanamaProvider");
+        // we use method handles, so we do not need to deal with setAccessible as we have private
+        // access through the lookup:
+        final var constr = lookup.findConstructor(cls, MethodType.methodType(void.class));
+        try {
+          return (VectorUtilProvider) constr.invoke();
+        } catch (RuntimeException | Error e) {
+          throw e;
+        } catch (Throwable th) {
+          throw new AssertionError(th);
+        }
+      } catch (NoSuchMethodException | IllegalAccessException e) {
+        throw new LinkageError(
+            "VectorUtilPanamaProvider is missing correctly typed constructor", e);
+      } catch (ClassNotFoundException cnfe) {
+        throw new LinkageError("VectorUtilPanamaProvider is missing in Lucene JAR file", cnfe);
+      }
+    } else if (runtimeVersion >= 21) {
+      LOG.warning(
+          "You are running with Java 21 or later. To make full use of the Vector API, please update Apache Lucene.");
+    }
+    return new VectorUtilDefaultProvider();
+  }
+
+  static boolean vectorModulePresentAndReadable() {
+    var opt =
+        ModuleLayer.boot().modules().stream()
+            .filter(m -> m.getName().equals("jdk.incubator.vector"))
+            .findFirst();
+    if (opt.isPresent()) {
+      VectorUtilProvider.class.getModule().addReads(opt.get());
+      return true;
+    }
+    return false;
+  }
+
+  // Workaround for JDK-8301190, avoids assertion when default locale is say tr.
+  static boolean useVectorAPI() {
+    return Objects.equals("I", "i".toUpperCase(Locale.getDefault()));

Review Comment:
   fixed this





[GitHub] [lucene] rmuir commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "rmuir (via GitHub)" <gi...@apache.org>.
rmuir commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1198991426


##########
gradle/testing/defaults-tests.gradle:
##########
@@ -119,10 +119,13 @@ allprojects {
       if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
         jvmArgs '--illegal-access=deny'
       }
+
+      // Disable assertions to workaround JDK-8301190
+      jvmArgs '-da:jdk.incubator.vector.LaneType'

Review Comment:
   Yeah, I think it's not good to ignore it, since not all uses of Lucene are server-side and someone might run it, e.g., in their IDE on a Turkish machine.
   
   A couple of alternatives:
   * Start with JDK 21 as our first supported release. This avoids the problem easily, but means nobody can use this stuff until September.
   * Fall back to the scalar impl (i.e. pretend the Vector API is not enabled) if the user has a Turkish or Azeri locale and the JDK version is < 21?
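
The second alternative could look roughly like the sketch below. It assumes the problem no longer applies from JDK 21 on (as the first alternative implies) and detects the affected locales by upper-casing `"i"`. `LocaleFallback` and `vectorApiUsable` are hypothetical names, not code from the PR.

```java
import java.util.Locale;

// Hypothetical sketch of the second alternative; names are illustrative, not from the PR.
final class LocaleFallback {

  /**
   * Returns false when the scalar implementation should be used: the JDK is older than 21
   * and upper-casing "i" in the given locale does not produce ASCII "I" (as in tr and az).
   */
  static boolean vectorApiUsable(Locale defaultLocale, int jdkFeatureVersion) {
    if (jdkFeatureVersion >= 21) {
      return true; // assumption: JDK-8301190 no longer applies from 21 on
    }
    return "I".equals("i".toUpperCase(defaultLocale));
  }

  public static void main(String[] args) {
    Locale turkish = Locale.forLanguageTag("tr");
    System.out.println(vectorApiUsable(Locale.ROOT, 20)); // ASCII upper-casing
    System.out.println(vectorApiUsable(turkish, 20)); // dotted capital I, so fall back
    System.out.println(vectorApiUsable(turkish, 21));
  }
}
```

Passing the locale and feature version as parameters keeps the check testable without changing the JVM's default locale; a real caller would pass `Locale.getDefault()` and `Runtime.version().feature()`.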





[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1199198237


##########
lucene/core/src/java/org/apache/lucene/internal/vector/VectorUtilProvider.java:
##########
@@ -76,4 +77,10 @@ static boolean vectorModulePresentAndReadable() {
     }
     return false;
   }
+
+  // Workaround for JDK-8301190, avoids assertion when default locale is say tr.
+  @SuppressForbidden(reason = "required to determine if non-workable locale")
+  static boolean useVectorAPI() {
+    return 'I' == int.class.getSimpleName().toUpperCase().charAt(0);

Review Comment:
   You can remove the `@SuppressForbidden` if you use `toUpperCase(Locale.getDefault())` (being explicit is allowed).
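
As a quick illustration of why the explicit form is a drop-in replacement: `String.toUpperCase()` is documented as equivalent to `toUpperCase(Locale.getDefault())`, so the check behaves the same either way. The sketch below temporarily switches the default locale to Turkish to show this; `ExplicitLocaleCheck` is a hypothetical class name, and the demo is not code from the PR.

```java
import java.util.Locale;

// Illustrative only: shows that the implicit and explicit forms agree, even under a
// Turkish default locale. The class name is hypothetical.
final class ExplicitLocaleCheck {

  /** Same check as in the diff, with the default locale spelled out explicitly. */
  static boolean useVectorAPI() {
    return 'I' == int.class.getSimpleName().toUpperCase(Locale.getDefault()).charAt(0);
  }

  public static void main(String[] args) {
    Locale saved = Locale.getDefault();
    try {
      Locale.setDefault(Locale.forLanguageTag("tr"));
      String implicit = "int".toUpperCase();
      String explicit = "int".toUpperCase(Locale.getDefault());
      System.out.println(implicit.equals(explicit)); // both forms give the same string
      System.out.println(useVectorAPI()); // first char is the dotted capital I here
    } finally {
      Locale.setDefault(saved); // restore the original default locale
    }
  }
}
```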


