You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/04/20 09:51:44 UTC

[GitHub] [spark] luhenry opened a new pull request #32253: Accelerate fallback BLAS with dev.ludovic.netlib

luhenry opened a new pull request #32253:
URL: https://github.com/apache/spark/pull/32253


   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   
   Following https://github.com/apache/spark/pull/30810, I've continued looking for ways to accelerate the usage of BLAS in Spark. With this PR, I integrate work done in the [`dev.ludovic.netlib`](https://github.com/luhenry/netlib/) Maven package.
   
   The `dev.ludovic.netlib` library wraps the original `com.github.fommil.netlib` library and focus on accelerating the linear algebra routines in use in Spark. When running the `org.apache.spark.ml.linalg.BLASBenchmark` benchmarking suite, I get the results at [1] on an Intel machine. Moreover, this library is thoroughly tested to return the exact same results as the reference implementation.
   
   Under the hood, it reimplements the necessary algorithms in pure autovectorization-friendly Java 8, as well as takes advantage of the Vector API and Foreign Linker API introduced in JDK 16 when available.
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   Accelerates linear algebra operations when the pure-java fallback method is in use. Transparently falls back to native implementation (OpenBLAS, MKL) when available.
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   
   No, all changes are transparent to the user.
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   -->
   
   The `dev.ludovic.netlib` library has its own test suite [2]. It has also been validated by running the Spark test suite and benchmarking suite.
   
   [1] https://gist.github.com/luhenry/691d1f497595b3404a6180b6c57336e0
   [2] https://github.com/luhenry/netlib/tree/master/blas/src/test/java/dev/ludovic/netlib/blas
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617024314



##########
File path: graphx/pom.xml
##########
@@ -60,9 +60,8 @@
       <artifactId>guava</artifactId>
     </dependency>
     <dependency>
-      <groupId>com.github.fommil.netlib</groupId>
-      <artifactId>core</artifactId>
-      <version>${netlib.java.version}</version>
+      <groupId>dev.ludovic.netlib</groupId>

Review comment:
       It would block including it, as the binary releases would include the software. If you're open to it, ALv2 is obviously the least contentious w.r.t. the ASF. Others are OK. Full story: https://www.apache.org/legal/resolved.html




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619069785



##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
   </dependencies>
-  <profiles>
-    <profile>
-      <id>netlib-lgpl</id>

Review comment:
       > First some background - I think the issue with licensing comes from the submodules that are pulled in by all, like: https://github.com/fommil/netlib-java/blob/master/native_ref/xbuilds/linux-x86_64/pom.xml netlib-java generates and compiles native code. So things like https://mvnrepository.com/artifact/com.github.fommil.netlib/netlib-native_system-linux-x86_64/1.1 ship a .so and I believe it's that code or what it links to that is LGPL. I've honestly forgotten the details but seemed pretty clear (it may be, at least, libgfortran). At least, I recall going over this with a fine-toothed comb with Cloudera's legal department and they agreed, FWIW.
   
   I can confirm that both `netlib_ref` and `netlib_system` do link to `libgfortran` and which is licensed with LGPL.  See [[1]](https://github.com/fommil/netlib-java/blob/master/native_ref/xbuilds/linux-x86_64/pom.xml#L66) and [[2]](https://github.com/fommil/netlib-java/blob/master/native_system/xbuilds/linux-x86_64/pom.xml#L77).
   
   > Anyway, that's the reason behind all this bother with profiles. The wrapper in netlib-java is OK to include in distributions and compile against. But adding in the dependencies that bring in native code is not. Hence it's under a profile that downstream users can turn on for their own builds if desired.
   
   So if I understand correctly, the limitation is that any code that statically or dynamically links to any LGPL library is out of the picture for _distribution_ with Spark. And if any downstream user wants to do it, they can but the Open Source Spark project won't. Is that correct?
   
   > So, my question is, where is the native code coming in here to do the acceleration?
   
   The main difference that `dev.ludovic.netlib` brings compared to `com.github.fommil.netlib:native_ref` or even `com.github.fommil.netlib:native_system` is that it doesn't link explicitely to any native library (OpenBLAS, MKL, libgfortran) at build-time (there is no `-l...` linker flag). Instead, it probes at run-time (via `dlopen`/`dlsym`) for a library which name is `libblas.so` (or anything else specified by the user) and which exposes certain symbols (`cblas_dgemm`, `cblas_ddot`, for example). It doesn't distribute any such library and it doesn't link to any such library (statically or dynamically) or any of their dependencies.
   
   If the user wants to use OpenBLAS or Intel MKL, or any other BLAS-like library for that matter, it's up to them to install it on their system. Then, either the library is named `libblas.so` (or there is such a symbolic filesystem link) and is accessible via LD_LIBRARY_PATH, or the user specifies the name with `-Ddev.ludovic.netlib.blas.nativeLib` or full path with `-Ddev.ludovic.netlib.blas.nativeLibPath`.
   
   Overall `dev.ludovic.netlib` doesn't link to, distribute, nor even favor any specific native BLAS implementation. So, by extension, Spark doesn't link to, distribute, nor favor any specific native BLAS implementation or external library (like libgfortran) with this patch.
   
   > I understand Java 16+ has it built in, that's the whole point of your change, but we still want to let people flip on dependencies that will build in bindings to OpenBLAS, etc. Where would that be now?
   
   `dev.ludovic.netlib.blas.ForeignBLAS` does load an arbitrary `libblas.so` library and probes for symbols like `cblas_dgemm`, `cblas_ddot`, etc. It's using `dlopen`/`dlsym` in the JVM directly.
   
   > because I see you're just depending on netlib-java core, which doesn't have that. I'm asking if that's what blas is, etc, but not sure that's it either.
   
   `blas` contains two things:
   1. the java implementations of the BLAS libraries with fallbacks to F2jBLAS (that's VectorizedBLAS for example)
   2. the bindings for native BLAS libraries (with the Foreign Linker API for now, and JNI-based bindings in the future).
   
   The contention point is in `2. the binding for native BLAS libraries` AFAIU. If it's a blocker, I should be able to break it down into `blas` and `blas-native` packages for example.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826368715


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42443/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #32253: Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-823148693


   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826815300


   I think I have fixed the PySpark unit tests failure with https://github.com/apache/spark/pull/32253/commits/d19c52777c5193f2f9196f4b71746ab990af00e1. It's currently running on GitHub Actions at https://github.com/luhenry/spark/runs/2438377858?check_suite_focus=true. I'll ping this thread as soon as there are results.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826399282


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42447/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618425073



##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
   </dependencies>
-  <profiles>
-    <profile>
-      <id>netlib-lgpl</id>

Review comment:
       I think we need to keep this profile, even if empty, to avoid failing builds that specify `-Pnetlib-lgpl` (think of it as deprecated).
   
   But, something needs to pull in the `all` artifact, or else building Spark with this profile doesn't include native libs for acceleration, right? and those need to be behind a profile. I'd not expect this profile goes away.
   
   Maybe you can show the relevant part of `mvn dependency:tree` or the result of `dev/test-dependencies.sh --replace-manifest` so we can be clear about what packages are now included at the moment. I want to make sure the default doesn't include LGPL code.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826397805


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42447/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619494235



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       As updated in the description, the order in which each implementations are used is the following:
   
   
   ```
   |                       | BLAS.nativeBLAS                                    | BLAS.javaBLAS                                      |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   | with -Pnetlib-lgpl    | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+,    | 1. dev.ludovic.netlib.blas.VectorizedBLAS          |
   |                       |    relies on the Foreign Memory API, requires      |    (JDK16+, relies on the Vector API, requires     |
   |                       |    --add-modules=jdk.incubator.foreign             |     --add-modules=jdk.incubator.vector on JDK16)   |
   |                       |     -Dforeign.restricted=warn)                     | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+)     |
   |                       | 2. dev.ludovic.netlib.blas.NetlibNativeBLAS, a     | 3. dev.ludovic.netlib.blas.JavaBLAS                |
   |                       |     wrapper for com.github.fommil:all              | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a        |
   |                       | 3. fails to load, falls back to BLAS.javaBLAS in   |     wrapper for com.github.fommil:core             |
   |                       |    org.apache.spark.ml.linalg.BLAS                 |                                                    |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   | without -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+,    | 1. dev.ludovic.netlib.blas.VectorizedBLAS          |
   |                       |    relies on the Foreign Memory API, requires      |    (JDK16+, relies on the Vector API, requires     |
   |                       |    --add-modules=jdk.incubator.foreign             |     --add-modules=jdk.incubator.vector on JDK16)   |
   |                       |     -Dforeign.restricted=warn)                     | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+)     |
   |                       | 2. fails to load, falls back to BLAS.javaBLAS in   | 3. dev.ludovic.netlib.blas.JavaBLAS                |
   |                       |    org.apache.spark.ml.linalg.BLAS                 | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a        |
   |                       |                                                    |     wrapper for com.github.fommil:core             |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617325396



##########
File path: project/SparkBuild.scala
##########
@@ -294,7 +294,7 @@ object SparkBuild extends PomBuild {
     javaOptions ++= {
       val versionParts = System.getProperty("java.version").split("[+.\\-]+", 3)
       var major = versionParts(0).toInt
-      if (major >= 16) Seq("--add-modules=jdk.incubator.vector") else Seq.empty
+      if (major >= 16) Seq("--add-modules=jdk.incubator.vector,jdk.incubator.foreign", "-Dforeign.restricted=warn") else Seq.empty

Review comment:
       Yes, to use the code relying on the Vector API, one would need to pass `--add-modules=jdk.incubator.vector`. Otherwise, it will fall back to pure JDK8-compatible code. And the `--add-modules=jdk.incubator.foreign -Dforeign.restricted=warn` is for a native-wrapper BLAS implementation based on the [Foreign Linker API](https://openjdk.java.net/jeps/389). If that's not available, it automatically falls back to the `com.github.fommil` native wrappers.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618183340



##########
File path: mllib/pom.xml
##########
@@ -142,20 +142,20 @@
       <scope>test</scope>
     </dependency>
 
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>lapack</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>arpack</artifactId>
+    </dependency>
+
   </dependencies>
-  <profiles>

Review comment:
       I also moved this dependency to https://github.com/apache/spark/pull/32253/files/#diff-9c5fb3d1b7e3b0f54bc5c4182965c4fe1f9023d449017cece3005d3f90e8e4d8R3489-R3499




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826390233


   Jenkins retest this please


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-823321335


   /cc @srowen @zhengruifeng @kiszk 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen closed pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen closed pull request #32253:
URL: https://github.com/apache/spark/pull/32253


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827652662


   **[Test build #138001 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138001/testReport)** for PR 32253 at commit [`17edfeb`](https://github.com/apache/spark/commit/17edfebb92586c77c8806f2a63daaa166f524657).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618494333



##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
   </dependencies>
-  <profiles>
-    <profile>
-      <id>netlib-lgpl</id>

Review comment:
       > I believe we'll need the profile in both places - parent is more about "declaration" and child is more about actually using it.
   
   Ok, let me revert it to how it's in `master`.
   
   > The diff looks OK, but raises the question, what's in arpack, lapack, etc? does that build in native code? I'd have to go check but looks like LAPACK et al are BSD licensed, which is OK.
   
   These only rely on`com.github.fommil.netlib:core` just like `blas`. These do not include any native code, and only provide a wrapper for `com.github.fommil.netlib-java` for now. In the future, I want to provide the same native wrappers as for `blas` but, similarly, without including any actual native code, just the JNI-based wrappers.
   
   > But it looks like nothing references the all artifact from netlib, so not sure what would build in the native code to actually do acceleration, even with the profile? I'm probably missing something.
   
   I am not sure what you don't understand. What do you mean in "__what__ would build in the native code"?
   
   If I understand correctly your question, the native code is wrapped by `com.github.fommil.netlib`. For JDK16+, `dev.ludovic.netlib` does provide a wrapper using the Foreign Linker API, and falls back to using `com.github.fommil.netlib` if the Foreign Linker API is not enabled. In the future, I do want to add a JNI-based wrapper in `dev.ludovic.netlib` directly so that I don't have to rely on `com.github.fommil.netlib` itself.
   
   What I am not sure I understand in the licensing issue is what is licensed as GPL/LGPL? Looking at https://mvnrepository.com/artifact/com.github.fommil.netlib/all/1.1.2, it all seems to be licensed as BSD. So is it the Fortran/native code that's shipped with it which is licensed as GPL and which is an issue?
   
   Thank you again for helping me navigating this problem!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618448295



##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
   </dependencies>
-  <profiles>
-    <profile>
-      <id>netlib-lgpl</id>

Review comment:
       I believe we'll need the profile in both places - parent is more about "declaration" and child is more about actually using it.
   
   The diff looks OK, but raises the question, what's in `arpack`, `lapack`, etc? does that build in native code? I'd have to go check but looks like LAPACK et al are BSD licensed, which is OK.
   
   But it looks like nothing references the `all` artifact from netlib, so not sure what would build in the native code to actually do acceleration, even with the profile? I'm probably missing something.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826378284


   **[Test build #137923 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137923/testReport)** for PR 32253 at commit [`478d3ad`](https://github.com/apache/spark/commit/478d3ada887f50b0937b87d3a0f7a7c043d25b63).
    * This patch **fails PySpark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619428947



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       No, the only dependency of interest is `com.github.fommil`. The others are junit for testing, and jmh for benchmarking (in the `dev.ludovic.netlib:benchmarks` package only).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826398299


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42447/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619494235



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       As updated in the description, the order in which each implementations are used is the following:
   
   
   ```
   |                       | BLAS.nativeBLAS                                    | BLAS.javaBLAS                                      |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   | with -Pnetlib-lgpl    | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+,    | 1. dev.ludovic.netlib.blas.VectorizedBLAS          |
   |                       |    relies on the Foreign Memory API, requires      |    (JDK16+, relies on the Vector API, requires     |
   |                       |    `--add-modules=jdk.incubator.foreign            |     `--add-modules=jdk.incubator.vector` on JDK16) |
   |                       |     -Dforeign.restricted=warn`)                    | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+)     |
   |                       | 2. dev.ludovic.netlib.blas.NetlibNativeBLAS, a     | 3. dev.ludovic.netlib.blas.JavaBLAS                |
   |                       |     wrapper for com.github.fommil:all              | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a        |
   |                       | 3. fails to load, falls back to BLAS.javaBLAS in   |     wrapper for com.github.fommil:core             |
   |                       |    org.apache.spark.ml.linalg.BLAS                 |                                                    |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   | without -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+,    | 1. dev.ludovic.netlib.blas.VectorizedBLAS          |
   |                       |    relies on the Foreign Memory API, requires      |    (JDK16+, relies on the Vector API, requires     |
   |                       |    `--add-modules=jdk.incubator.foreign            |     `--add-modules=jdk.incubator.vector` on JDK16) |
   |                       |     -Dforeign.restricted=warn`)                    | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+)     |
   |                       | 2. fails to load, falls back to BLAS.javaBLAS in   | 3. dev.ludovic.netlib.blas.JavaBLAS                |
   |                       |    org.apache.spark.ml.linalg.BLAS                 | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a        |
   |                       |                                                    |     wrapper for com.github.fommil:core             |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on pull request #32253: Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-823181385


   I'm adding support for JDK 8 with [v1.0.11](https://github.com/luhenry/netlib/tree/v1.0.11) (Benchmarks at https://github.com/luhenry/netlib/actions/runs/766849718)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619297422



##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
   </dependencies>
-  <profiles>
-    <profile>
-      <id>netlib-lgpl</id>

Review comment:
       > The typical policy is that it's OK to release software that can merely make use of such libraries at runtime (without actually distributing them directly) as long as it doesn't substantially depend on their presence. I believe that dynamic linking in the way you describe is OK - just like having an SPI in JVM code that may be provided by some other GPL code at the user's runtime.
   
   What you describe is exactly how `dev.ludovic.netlib` works. It doesn't substantially depend on OpenBLAS, MKL, or any other native BLAS library to be there as it will fall back to a pure Java implementation otherwise. The transition will be transparent, the feature will be equivalent, only the performance will be affected.
   
   > My main goal is to preserve current behavior.
   
   I fully agree with that, as we don't want to break current behavior nor bring in additional and unwanted dependencies.
   
   > Right now if someone has, say, MKL on their native lib path for the JVM, and built with this alternate profile, it'd be accelerated. If you're saying that still works, but would not require this separate build profile because of the different loading strategy, that's an improvement.
   
   That's exactly how it work with `dev.ludovic.netlib` in JDK16+ today with the implementation based on the Foreign Linker API, and that's how I will want it to work with the JNI-based implementation for JDK8 and JDK11 in the future.
   
   > Have you by chance tried this integration when OpenBLAS is present to verify it makes use of it?
   
   Yes, and it's a lot faster than F2J. The results in https://github.com/apache/spark/pull/32253#issue-619173915 for `native` are with the implementation based on the Foreign Linker API. You can see for `dgemm`, `f2j` is **18-30x slower** than `native` (aka OpenBLAS). I also needed to set `LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu` (on Ubuntu 20.04) for `libblas.so` to be on `ld` path.
   
   To make use of MKL, I only need to set `LD_LIBRARY_PATH=/opt/intel/oneapi/mkl/latest/lib/intel64:/opt/intel/oneapi/compiler/latest/linux/compiler/lib/intel64_lin` and pass `-Ddev.ludovic.netlib.blas.nativeLib=mkl_rt`.
   
   I haven't tried with other BLAS implementations.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619327385



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       Today, `dev.ludovic.netlib` does `com.github.fommil.netlib.BLAS.getInstance()` to probe whether the native libraries are installed. (This call is in `com.github.fommil:core` and is thus all good regarding licensing.) If we remove the content of the `netlib-lgpl` profile and the dependency on `com.github.fommil:all` then we will break the current behavior of probing for native BLAS implementation through `com.github.fommil:native_ref` and `com.github.fommil:native_system`.
   
   In the future, I do want to add JNI-based prober/wrapper for BLAS, LAPACK, and ARPACK (and which are compliment with licensing as discussed above), but that is not done yet. Then, we will be able to remove the reference to `com.github.fommil:all`.
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619432104



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       Dumb question, this all works perfectly fine then without the profile active, without netlib-java `all`? then I think we're OK. If the effect is the same: works without the profile, accelerated with the profile (and presence of OpenBLAS etc) then I think this change is OK.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827636333


   Jenkins retest this please


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-823148693


   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619305241



##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
   </dependencies>
-  <profiles>
-    <profile>
-      <id>netlib-lgpl</id>

Review comment:
       OK, it sounds like your solution avoids the small but important nature of the netlib-java shim that made it problematic to include. I think it was libgfortran - the profile didn't itself install BLAS or anything. That is, it wasn't depending on BLAS implementations so much as something in the shim that was linked in. I think.
   
   I don't know this in detail enough to fully evaluate but I believe in your assessment.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617099199



##########
File path: mllib/pom.xml
##########
@@ -142,20 +142,20 @@
       <scope>test</scope>
     </dependency>
 
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>lapack</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>arpack</artifactId>
+    </dependency>
+
   </dependencies>
-  <profiles>

Review comment:
       What do your libraries pull in that this profile might have? it's disabled on purpose as we can't include what it pulls in in the release itself. (We also probably need to leave the profile empty at least to avoid bulid errors on a missing profile)

##########
File path: pom.xml
##########
@@ -172,6 +172,7 @@
     <fasterxml.jackson.version>2.12.2</fasterxml.jackson.version>
     <snappy.version>1.1.8.2</snappy.version>
     <netlib.java.version>1.1.2</netlib.java.version>
+    <netlib.ludovic.dev.version>1.0.10</netlib.ludovic.dev.version>

Review comment:
       (Will need to be 1.1.0 I suppose, when it's ready)

##########
File path: project/SparkBuild.scala
##########
@@ -294,7 +294,7 @@ object SparkBuild extends PomBuild {
     javaOptions ++= {
       val versionParts = System.getProperty("java.version").split("[+.\\-]+", 3)
       var major = versionParts(0).toInt
-      if (major >= 16) Seq("--add-modules=jdk.incubator.vector") else Seq.empty
+      if (major >= 16) Seq("--add-modules=jdk.incubator.vector,jdk.incubator.foreign", "-Dforeign.restricted=warn") else Seq.empty

Review comment:
       I assume the same, that these flags are required to use the vectorized code, but, that's on someone who's experimenting with Java 16 at this point.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826368715


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42443/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619494235



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       As updated in the description, the order in which each implementations are used is the following:
   
   
   ```
   |                       | BLAS.nativeBLAS                                    | BLAS.javaBLAS                                      |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   | with -Pnetlib-lgpl    | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+,    | 1. dev.ludovic.netlib.blas.VectorizedBLAS          |
   |                       |    relies on the Foreign Linker API, requires      |    (JDK16+, relies on the Vector API, requires     |
   |                       |    `--add-modules=jdk.incubator.foreign            |     `--add-modules=jdk.incubator.vector` on JDK16) |
   |                       |     -Dforeign.restricted=warn`)                    | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+)     |
   |                       | 2. dev.ludovic.netlib.blas.NetlibNativeBLAS, a     | 3. dev.ludovic.netlib.blas.JavaBLAS                |
   |                       |     wrapper for com.github.fommil:all              | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a        |
   |                       | 3. fails to load, falls back to BLAS.javaBLAS in   |     wrapper for com.github.fommil:core             |
   |                       |     org.apache.spark.ml.linalg.BLAS                |                                                    |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   | without -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+,    | 1. dev.ludovic.netlib.blas.VectorizedBLAS          |
   |                       |    relies on the Foreign Linker API, requires      |    (JDK16+, relies on the Vector API, requires     |
   |                       |    `--add-modules=jdk.incubator.foreign            |     `--add-modules=jdk.incubator.vector` on JDK16) |
   |                       |     -Dforeign.restricted=warn`)                    | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+)     |
   |                       | 2. fails to load, falls back to BLAS.javaBLAS in   | 3. dev.ludovic.netlib.blas.JavaBLAS                |
   |                       |     org.apache.spark.ml.linalg.BLAS                | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a        |
   |                       |                                                    |     wrapper for com.github.fommil:core             |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619486088



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       Long story short: yes, it works without the profile active (and is even faster, data in https://github.com/apache/spark/pull/32253#issue-619173915), and is hardware accelerated with the profile or when the Foreign Linker API is active.
   
   Let me document what gets probed and the impact of having the profile in place, I think this is going to clarify many of the questions you've had.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827652662


   **[Test build #138001 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138001/testReport)** for PR 32253 at commit [`17edfeb`](https://github.com/apache/spark/commit/17edfebb92586c77c8806f2a63daaa166f524657).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827764996


   **[Test build #138001 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138001/testReport)** for PR 32253 at commit [`17edfeb`](https://github.com/apache/spark/commit/17edfebb92586c77c8806f2a63daaa166f524657).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826358919






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619494235



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       As updated in the description, the order in which each implementations are used is the following:
   
   |                                          | BLAS.javaBLAS | BLAS.nativeBLAS |
   | ------------------------ | --------------- | ------------------- |
   | **with** -Pnetlib-lgpl      | 1. `dev.ludovic.netlib.blas.VectorizedBLAS` (JDK16+, relies on the Vector API, requires `--add-modules=jdk.incubator.vector` on JDK16)<br>2. `dev.ludovic.netlib.blas.Java11BLAS` (JDK11+)<br> 3. `dev.ludovic.netlib.blas.JavaBLAS`<br>4. `dev.ludovic.netlib.blas.NetlibF2jBLAS`, a wrapper for `com.github.fommil:core`   |  1. `dev.ludovic.netlib.blas.NetlibNativeBLAS`, a wrapper for `com.github.fommil:all`<br>2. `dev.ludovic.netlib.blas.ForeignBLAS` (JDK16+, relies on the Foreign Memory API, requires `--add-modules=jdk.incubator.foreign -Dforeign.restricted=warn`) <br>3. fails to load, falls back to BLAS.javaBLAS in `org.apache.spark.ml.linalg.BLAS` |
   | **without** -Pnetlib-lgpl | 1. `dev.ludovic.netlib.blas.VectorizedBLAS` (JDK16+, relies on the Vector API, requires `--add-modules=jdk.incubator.vector` on JDK16)<br>2. `dev.ludovic.netlib.blas.Java11BLAS` (JDK11+)<br>3. `dev.ludovic.netlib.blas.JavaBLAS`<br>4. `dev.ludovic.netlib.blas.NetlibF2jBLAS`, a wrapper for `com.github.fommil:core`  | 1. `dev.ludovic.netlib.blas.ForeignBLAS` (JDK16+, relies on the Foreign Memory API, requires `--add-modules=jdk.incubator.foreign -Dforeign.restricted=warn`)<br>2. fails to load, falls back to BLAS.javaBLAS in `org.apache.spark.ml.linalg.BLAS`  |




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619298148



##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
   </dependencies>
-  <profiles>
-    <profile>
-      <id>netlib-lgpl</id>

Review comment:
       > (One tiny side note, I think your parent POM still references GPLv2 though the license text does not)
   
   I have fixed that in the latest master. I will push it for Spark right now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619327385



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       Today, `dev.ludovic.netlib` does call `com.github.fommil.netlib.BLAS.getInstance()` to probe whether the native libraries are installed. (This call is in `com.github.fommil:core` and is thus all good regarding licensing.) If we remove the content of the `netlib-lgpl` profile and the dependency on `com.github.fommil:all` then we will break the current behavior of probing for native BLAS implementation through `com.github.fommil:native_ref` and `com.github.fommil:native_system`.
   
   In the future, I do want to add JNI-based prober/wrapper for BLAS, LAPACK, and ARPACK (and which are compliment with licensing as discussed above), but that is not done yet. Then, we will be able to remove the reference to `com.github.fommil:all`.
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r620512566



##########
File path: dev/deps/spark-deps-hadoop-3.2-hive-2.3
##########
@@ -10,6 +10,7 @@ annotations/17.0.0//annotations-17.0.0.jar
 antlr-runtime/3.5.2//antlr-runtime-3.5.2.jar
 antlr4-runtime/4.8-1//antlr4-runtime-4.8-1.jar
 aopalliance-repackaged/2.6.1//aopalliance-repackaged-2.6.1.jar
+arpack/1.3.2//arpack-1.3.2.jar

Review comment:
       Added with https://github.com/apache/spark/pull/32253/commits/17edfebb92586c77c8806f2a63daaa166f524657




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618164534



##########
File path: mllib/pom.xml
##########
@@ -142,20 +142,20 @@
       <scope>test</scope>
     </dependency>
 
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>lapack</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>arpack</artifactId>
+    </dependency>
+
   </dependencies>
-  <profiles>

Review comment:
       I have removed the dependency on `com.github.fommil.netlib:all` in `dev.ludovic.netlib`, and replaced it with `com.github.fommil.netlib:core` per your previous message. It's then still up to the Spark user to specify `-Pnetlib-lgpl` to get the native wrapper in `com.github.fommil.netlib` or not.
   
   In the longer term, I'll add JNI-based native wrappers to `dev.ludovic.netlib` so that we wouldn't have to rely on `com.github.fommil.netlib:all` at all. These native wrappers would only depend on a native BLAS implementation (OpenBLAS or Intel MKL for example) to be already installed on the machine, but wouldn't ship any GPL/LGPL code.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826399282


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42447/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619494235



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       As updated in the description, the order in which things are used is the following:
   
   |                                          | BLAS.javaBLAS | BLAS.nativeBLAS |
   | ------------------------ | --------------- | ------------------- |
   | **with** -Pnetlib-lgpl      | 1. `dev.ludovic.netlib.blas.VectorizedBLAS` (JDK16+, relies on the Vector API, requires `--add-modules=jdk.incubator.vector` on JDK16)<br>2. `dev.ludovic.netlib.blas.Java11BLAS` (JDK11+)<br> 3. `dev.ludovic.netlib.blas.JavaBLAS`<br>4. `dev.ludovic.netlib.blas.NetlibF2jBLAS`, a wrapper for `com.github.fommil:core`   |  1. `dev.ludovic.netlib.blas.NetlibNativeBLAS`, a wrapper for `com.github.fommil:all`<br>2. `dev.ludovic.netlib.blas.ForeignBLAS` (JDK16+, relies on the Foreign Memory API, requires `--add-modules=jdk.incubator.foreign -Dforeign.restricted=warn`) <br>3. fails to load, falls back to BLAS.javaBLAS in `org.apache.spark.ml.linalg.BLAS` |
   | **without** -Pnetlib-lgpl | 1. `dev.ludovic.netlib.blas.VectorizedBLAS` (JDK16+, relies on the Vector API, requires `--add-modules=jdk.incubator.vector` on JDK16)<br>2. `dev.ludovic.netlib.blas.Java11BLAS` (JDK11+)<br>3. `dev.ludovic.netlib.blas.JavaBLAS`<br>4. `dev.ludovic.netlib.blas.NetlibF2jBLAS`, a wrapper for `com.github.fommil:core`  | 1. `dev.ludovic.netlib.blas.ForeignBLAS` (JDK16+, relies on the Foreign Memory API, requires `--add-modules=jdk.incubator.foreign -Dforeign.restricted=warn`)<br>2. fails to load, falls back to BLAS.javaBLAS in `org.apache.spark.ml.linalg.BLAS`  |




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617519509



##########
File path: mllib/pom.xml
##########
@@ -142,20 +142,20 @@
       <scope>test</scope>
     </dependency>
 
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>lapack</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>arpack</artifactId>
+    </dependency>
+
   </dependencies>
-  <profiles>

Review comment:
       I believe that netlib has two parts, the core Java interface, and a fuller package including LGPL components. I can work with only the core - falls back to JVM code. The artifacts behind the profile are the complete "all" artifact. I think we'd have to follow the same. Anything importing "all" needs to be behind the profile. You can check the result by running `dev/test-dependencies.sh --replace-manifest` (which you'll need to do anyway) to see if the changes in the transitive dependencies have included the "all" artifact.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827781911


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138001/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r620509994



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>

Review comment:
       There is currently no uses for ARPACK nor LAPACK in mllib-local or graphx. The only place it's used is in `org.apache.spark.mllib.linalg.CholeskyDecomposition` and `org.apache.spark.mllib.linalg.EigenValueDecomposition`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827697454






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-823148693






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619494235



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       As updated in the description, the order in which each implementations are used is the following:
   
   
   ```
   |                       | BLAS.nativeBLAS                                    | BLAS.javaBLAS                                      |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   | with -Pnetlib-lgpl    | 1. dev.ludovic.netlib.blas.NetlibNativeBLAS, a     | 1. dev.ludovic.netlib.blas.VectorizedBLAS          |
   |                       |     wrapper for com.github.fommil:all              |    (JDK16+, relies on the Vector API, requires     |
   |                       | 2. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+,    |     `--add-modules=jdk.incubator.vector` on JDK16) |
   |                       |    relies on the Foreign Linker API, requires      | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+)     |
   |                       |    `--add-modules=jdk.incubator.foreign            | 3. dev.ludovic.netlib.blas.JavaBLAS                |
   |                       |     -Dforeign.restricted=warn`)                    | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a        |
   |                       | 3. fails to load, falls back to BLAS.javaBLAS in   |     wrapper for com.github.fommil:core             |
   |                       |     org.apache.spark.ml.linalg.BLAS                |                                                    |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   | without -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+,    | 1. dev.ludovic.netlib.blas.VectorizedBLAS          |
   |                       |    relies on the Foreign Linker API, requires      |    (JDK16+, relies on the Vector API, requires     |
   |                       |    `--add-modules=jdk.incubator.foreign            |     `--add-modules=jdk.incubator.vector` on JDK16) |
   |                       |     -Dforeign.restricted=warn`)                    | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+)     |
   |                       | 2. fails to load, falls back to BLAS.javaBLAS in   | 3. dev.ludovic.netlib.blas.JavaBLAS                |
   |                       |     org.apache.spark.ml.linalg.BLAS                | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a        |
   |                       |                                                    |     wrapper for com.github.fommil:core             |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826368715






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826888681


   The PytSpark unit tests have passed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826353943






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617343950



##########
File path: mllib/pom.xml
##########
@@ -142,20 +142,20 @@
       <scope>test</scope>
     </dependency>
 
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>lapack</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>arpack</artifactId>
+    </dependency>
+
   </dependencies>
-  <profiles>

Review comment:
       It only pulls `com.github.fommil.netlib`. The long term goal is to get rid of this dependency but it's going to require quite a lot of work still.
   
   What I don't understand is how can Spark work today if `com.github.fommil.netlib` is not installed? Are there other packages that can be dropped in place and which can seamlessly replace it? `com.github.fommil.netlib` is unconditionally added as a dependency in [graphx/pom.xml](https://github.com/apache/spark/blob/master/graphx/pom.xml#L63), and that is why Spark functions even when `-Pnetlib-lgpl` isn't defined. (The package is going to be downloaded, the jar is going to be passed to the compiler, and it's going to be on the classpath when running tests and benchmarks).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826353943


   Jenkins retest this please


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619306203



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       So if I undestand right, you can remove the `all` artifact below in the profile then? (But keep the profile as empty, with a comment that it's no longer used.)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619215551



##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
   </dependencies>
-  <profiles>
-    <profile>
-      <id>netlib-lgpl</id>

Review comment:
       That's right. The ASF needs to release artifacts that are as or no more restrictive than ALv2, and that ends mostly meaning: no binary artifacts with copy-left or field-use-of-restricted licenses like the GPL family. It does turn on releasing a copy of code, or in the case of GPL, more about linking to it.
   
   The typical policy is that it's OK to release software that can merely make use of such libraries at runtime (without actually distributing them directly) as long as it doesn't substantially depend on their presence. I believe that dynamic linking in the way you describe is OK - just like having an SPI in JVM code that may be provided by some other GPL code at the user's runtime.
   
   My main goal is to preserve current behavior. Right now if someone has, say, MKL on their native lib path for the JVM, and built with this alternate profile, it'd be accelerated. If you're saying that still works, but would not require this separate build profile because of the different loading strategy, that's an improvement.
   
   If nothing that ends up in transitive dependencies can be called statically linked to, say, libgfortran or anything GPL/LGPL, I think this is fine.
   
   Have you by chance tried this integration when OpenBLAS is present to verify it makes use of it?
   
   (One tiny side note, I think your parent POM still references GPLv2 though the license text does not)
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827021079


   ### JDK8:
   ```
   [info] OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.8.0-50-generic
   [info] Intel(R) Xeon(R) E-2276G CPU @ 3.80GHz
   [info]
   [info] f2jBLAS    = dev.ludovic.netlib.blas.NetlibF2jBLAS
   [info] javaBLAS   = dev.ludovic.netlib.blas.Java8BLAS
   [info] nativeBLAS = dev.ludovic.netlib.blas.Java8BLAS
   [info]
   [info] daxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 223            232           8        448.0           2.2       1.0X
   [info] java                                                221            228           7        453.0           2.2       1.0X
   [info]
   [info] saxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 122            128           4        821.2           1.2       1.0X
   [info] java                                                122            128           4        822.3           1.2       1.0X
   [info]
   [info] ddot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 109            112           2        921.4           1.1       1.0X
   [info] java                                                 70             74           3       1423.5           0.7       1.5X
   [info]
   [info] sdot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                  96             98           2       1046.1           1.0       1.0X
   [info] java                                                 47             49           2       2121.7           0.5       2.0X
   [info]
   [info] dscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 184            195           8        544.3           1.8       1.0X
   [info] java                                                185            196           7        539.5           1.9       1.0X
   [info]
   [info] sscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                  99            104           4       1011.9           1.0       1.0X
   [info] java                                                 99            104           4       1010.4           1.0       1.0X
   [info]
   [info] dspmv[U]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0        947.2           1.1       1.0X
   [info] java                                                  0              0           0       1584.8           0.6       1.7X
   [info]
   [info] dspr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0        867.4           1.2       1.0X
   [info] java                                                  1              1           0        865.0           1.2       1.0X
   [info]
   [info] dsyr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0        485.9           2.1       1.0X
   [info] java                                                  1              1           0        486.8           2.1       1.0X
   [info]
   [info] dgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0       1843.0           0.5       1.0X
   [info] java                                                  0              0           0       2690.6           0.4       1.5X
   [info]
   [info] dgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0       1214.7           0.8       1.0X
   [info] java                                                  0              0           0       2536.8           0.4       2.1X
   [info]
   [info] sgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0       1895.9           0.5       1.0X
   [info] java                                                  0              0           0       2961.1           0.3       1.6X
   [info]
   [info] sgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0       1223.4           0.8       1.0X
   [info] java                                                  0              0           0       3091.4           0.3       2.5X
   [info]
   [info] dgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 560            575          20       1787.1           0.6       1.0X
   [info] java                                                226            232           5       4432.4           0.2       2.5X
   [info]
   [info] dgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 570            586          23       1755.2           0.6       1.0X
   [info] java                                                227            232           4       4410.1           0.2       2.5X
   [info]
   [info] dgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 863            879          17       1158.4           0.9       1.0X
   [info] java                                                227            231           3       4407.9           0.2       3.8X
   [info]
   [info] dgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                1282           1305          23        780.0           1.3       1.0X
   [info] java                                                227            232           4       4413.4           0.2       5.7X
   [info]
   [info] sgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 538            548           8       1858.6           0.5       1.0X
   [info] java                                                221            226           3       4521.1           0.2       2.4X
   [info]
   [info] sgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 549            558          10       1819.9           0.5       1.0X
   [info] java                                                222            229           7       4503.5           0.2       2.5X
   [info]
   [info] sgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 838            852          12       1193.0           0.8       1.0X
   [info] java                                                222            229           5       4500.5           0.2       3.8X
   [info]
   [info] sgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 905            919          18       1104.8           0.9       1.0X
   [info] java                                                221            228           5       4521.3           0.2       4.1X
   ```
   
   ### JDK11:
   ```
   [info] OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.8.0-50-generic
   [info] Intel(R) Xeon(R) E-2276G CPU @ 3.80GHz
   [info]
   [info] f2jBLAS    = dev.ludovic.netlib.blas.NetlibF2jBLAS
   [info] javaBLAS   = dev.ludovic.netlib.blas.Java11BLAS
   [info] nativeBLAS = dev.ludovic.netlib.blas.Java11BLAS
   [info]
   [info] daxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 195            204          10        512.7           2.0       1.0X
   [info] java                                                195            202           7        512.4           2.0       1.0X
   [info]
   [info] saxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 108            113           4        923.3           1.1       1.0X
   [info] java                                                102            107           4        984.4           1.0       1.1X
   [info]
   [info] ddot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 107            110           3        938.1           1.1       1.0X
   [info] java                                                 69             72           3       1447.1           0.7       1.5X
   [info]
   [info] sdot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                  96             98           2       1046.5           1.0       1.0X
   [info] java                                                 43             45           2       2317.1           0.4       2.2X
   [info]
   [info] dscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 155            168           8        644.2           1.6       1.0X
   [info] java                                                158            169           8        632.8           1.6       1.0X
   [info]
   [info] sscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                  85             90           4       1178.1           0.8       1.0X
   [info] java                                                 86             90           4       1167.7           0.9       1.0X
   [info]
   [info] dspmv[U]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   0              0           0       1182.1           0.8       1.0X
   [info] java                                                  0              0           0       1432.1           0.7       1.2X
   [info]
   [info] dspr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0        898.7           1.1       1.0X
   [info] java                                                  1              1           0        891.5           1.1       1.0X
   [info]
   [info] dsyr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0        495.4           2.0       1.0X
   [info] java                                                  1              1           0        495.7           2.0       1.0X
   [info]
   [info] dgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   0              0           0       2271.6           0.4       1.0X
   [info] java                                                  0              0           0       3648.1           0.3       1.6X
   [info]
   [info] dgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0       1229.3           0.8       1.0X
   [info] java                                                  0              0           0       2711.3           0.4       2.2X
   [info]
   [info] sgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   0              0           0       2677.5           0.4       1.0X
   [info] java                                                  0              0           0       3288.2           0.3       1.2X
   [info]
   [info] sgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0       1233.0           0.8       1.0X
   [info] java                                                  0              0           0       2766.3           0.4       2.2X
   [info]
   [info] dgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 520            536          16       1923.6           0.5       1.0X
   [info] java                                                214            221           7       4669.5           0.2       2.4X
   [info]
   [info] dgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 593            612          17       1686.5           0.6       1.0X
   [info] java                                                215            219           3       4643.3           0.2       2.8X
   [info]
   [info] dgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 853            870          16       1172.8           0.9       1.0X
   [info] java                                                215            218           3       4659.7           0.2       4.0X
   [info]
   [info] dgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                1350           1370          23        740.8           1.3       1.0X
   [info] java                                                215            219           4       4656.6           0.2       6.3X
   [info]
   [info] sgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 460            468           6       2173.2           0.5       1.0X
   [info] java                                                210            213           2       4752.7           0.2       2.2X
   [info]
   [info] sgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 535            544           8       1869.3           0.5       1.0X
   [info] java                                                210            215           5       4761.8           0.2       2.5X
   [info]
   [info] sgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 843            853          11       1186.8           0.8       1.0X
   [info] java                                                209            214           4       4793.4           0.2       4.0X
   [info]
   [info] sgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 891            904          15       1122.0           0.9       1.0X
   [info] java                                                209            214           4       4777.2           0.2       4.3X
   ```
   
   ### JDK16:
   ```
   [info] OpenJDK 64-Bit Server VM 16+36 on Linux 5.8.0-50-generic
   [info] Intel(R) Xeon(R) E-2276G CPU @ 3.80GHz
   [info]
   [info] f2jBLAS    = dev.ludovic.netlib.blas.NetlibF2jBLAS
   [info] javaBLAS   = dev.ludovic.netlib.blas.VectorizedBLAS
   [info] nativeBLAS = dev.ludovic.netlib.blas.VectorizedBLAS
   [info]
   [info] daxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 194            199           7        515.7           1.9       1.0X
   [info] java                                                181            186           3        551.1           1.8       1.1X
   [info]
   [info] saxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 109            115           4        915.0           1.1       1.0X
   [info] java                                                 88             92           3       1138.8           0.9       1.2X
   [info]
   [info] ddot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 108            110           2        922.6           1.1       1.0X
   [info] java                                                 54             56           2       1839.2           0.5       2.0X
   [info]
   [info] sdot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                  96             97           2       1046.1           1.0       1.0X
   [info] java                                                 29             30           1       3393.4           0.3       3.2X
   [info]
   [info] dscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 156            165           5        643.0           1.6       1.0X
   [info] java                                                150            159           5        667.1           1.5       1.0X
   [info]
   [info] sscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                  85             91           6       1171.0           0.9       1.0X
   [info] java                                                 75             79           3       1340.6           0.7       1.1X
   [info]
   [info] dspmv[U]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0        917.0           1.1       1.0X
   [info] java                                                  0              0           0       8147.2           0.1       8.9X
   [info]
   [info] dspr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0        859.3           1.2       1.0X
   [info] java                                                  1              1           0        859.3           1.2       1.0X
   [info]
   [info] dsyr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0        482.1           2.1       1.0X
   [info] java                                                  1              1           0        482.6           2.1       1.0X
   [info]
   [info] dgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   0              0           0       2214.2           0.5       1.0X
   [info] java                                                  0              0           0       7975.8           0.1       3.6X
   [info]
   [info] dgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0       1231.4           0.8       1.0X
   [info] java                                                  0              0           0       8680.9           0.1       7.0X
   [info]
   [info] sgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   0              0           0       2684.3           0.4       1.0X
   [info] java                                                  0              0           0      18527.1           0.1       6.9X
   [info]
   [info] sgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                   1              1           0       1235.4           0.8       1.0X
   [info] java                                                  0              0           0      17347.9           0.1      14.0X
   [info]
   [info] dgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 530            552          18       1887.5           0.5       1.0X
   [info] java                                                 58             64           3      17143.9           0.1       9.1X
   [info]
   [info] dgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 598            620          17       1671.1           0.6       1.0X
   [info] java                                                 58             64           3      17196.6           0.1      10.3X
   [info]
   [info] dgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 834            847          14       1199.4           0.8       1.0X
   [info] java                                                 57             63           4      17486.9           0.1      14.6X
   [info]
   [info] dgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                1338           1366          22        747.3           1.3       1.0X
   [info] java                                                 58             63           3      17356.6           0.1      23.2X
   [info]
   [info] sgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 489            501           9       2045.5           0.5       1.0X
   [info] java                                                 36             38           2      27721.9           0.0      13.6X
   [info]
   [info] sgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 478            488           9       2094.0           0.5       1.0X
   [info] java                                                 36             38           2      27813.2           0.0      13.3X
   [info]
   [info] sgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 825            837          10       1211.6           0.8       1.0X
   [info] java                                                 35             38           2      28433.1           0.0      23.5X
   [info]
   [info] sgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 900            918          15       1111.6           0.9       1.0X
   [info] java                                                 36             38           2      28073.0           0.0      25.3X
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827781911


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138001/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826423757


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/137927/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827699944


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42521/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618549812



##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
   </dependencies>
-  <profiles>
-    <profile>
-      <id>netlib-lgpl</id>

Review comment:
       First some background - I think the issue with licensing comes from the submodules that are pulled in by `all`, like: https://github.com/fommil/netlib-java/blob/master/native_ref/xbuilds/linux-x86_64/pom.xml netlib-java generates and compiles native code. So things like https://mvnrepository.com/artifact/com.github.fommil.netlib/netlib-native_system-linux-x86_64/1.1 ship a `.so` and I believe it's that code or what it links to that is LGPL. I've honestly forgotten the details but seemed pretty clear (it may be, at least, libgfortran). At least, I recall going over this with a fine-toothed comb with Cloudera's legal department and they agreed, FWIW.
   
   Anyway, that's the reason behind all this bother with profiles. The wrapper in netlib-java is OK to include in distributions and compile against. But adding in the dependencies that bring in native code is not. Hence it's under a profile that downstream users can turn on for their own builds if desired.
   
   So, my question is, where is the native code coming in here to do the acceleration? I understand Java 16+ has it built in, that's the whole point of your change, but we still want to let people flip on dependencies that will build in bindings to OpenBLAS, etc. Where would that be now? because I see you're just depending on netlib-java `core`, which doesn't have that. I'm asking if that's what `blas` is, etc, but not sure that's it either.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826392006


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/137923/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618494333



##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
   </dependencies>
-  <profiles>
-    <profile>
-      <id>netlib-lgpl</id>

Review comment:
       > I believe we'll need the profile in both places - parent is more about "declaration" and child is more about actually using it.
   
   Ok, let me revert it to how it's in `master`.
   
   > The diff looks OK, but raises the question, what's in arpack, lapack, etc? does that build in native code? I'd have to go check but looks like LAPACK et al are BSD licensed, which is OK.
   
   These only rely on`com.github.fommil.netlib:core` just like `blas`. These do not include any native code, and only provide a wrapper for `com.github.fommil.netlib-java` for now. In the future, I want to provide the same native wrappers as for `blas` but, similarly, without including any actual native code, just the JNI-based wrappers.
   
   > But it looks like nothing references the all artifact from netlib, so not sure what would build in the native code to actually do acceleration, even with the profile? I'm probably missing something.
   
   I am not sure what you don't understand. What do you mean in "__what__ would build in the native code"?
   
   If I understand correctly your question, the native code is wrapped by `com.github.fommil.netlib` for JDK8 and JDK 11. For JDK16, `dev.ludovic.netlib` does provide a wrapper using the Foreign Linker API, and falls back to using `com.github.fommil.netlib` if the Foreign Linker API is not enabled. In the future, I do want to add a JNI-based wrapper in `dev.ludovic.netlib` directly so that I don't have to rely on `com.github.fommil.netlib` itself.
   
   What I am not sure I understand in the licensing issue is what is licensed as GPL/LGPL? Looking at https://mvnrepository.com/artifact/com.github.fommil.netlib/all/1.1.2, it all seems to be licensed as BSD. So is it the Fortran/native code that's shipped with it which is licensed as GPL and which is an issue?
   
   Thank you again for helping me navigating this problem!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826364400


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42443/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617071851



##########
File path: graphx/pom.xml
##########
@@ -60,9 +60,8 @@
       <artifactId>guava</artifactId>
     </dependency>
     <dependency>
-      <groupId>com.github.fommil.netlib</groupId>
-      <artifactId>core</artifactId>
-      <version>${netlib.java.version}</version>
+      <groupId>dev.ludovic.netlib</groupId>

Review comment:
       [v1.1.0](https://github.com/luhenry/netlib/tree/v1.1.0) relicenses to MIT.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r616916954



##########
File path: graphx/pom.xml
##########
@@ -60,9 +60,8 @@
       <artifactId>guava</artifactId>
     </dependency>
     <dependency>
-      <groupId>com.github.fommil.netlib</groupId>
-      <artifactId>core</artifactId>
-      <version>${netlib.java.version}</version>
+      <groupId>dev.ludovic.netlib</groupId>

Review comment:
       Is your project GPL? that won't fly in an ASF project I'm afraid if so. (The relationship to the netlib code is already quite tricky w.r.t. licensing)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826358919


   **[Test build #137923 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137923/testReport)** for PR 32253 at commit [`478d3ad`](https://github.com/apache/spark/commit/478d3ada887f50b0937b87d3a0f7a7c043d25b63).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry edited a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry edited a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826888681


   The PytSpark unit tests have passed. I'm running BLASBenchmark locally with JDK8, JDK11, and JDK16, and I'll paste the results here as soon as it's done (~1-2 hours).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619494235



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       As updated in the description, the order in which each implementations are used is the following:
   
   
   ```
   |                       | BLAS.nativeBLAS                                    | BLAS.javaBLAS                                      |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   | with -Pnetlib-lgpl    | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+,    | 1. dev.ludovic.netlib.blas.VectorizedBLAS          |
   |                       |    relies on the Foreign Memory API, requires      |    (JDK16+, relies on the Vector API, requires     |
   |                       |    `--add-modules=jdk.incubator.foreign            |     `--add-modules=jdk.incubator.vector` on JDK16) |
   |                       |     -Dforeign.restricted=warn`)                    | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+)     |
   |                       | 2. dev.ludovic.netlib.blas.NetlibNativeBLAS, a     | 3. dev.ludovic.netlib.blas.JavaBLAS                |
   |                       |     wrapper for com.github.fommil:all              | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a        |
   |                       | 3. fails to load, falls back to BLAS.javaBLAS in   |     wrapper for com.github.fommil:core             |
   |                       |     org.apache.spark.ml.linalg.BLAS                |                                                    |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   | without -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+,    | 1. dev.ludovic.netlib.blas.VectorizedBLAS          |
   |                       |    relies on the Foreign Memory API, requires      |    (JDK16+, relies on the Vector API, requires     |
   |                       |    `--add-modules=jdk.incubator.foreign            |     `--add-modules=jdk.incubator.vector` on JDK16) |
   |                       |     -Dforeign.restricted=warn`)                    | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+)     |
   |                       | 2. fails to load, falls back to BLAS.javaBLAS in   | 3. dev.ludovic.netlib.blas.JavaBLAS                |
   |                       |     org.apache.spark.ml.linalg.BLAS                | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a        |
   |                       |                                                    |     wrapper for com.github.fommil:core             |
   | --------------------- | -------------------------------------------------- | -------------------------------------------------- |
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618439810



##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
   </dependencies>
-  <profiles>
-    <profile>
-      <id>netlib-lgpl</id>

Review comment:
       Is it enough to have this profile in `pom.xml` ([like so](https://github.com/apache/spark/pull/32253/files/#diff-9c5fb3d1b7e3b0f54bc5c4182965c4fe1f9023d449017cece3005d3f90e8e4d8R3489-R3499)) or does it need to be in this `mllib-local/pom.xml`?
   
   The results from `dev/test-dependencies.sh --replace-manifest` __without__ `-Pnetlib-lgpl` are at https://gist.github.com/luhenry/ceb64c52ad63c69ed0ef2aa205ddde25. The only added dependencies are the blas, lapack, and arpack packages from `dev.ludovic.netlib`.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619357668



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       I see, so `all` is still needed, and needs to be behind the profile. While not required for license reasons, do any other of your dependencies only makes sense when paired with the `all` artifact, and so should be in the profile too?
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617003887



##########
File path: graphx/pom.xml
##########
@@ -60,9 +60,8 @@
       <artifactId>guava</artifactId>
     </dependency>
     <dependency>
-      <groupId>com.github.fommil.netlib</groupId>
-      <artifactId>core</artifactId>
-      <version>${netlib.java.version}</version>
+      <groupId>dev.ludovic.netlib</groupId>

Review comment:
       It is currently GPLv2 with Classpath Exception, the same as the OpenJDK. I'm open to changing the licensing if it's a blocking issue.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826358919






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826392006


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/137923/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826392271


   **[Test build #137927 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137927/testReport)** for PR 32253 at commit [`478d3ad`](https://github.com/apache/spark/commit/478d3ada887f50b0937b87d3a0f7a7c043d25b63).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827844059


   Merged to master. Thanks, this is a step forward in better acceleration for sure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826358919


   **[Test build #137923 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137923/testReport)** for PR 32253 at commit [`478d3ad`](https://github.com/apache/spark/commit/478d3ada887f50b0937b87d3a0f7a7c043d25b63).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826392271


   **[Test build #137927 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137927/testReport)** for PR 32253 at commit [`478d3ad`](https://github.com/apache/spark/commit/478d3ada887f50b0937b87d3a0f7a7c043d25b63).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826414322


   **[Test build #137927 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137927/testReport)** for PR 32253 at commit [`478d3ad`](https://github.com/apache/spark/commit/478d3ada887f50b0937b87d3a0f7a7c043d25b63).
    * This patch **fails PySpark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617071851



##########
File path: graphx/pom.xml
##########
@@ -60,9 +60,8 @@
       <artifactId>guava</artifactId>
     </dependency>
     <dependency>
-      <groupId>com.github.fommil.netlib</groupId>
-      <artifactId>core</artifactId>
-      <version>${netlib.java.version}</version>
+      <groupId>dev.ludovic.netlib</groupId>

Review comment:
       [v1.1.0](https://github.com/luhenry/netlib/tree/v1.1.0) relicenses to MIT. I'll update this PR as soon as it's released (benchmarks take ~12 hours to run and I require them for a release).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826423757


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/137927/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827699944


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42521/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619327385



##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>

Review comment:
       Today, `dev.ludovic.netlib` does call `com.github.fommil.netlib.BLAS.getInstance()` to probe whether the native libraries are installed. (This call is in `com.github.fommil:core` and is thus all good regarding licensing.) If we remove the content of the `netlib-lgpl` profile and the dependency on `com.github.fommil:all` then we will break the current behavior of probing for native BLAS implementation through `com.github.fommil:native_ref` and `com.github.fommil:native_system` when the user enables `-Pnetlib-lgpl`.
   
   In the future, I do want to add JNI-based prober/wrapper for BLAS, LAPACK, and ARPACK (and which are compliment with licensing as discussed above), but that is not done yet. Then, we will be able to remove the reference to `com.github.fommil:all`.
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r620373383



##########
File path: dev/deps/spark-deps-hadoop-3.2-hive-2.3
##########
@@ -10,6 +10,7 @@ annotations/17.0.0//annotations-17.0.0.jar
 antlr-runtime/3.5.2//antlr-runtime-3.5.2.jar
 antlr4-runtime/4.8-1//antlr4-runtime-4.8-1.jar
 aopalliance-repackaged/2.6.1//aopalliance-repackaged-2.6.1.jar
+arpack/1.3.2//arpack-1.3.2.jar

Review comment:
       There's one more minor thing to do here. Under licenses-binary/ , include a file like `LICENSE-blas.txt` and/or separate ones for arpack, lapack artifacts as you wish (name doesn't matter, doesn't have to be 3 - really it's whatever satisifies you, as long as we have a copy that's connectable to your artifacts) with the text of the MIT license file. In `LICENSE-binary`, you'll see a list of MIT dependencies. Just add them in there too.
   
   I don't think any dependencies ultimately went away so there's nothing to remove, AFAICT.
   
   It's a formality but just ensures the license requirements are buttoned up.

##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>

Review comment:
       So graphx and mllib-local would not need ARPACK, LAPACK?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org