You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/04/20 09:51:44 UTC
[GitHub] [spark] luhenry opened a new pull request #32253: Accelerate fallback BLAS with dev.ludovic.netlib
luhenry opened a new pull request #32253:
URL: https://github.com/apache/spark/pull/32253
<!--
Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
4. Be sure to keep the PR description updated to reflect all changes.
5. Please write your PR title to summarize what this PR proposes.
6. If possible, provide a concise example to reproduce the issue for a faster review.
7. If you want to add a new configuration, please read the guideline first for naming configurations in
'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
-->
### What changes were proposed in this pull request?
<!--
Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue.
If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
2. If you fix some SQL features, you can provide some references of other DBMSes.
3. If there is design documentation, please add the link.
4. If there is a discussion in the mailing list, please add the link.
-->
Following https://github.com/apache/spark/pull/30810, I've continued looking for ways to accelerate the usage of BLAS in Spark. With this PR, I integrate work done in the [`dev.ludovic.netlib`](https://github.com/luhenry/netlib/) Maven package.
The `dev.ludovic.netlib` library wraps the original `com.github.fommil.netlib` library and focus on accelerating the linear algebra routines in use in Spark. When running the `org.apache.spark.ml.linalg.BLASBenchmark` benchmarking suite, I get the results at [1] on an Intel machine. Moreover, this library is thoroughly tested to return the exact same results as the reference implementation.
Under the hood, it reimplements the necessary algorithms in pure autovectorization-friendly Java 8, as well as takes advantage of the Vector API and Foreign Linker API introduced in JDK 16 when available.
### Why are the changes needed?
<!--
Please clarify why the changes are needed. For instance,
1. If you propose a new API, clarify the use case for a new API.
2. If you fix a bug, you can clarify why it is a bug.
-->
Accelerates linear algebra operations when the pure-java fallback method is in use. Transparently falls back to native implementation (OpenBLAS, MKL) when available.
### Does this PR introduce _any_ user-facing change?
<!--
Note that it means *any* user-facing change including all aspects such as the documentation fix.
If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
If no, write 'No'.
-->
No, all changes are transparent to the user.
### How was this patch tested?
<!--
If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
If tests were not added, please describe why they were not added and/or why it was difficult to add.
-->
The `dev.ludovic.netlib` library has its own test suite [2]. It has also been validated by running the Spark test suite and benchmarking suite.
[1] https://gist.github.com/luhenry/691d1f497595b3404a6180b6c57336e0
[2] https://github.com/luhenry/netlib/tree/master/blas/src/test/java/dev/ludovic/netlib/blas
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617024314
##########
File path: graphx/pom.xml
##########
@@ -60,9 +60,8 @@
<artifactId>guava</artifactId>
</dependency>
<dependency>
- <groupId>com.github.fommil.netlib</groupId>
- <artifactId>core</artifactId>
- <version>${netlib.java.version}</version>
+ <groupId>dev.ludovic.netlib</groupId>
Review comment:
It would block including it, as the binary releases would include the software. If you're open to it, ALv2 is obviously the least contentious w.r.t. the ASF. Others are OK. Full story: https://www.apache.org/legal/resolved.html
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619069785
##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
</dependencies>
- <profiles>
- <profile>
- <id>netlib-lgpl</id>
Review comment:
> First some background - I think the issue with licensing comes from the submodules that are pulled in by all, like: https://github.com/fommil/netlib-java/blob/master/native_ref/xbuilds/linux-x86_64/pom.xml netlib-java generates and compiles native code. So things like https://mvnrepository.com/artifact/com.github.fommil.netlib/netlib-native_system-linux-x86_64/1.1 ship a .so and I believe it's that code or what it links to that is LGPL. I've honestly forgotten the details but seemed pretty clear (it may be, at least, libgfortran). At least, I recall going over this with a fine-toothed comb with Cloudera's legal department and they agreed, FWIW.
I can confirm that both `netlib_ref` and `netlib_system` do link to `libgfortran` and which is licensed with LGPL. See [[1]](https://github.com/fommil/netlib-java/blob/master/native_ref/xbuilds/linux-x86_64/pom.xml#L66) and [[2]](https://github.com/fommil/netlib-java/blob/master/native_system/xbuilds/linux-x86_64/pom.xml#L77).
> Anyway, that's the reason behind all this bother with profiles. The wrapper in netlib-java is OK to include in distributions and compile against. But adding in the dependencies that bring in native code is not. Hence it's under a profile that downstream users can turn on for their own builds if desired.
So if I understand correctly, the limitation is that any code that statically or dynamically links to any LGPL library is out of the picture for _distribution_ with Spark. And if any downstream user wants to do it, they can but the Open Source Spark project won't. Is that correct?
> So, my question is, where is the native code coming in here to do the acceleration?
The main difference that `dev.ludovic.netlib` brings compared to `com.github.fommil.netlib:native_ref` or even `com.github.fommil.netlib:native_system` is that it doesn't link explicitely to any native library (OpenBLAS, MKL, libgfortran) at build-time (there is no `-l...` linker flag). Instead, it probes at run-time (via `dlopen`/`dlsym`) for a library which name is `libblas.so` (or anything else specified by the user) and which exposes certain symbols (`cblas_dgemm`, `cblas_ddot`, for example). It doesn't distribute any such library and it doesn't link to any such library (statically or dynamically) or any of their dependencies.
If the user wants to use OpenBLAS or Intel MKL, or any other BLAS-like library for that matter, it's up to them to install it on their system. Then, either the library is named `libblas.so` (or there is such a symbolic filesystem link) and is accessible via LD_LIBRARY_PATH, or the user specifies the name with `-Ddev.ludovic.netlib.blas.nativeLib` or full path with `-Ddev.ludovic.netlib.blas.nativeLibPath`.
Overall `dev.ludovic.netlib` doesn't link to, distribute, nor even favor any specific native BLAS implementation. So, by extension, Spark doesn't link to, distribute, nor favor any specific native BLAS implementation or external library (like libgfortran) with this patch.
> I understand Java 16+ has it built in, that's the whole point of your change, but we still want to let people flip on dependencies that will build in bindings to OpenBLAS, etc. Where would that be now?
`dev.ludovic.netlib.blas.ForeignBLAS` does load an arbitrary `libblas.so` library and probes for symbols like `cblas_dgemm`, `cblas_ddot`, etc. It's using `dlopen`/`dlsym` in the JVM directly.
> because I see you're just depending on netlib-java core, which doesn't have that. I'm asking if that's what blas is, etc, but not sure that's it either.
`blas` contains two things:
1. the java implementations of the BLAS libraries with fallbacks to F2jBLAS (that's VectorizedBLAS for example)
2. the bindings for native BLAS libraries (with the Foreign Linker API for now, and JNI-based bindings in the future).
The contention point is in `2. the binding for native BLAS libraries` AFAIU. If it's a blocker, I should be able to break it down into `blas` and `blas-native` packages for example.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826368715
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42443/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #32253: Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-823148693
Can one of the admins verify this patch?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826815300
I think I have fixed the PySpark unit tests failure with https://github.com/apache/spark/pull/32253/commits/d19c52777c5193f2f9196f4b71746ab990af00e1. It's currently running on GitHub Actions at https://github.com/luhenry/spark/runs/2438377858?check_suite_focus=true. I'll ping this thread as soon as there are results.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826399282
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42447/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618425073
##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
</dependencies>
- <profiles>
- <profile>
- <id>netlib-lgpl</id>
Review comment:
I think we need to keep this profile, even if empty, to avoid failing builds that specify `-Pnetlib-lgpl` (think of it as deprecated).
But, something needs to pull in the `all` artifact, or else building Spark with this profile doesn't include native libs for acceleration, right? and those need to be behind a profile. I'd not expect this profile goes away.
Maybe you can show the relevant part of `mvn dependency:tree` or the result of `dev/test-dependencies.sh --replace-manifest` so we can be clear about what packages are now included at the moment. I want to make sure the default doesn't include LGPL code.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826397805
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42447/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619494235
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
As updated in the description, the order in which each implementations are used is the following:
```
| | BLAS.nativeBLAS | BLAS.javaBLAS |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
| with -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+, | 1. dev.ludovic.netlib.blas.VectorizedBLAS |
| | relies on the Foreign Memory API, requires | (JDK16+, relies on the Vector API, requires |
| | --add-modules=jdk.incubator.foreign | --add-modules=jdk.incubator.vector on JDK16) |
| | -Dforeign.restricted=warn) | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+) |
| | 2. dev.ludovic.netlib.blas.NetlibNativeBLAS, a | 3. dev.ludovic.netlib.blas.JavaBLAS |
| | wrapper for com.github.fommil:all | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a |
| | 3. fails to load, falls back to BLAS.javaBLAS in | wrapper for com.github.fommil:core |
| | org.apache.spark.ml.linalg.BLAS | |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
| without -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+, | 1. dev.ludovic.netlib.blas.VectorizedBLAS |
| | relies on the Foreign Memory API, requires | (JDK16+, relies on the Vector API, requires |
| | --add-modules=jdk.incubator.foreign | --add-modules=jdk.incubator.vector on JDK16) |
| | -Dforeign.restricted=warn) | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+) |
| | 2. fails to load, falls back to BLAS.javaBLAS in | 3. dev.ludovic.netlib.blas.JavaBLAS |
| | org.apache.spark.ml.linalg.BLAS | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a |
| | | wrapper for com.github.fommil:core |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617325396
##########
File path: project/SparkBuild.scala
##########
@@ -294,7 +294,7 @@ object SparkBuild extends PomBuild {
javaOptions ++= {
val versionParts = System.getProperty("java.version").split("[+.\\-]+", 3)
var major = versionParts(0).toInt
- if (major >= 16) Seq("--add-modules=jdk.incubator.vector") else Seq.empty
+ if (major >= 16) Seq("--add-modules=jdk.incubator.vector,jdk.incubator.foreign", "-Dforeign.restricted=warn") else Seq.empty
Review comment:
Yes, to use the code relying on the Vector API, one would need to pass `--add-modules=jdk.incubator.vector`. Otherwise, it will fall back to pure JDK8-compatible code. And the `--add-modules=jdk.incubator.foreign -Dforeign.restricted=warn` is for a native-wrapper BLAS implementation based on the [Foreign Linker API](https://openjdk.java.net/jeps/389). If that's not available, it automatically falls back to the `com.github.fommil` native wrappers.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618183340
##########
File path: mllib/pom.xml
##########
@@ -142,20 +142,20 @@
<scope>test</scope>
</dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>lapack</artifactId>
+ </dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>arpack</artifactId>
+ </dependency>
+
</dependencies>
- <profiles>
Review comment:
I also moved this dependency to https://github.com/apache/spark/pull/32253/files/#diff-9c5fb3d1b7e3b0f54bc5c4182965c4fe1f9023d449017cece3005d3f90e8e4d8R3489-R3499
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826390233
Jenkins retest this please
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-823321335
/cc @srowen @zhengruifeng @kiszk
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen closed pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen closed pull request #32253:
URL: https://github.com/apache/spark/pull/32253
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827652662
**[Test build #138001 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138001/testReport)** for PR 32253 at commit [`17edfeb`](https://github.com/apache/spark/commit/17edfebb92586c77c8806f2a63daaa166f524657).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618494333
##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
</dependencies>
- <profiles>
- <profile>
- <id>netlib-lgpl</id>
Review comment:
> I believe we'll need the profile in both places - parent is more about "declaration" and child is more about actually using it.
Ok, let me revert it to how it's in `master`.
> The diff looks OK, but raises the question, what's in arpack, lapack, etc? does that build in native code? I'd have to go check but looks like LAPACK et al are BSD licensed, which is OK.
These only rely on`com.github.fommil.netlib:core` just like `blas`. These do not include any native code, and only provide a wrapper for `com.github.fommil.netlib-java` for now. In the future, I want to provide the same native wrappers as for `blas` but, similarly, without including any actual native code, just the JNI-based wrappers.
> But it looks like nothing references the all artifact from netlib, so not sure what would build in the native code to actually do acceleration, even with the profile? I'm probably missing something.
I am not sure what you don't understand. What do you mean in "__what__ would build in the native code"?
If I understand correctly your question, the native code is wrapped by `com.github.fommil.netlib`. For JDK16+, `dev.ludovic.netlib` does provide a wrapper using the Foreign Linker API, and falls back to using `com.github.fommil.netlib` if the Foreign Linker API is not enabled. In the future, I do want to add a JNI-based wrapper in `dev.ludovic.netlib` directly so that I don't have to rely on `com.github.fommil.netlib` itself.
What I am not sure I understand in the licensing issue is what is licensed as GPL/LGPL? Looking at https://mvnrepository.com/artifact/com.github.fommil.netlib/all/1.1.2, it all seems to be licensed as BSD. So is it the Fortran/native code that's shipped with it which is licensed as GPL and which is an issue?
Thank you again for helping me navigating this problem!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618448295
##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
</dependencies>
- <profiles>
- <profile>
- <id>netlib-lgpl</id>
Review comment:
I believe we'll need the profile in both places - parent is more about "declaration" and child is more about actually using it.
The diff looks OK, but raises the question, what's in `arpack`, `lapack`, etc? does that build in native code? I'd have to go check but looks like LAPACK et al are BSD licensed, which is OK.
But it looks like nothing references the `all` artifact from netlib, so not sure what would build in the native code to actually do acceleration, even with the profile? I'm probably missing something.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826378284
**[Test build #137923 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137923/testReport)** for PR 32253 at commit [`478d3ad`](https://github.com/apache/spark/commit/478d3ada887f50b0937b87d3a0f7a7c043d25b63).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619428947
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
No, the only dependency of interest is `com.github.fommil`. The others are junit for testing, and jmh for benchmarking (in the `dev.ludovic.netlib:benchmarks` package only).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826398299
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42447/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619494235
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
As updated in the description, the order in which each implementations are used is the following:
```
| | BLAS.nativeBLAS | BLAS.javaBLAS |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
| with -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+, | 1. dev.ludovic.netlib.blas.VectorizedBLAS |
| | relies on the Foreign Memory API, requires | (JDK16+, relies on the Vector API, requires |
| | `--add-modules=jdk.incubator.foreign | `--add-modules=jdk.incubator.vector` on JDK16) |
| | -Dforeign.restricted=warn`) | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+) |
| | 2. dev.ludovic.netlib.blas.NetlibNativeBLAS, a | 3. dev.ludovic.netlib.blas.JavaBLAS |
| | wrapper for com.github.fommil:all | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a |
| | 3. fails to load, falls back to BLAS.javaBLAS in | wrapper for com.github.fommil:core |
| | org.apache.spark.ml.linalg.BLAS | |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
| without -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+, | 1. dev.ludovic.netlib.blas.VectorizedBLAS |
| | relies on the Foreign Memory API, requires | (JDK16+, relies on the Vector API, requires |
| | `--add-modules=jdk.incubator.foreign | `--add-modules=jdk.incubator.vector` on JDK16) |
| | -Dforeign.restricted=warn`) | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+) |
| | 2. fails to load, falls back to BLAS.javaBLAS in | 3. dev.ludovic.netlib.blas.JavaBLAS |
| | org.apache.spark.ml.linalg.BLAS | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a |
| | | wrapper for com.github.fommil:core |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on pull request #32253: Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-823181385
I'm adding support for JDK 8 with [v1.0.11](https://github.com/luhenry/netlib/tree/v1.0.11) (Benchmarks at https://github.com/luhenry/netlib/actions/runs/766849718)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619297422
##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
</dependencies>
- <profiles>
- <profile>
- <id>netlib-lgpl</id>
Review comment:
> The typical policy is that it's OK to release software that can merely make use of such libraries at runtime (without actually distributing them directly) as long as it doesn't substantially depend on their presence. I believe that dynamic linking in the way you describe is OK - just like having an SPI in JVM code that may be provided by some other GPL code at the user's runtime.
What you describe is exactly how `dev.ludovic.netlib` works. It doesn't substantially depend on OpenBLAS, MKL, or any other native BLAS library to be there as it will fall back to a pure Java implementation otherwise. The transition will be transparent, the feature will be equivalent, only the performance will be affected.
> My main goal is to preserve current behavior.
I fully agree with that, as we don't want to break current behavior nor bring in additional and unwanted dependencies.
> Right now if someone has, say, MKL on their native lib path for the JVM, and built with this alternate profile, it'd be accelerated. If you're saying that still works, but would not require this separate build profile because of the different loading strategy, that's an improvement.
That's exactly how it work with `dev.ludovic.netlib` in JDK16+ today with the implementation based on the Foreign Linker API, and that's how I will want it to work with the JNI-based implementation for JDK8 and JDK11 in the future.
> Have you by chance tried this integration when OpenBLAS is present to verify it makes use of it?
Yes, and it's a lot faster than F2J. The results in https://github.com/apache/spark/pull/32253#issue-619173915 for `native` are with the implementation based on the Foreign Linker API. You can see for `dgemm`, `f2j` is **18-30x slower** than `native` (aka OpenBLAS). I also needed to set `LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu` (on Ubuntu 20.04) for `libblas.so` to be on `ld` path.
To make use of MKL, I only need to set `LD_LIBRARY_PATH=/opt/intel/oneapi/mkl/latest/lib/intel64:/opt/intel/oneapi/compiler/latest/linux/compiler/lib/intel64_lin` and pass `-Ddev.ludovic.netlib.blas.nativeLib=mkl_rt`.
I haven't tried with other BLAS implementations.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619327385
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
Today, `dev.ludovic.netlib` does `com.github.fommil.netlib.BLAS.getInstance()` to probe whether the native libraries are installed. (This call is in `com.github.fommil:core` and is thus all good regarding licensing.) If we remove the content of the `netlib-lgpl` profile and the dependency on `com.github.fommil:all` then we will break the current behavior of probing for native BLAS implementation through `com.github.fommil:native_ref` and `com.github.fommil:native_system`.
In the future, I do want to add JNI-based prober/wrapper for BLAS, LAPACK, and ARPACK (and which are compliment with licensing as discussed above), but that is not done yet. Then, we will be able to remove the reference to `com.github.fommil:all`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619432104
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
Dumb question, this all works perfectly fine then without the profile active, without netlib-java `all`? then I think we're OK. If the effect is the same: works without the profile, accelerated with the profile (and presence of OpenBLAS etc) then I think this change is OK.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827636333
Jenkins retest this please
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-823148693
Can one of the admins verify this patch?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619305241
##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
</dependencies>
- <profiles>
- <profile>
- <id>netlib-lgpl</id>
Review comment:
OK, it sounds like your solution avoids the small but important nature of the netlib-java shim that made it problematic to include. I think it was libgfortran - the profile didn't itself install BLAS or anything. That is, it wasn't depending on BLAS implementations so much as something in the shim that was linked in. I think.
I don't know this in detail enough to fully evaluate but I believe in your assessment.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617099199
##########
File path: mllib/pom.xml
##########
@@ -142,20 +142,20 @@
<scope>test</scope>
</dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>lapack</artifactId>
+ </dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>arpack</artifactId>
+ </dependency>
+
</dependencies>
- <profiles>
Review comment:
What do your libraries pull in that this profile might have? it's disabled on purpose as we can't include what it pulls in in the release itself. (We also probably need to leave the profile empty at least to avoid bulid errors on a missing profile)
##########
File path: pom.xml
##########
@@ -172,6 +172,7 @@
<fasterxml.jackson.version>2.12.2</fasterxml.jackson.version>
<snappy.version>1.1.8.2</snappy.version>
<netlib.java.version>1.1.2</netlib.java.version>
+ <netlib.ludovic.dev.version>1.0.10</netlib.ludovic.dev.version>
Review comment:
(Will need to be 1.1.0 I suppose, when it's ready)
##########
File path: project/SparkBuild.scala
##########
@@ -294,7 +294,7 @@ object SparkBuild extends PomBuild {
javaOptions ++= {
val versionParts = System.getProperty("java.version").split("[+.\\-]+", 3)
var major = versionParts(0).toInt
- if (major >= 16) Seq("--add-modules=jdk.incubator.vector") else Seq.empty
+ if (major >= 16) Seq("--add-modules=jdk.incubator.vector,jdk.incubator.foreign", "-Dforeign.restricted=warn") else Seq.empty
Review comment:
I assume the same, that these flags are required to use the vectorized code, but, that's on someone who's experimenting with Java 16 at this point.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826368715
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42443/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619494235
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
As updated in the description, the order in which each implementations are used is the following:
```
| | BLAS.nativeBLAS | BLAS.javaBLAS |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
| with -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+, | 1. dev.ludovic.netlib.blas.VectorizedBLAS |
| | relies on the Foreign Linker API, requires | (JDK16+, relies on the Vector API, requires |
| | `--add-modules=jdk.incubator.foreign | `--add-modules=jdk.incubator.vector` on JDK16) |
| | -Dforeign.restricted=warn`) | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+) |
| | 2. dev.ludovic.netlib.blas.NetlibNativeBLAS, a | 3. dev.ludovic.netlib.blas.JavaBLAS |
| | wrapper for com.github.fommil:all | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a |
| | 3. fails to load, falls back to BLAS.javaBLAS in | wrapper for com.github.fommil:core |
| | org.apache.spark.ml.linalg.BLAS | |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
| without -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+, | 1. dev.ludovic.netlib.blas.VectorizedBLAS |
| | relies on the Foreign Linker API, requires | (JDK16+, relies on the Vector API, requires |
| | `--add-modules=jdk.incubator.foreign | `--add-modules=jdk.incubator.vector` on JDK16) |
| | -Dforeign.restricted=warn`) | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+) |
| | 2. fails to load, falls back to BLAS.javaBLAS in | 3. dev.ludovic.netlib.blas.JavaBLAS |
| | org.apache.spark.ml.linalg.BLAS | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a |
| | | wrapper for com.github.fommil:core |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619486088
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
Long story short: yes, it works without the profile active (and is even faster, data in https://github.com/apache/spark/pull/32253#issue-619173915), and is hardware accelerated with the profile or when the Foreign Linker API is active.
Let me document what gets probed and the impact of having the profile in place, I think this is going to clarify many of the questions you've had.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827652662
**[Test build #138001 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138001/testReport)** for PR 32253 at commit [`17edfeb`](https://github.com/apache/spark/commit/17edfebb92586c77c8806f2a63daaa166f524657).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827764996
**[Test build #138001 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138001/testReport)** for PR 32253 at commit [`17edfeb`](https://github.com/apache/spark/commit/17edfebb92586c77c8806f2a63daaa166f524657).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826358919
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619494235
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
As updated in the description, the order in which each implementations are used is the following:
| | BLAS.javaBLAS | BLAS.nativeBLAS |
| ------------------------ | --------------- | ------------------- |
| **with** -Pnetlib-lgpl | 1. `dev.ludovic.netlib.blas.VectorizedBLAS` (JDK16+, relies on the Vector API, requires `--add-modules=jdk.incubator.vector` on JDK16)<br>2. `dev.ludovic.netlib.blas.Java11BLAS` (JDK11+)<br> 3. `dev.ludovic.netlib.blas.JavaBLAS`<br>4. `dev.ludovic.netlib.blas.NetlibF2jBLAS`, a wrapper for `com.github.fommil:core` | 1. `dev.ludovic.netlib.blas.NetlibNativeBLAS`, a wrapper for `com.github.fommil:all`<br>2. `dev.ludovic.netlib.blas.ForeignBLAS` (JDK16+, relies on the Foreign Memory API, requires `--add-modules=jdk.incubator.foreign -Dforeign.restricted=warn`) <br>3. fails to load, falls back to BLAS.javaBLAS in `org.apache.spark.ml.linalg.BLAS` |
| **without** -Pnetlib-lgpl | 1. `dev.ludovic.netlib.blas.VectorizedBLAS` (JDK16+, relies on the Vector API, requires `--add-modules=jdk.incubator.vector` on JDK16)<br>2. `dev.ludovic.netlib.blas.Java11BLAS` (JDK11+)<br>3. `dev.ludovic.netlib.blas.JavaBLAS`<br>4. `dev.ludovic.netlib.blas.NetlibF2jBLAS`, a wrapper for `com.github.fommil:core` | 1. `dev.ludovic.netlib.blas.ForeignBLAS` (JDK16+, relies on the Foreign Memory API, requires `--add-modules=jdk.incubator.foreign -Dforeign.restricted=warn`)<br>2. fails to load, falls back to BLAS.javaBLAS in `org.apache.spark.ml.linalg.BLAS` |
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619298148
##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
</dependencies>
- <profiles>
- <profile>
- <id>netlib-lgpl</id>
Review comment:
> (One tiny side note, I think your parent POM still references GPLv2 though the license text does not)
I have fixed that in the latest master. I will push it for Spark right now.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619327385
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
Today, `dev.ludovic.netlib` does call `com.github.fommil.netlib.BLAS.getInstance()` to probe whether the native libraries are installed. (This call is in `com.github.fommil:core` and is thus all good regarding licensing.) If we remove the content of the `netlib-lgpl` profile and the dependency on `com.github.fommil:all` then we will break the current behavior of probing for native BLAS implementation through `com.github.fommil:native_ref` and `com.github.fommil:native_system`.
In the future, I do want to add JNI-based prober/wrapper for BLAS, LAPACK, and ARPACK (and which are compliment with licensing as discussed above), but that is not done yet. Then, we will be able to remove the reference to `com.github.fommil:all`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r620512566
##########
File path: dev/deps/spark-deps-hadoop-3.2-hive-2.3
##########
@@ -10,6 +10,7 @@ annotations/17.0.0//annotations-17.0.0.jar
antlr-runtime/3.5.2//antlr-runtime-3.5.2.jar
antlr4-runtime/4.8-1//antlr4-runtime-4.8-1.jar
aopalliance-repackaged/2.6.1//aopalliance-repackaged-2.6.1.jar
+arpack/1.3.2//arpack-1.3.2.jar
Review comment:
Added with https://github.com/apache/spark/pull/32253/commits/17edfebb92586c77c8806f2a63daaa166f524657
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618164534
##########
File path: mllib/pom.xml
##########
@@ -142,20 +142,20 @@
<scope>test</scope>
</dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>lapack</artifactId>
+ </dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>arpack</artifactId>
+ </dependency>
+
</dependencies>
- <profiles>
Review comment:
I have removed the dependency on `com.github.fommil.netlib:all` in `dev.ludovic.netlib`, and replaced it with `com.github.fommil.netlib:core` per your previous message. It's then still up to the Spark user to specify `-Pnetlib-lgpl` to get the native wrapper in `com.github.fommil.netlib` or not.
In the longer term, I'll add JNI-based native wrappers to `dev.ludovic.netlib` so that we wouldn't have to rely on `com.github.fommil.netlib:all` at all. These native wrappers would only depend on a native BLAS implementation (OpenBLAS or Intel MKL for example) to be already installed on the machine, but wouldn't ship any GPL/LGPL code.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826399282
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42447/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619494235
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
As updated in the description, the order in which things are used is the following:
| | BLAS.javaBLAS | BLAS.nativeBLAS |
| ------------------------ | --------------- | ------------------- |
| **with** -Pnetlib-lgpl | 1. `dev.ludovic.netlib.blas.VectorizedBLAS` (JDK16+, relies on the Vector API, requires `--add-modules=jdk.incubator.vector` on JDK16)<br>2. `dev.ludovic.netlib.blas.Java11BLAS` (JDK11+)<br> 3. `dev.ludovic.netlib.blas.JavaBLAS`<br>4. `dev.ludovic.netlib.blas.NetlibF2jBLAS`, a wrapper for `com.github.fommil:core` | 1. `dev.ludovic.netlib.blas.NetlibNativeBLAS`, a wrapper for `com.github.fommil:all`<br>2. `dev.ludovic.netlib.blas.ForeignBLAS` (JDK16+, relies on the Foreign Memory API, requires `--add-modules=jdk.incubator.foreign -Dforeign.restricted=warn`) <br>3. fails to load, falls back to BLAS.javaBLAS in `org.apache.spark.ml.linalg.BLAS` |
| **without** -Pnetlib-lgpl | 1. `dev.ludovic.netlib.blas.VectorizedBLAS` (JDK16+, relies on the Vector API, requires `--add-modules=jdk.incubator.vector` on JDK16)<br>2. `dev.ludovic.netlib.blas.Java11BLAS` (JDK11+)<br>3. `dev.ludovic.netlib.blas.JavaBLAS`<br>4. `dev.ludovic.netlib.blas.NetlibF2jBLAS`, a wrapper for `com.github.fommil:core` | 1. `dev.ludovic.netlib.blas.ForeignBLAS` (JDK16+, relies on the Foreign Memory API, requires `--add-modules=jdk.incubator.foreign -Dforeign.restricted=warn`)<br>2. fails to load, falls back to BLAS.javaBLAS in `org.apache.spark.ml.linalg.BLAS` |
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617519509
##########
File path: mllib/pom.xml
##########
@@ -142,20 +142,20 @@
<scope>test</scope>
</dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>lapack</artifactId>
+ </dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>arpack</artifactId>
+ </dependency>
+
</dependencies>
- <profiles>
Review comment:
I believe that netlib has two parts, the core Java interface, and a fuller package including LGPL components. I can work with only the core - falls back to JVM code. The artifacts behind the profile are the complete "all" artifact. I think we'd have to follow the same. Anything importing "all" needs to be behind the profile. You can check the result by running `dev/test-dependencies.sh --replace-manifest` (which you'll need to do anyway) to see if the changes in the transitive dependencies have included the "all" artifact.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827781911
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138001/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r620509994
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
Review comment:
There is currently no uses for ARPACK nor LAPACK in mllib-local or graphx. The only place it's used is in `org.apache.spark.mllib.linalg.CholeskyDecomposition` and `org.apache.spark.mllib.linalg.EigenValueDecomposition`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827697454
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-823148693
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619494235
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
As updated in the description, the order in which each implementations are used is the following:
```
| | BLAS.nativeBLAS | BLAS.javaBLAS |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
| with -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.NetlibNativeBLAS, a | 1. dev.ludovic.netlib.blas.VectorizedBLAS |
| | wrapper for com.github.fommil:all | (JDK16+, relies on the Vector API, requires |
| | 2. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+, | `--add-modules=jdk.incubator.vector` on JDK16) |
| | relies on the Foreign Linker API, requires | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+) |
| | `--add-modules=jdk.incubator.foreign | 3. dev.ludovic.netlib.blas.JavaBLAS |
| | -Dforeign.restricted=warn`) | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a |
| | 3. fails to load, falls back to BLAS.javaBLAS in | wrapper for com.github.fommil:core |
| | org.apache.spark.ml.linalg.BLAS | |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
| without -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+, | 1. dev.ludovic.netlib.blas.VectorizedBLAS |
| | relies on the Foreign Linker API, requires | (JDK16+, relies on the Vector API, requires |
| | `--add-modules=jdk.incubator.foreign | `--add-modules=jdk.incubator.vector` on JDK16) |
| | -Dforeign.restricted=warn`) | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+) |
| | 2. fails to load, falls back to BLAS.javaBLAS in | 3. dev.ludovic.netlib.blas.JavaBLAS |
| | org.apache.spark.ml.linalg.BLAS | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a |
| | | wrapper for com.github.fommil:core |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826368715
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826888681
The PytSpark unit tests have passed.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826353943
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617343950
##########
File path: mllib/pom.xml
##########
@@ -142,20 +142,20 @@
<scope>test</scope>
</dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>lapack</artifactId>
+ </dependency>
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>arpack</artifactId>
+ </dependency>
+
</dependencies>
- <profiles>
Review comment:
It only pulls `com.github.fommil.netlib`. The long term goal is to get rid of this dependency but it's going to require quite a lot of work still.
What I don't understand is how can Spark work today if `com.github.fommil.netlib` is not installed? Are there other packages that can be dropped in place and which can seamlessly replace it? `com.github.fommil.netlib` is unconditionally added as a dependency in [graphx/pom.xml](https://github.com/apache/spark/blob/master/graphx/pom.xml#L63), and that is why Spark functions even when `-Pnetlib-lgpl` isn't defined. (The package is going to be downloaded, the jar is going to be passed to the compiler, and it's going to be on the classpath when running tests and benchmarks).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826353943
Jenkins retest this please
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619306203
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
So if I undestand right, you can remove the `all` artifact below in the profile then? (But keep the profile as empty, with a comment that it's no longer used.)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619215551
##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
</dependencies>
- <profiles>
- <profile>
- <id>netlib-lgpl</id>
Review comment:
That's right. The ASF needs to release artifacts that are as or no more restrictive than ALv2, and that ends mostly meaning: no binary artifacts with copy-left or field-use-of-restricted licenses like the GPL family. It does turn on releasing a copy of code, or in the case of GPL, more about linking to it.
The typical policy is that it's OK to release software that can merely make use of such libraries at runtime (without actually distributing them directly) as long as it doesn't substantially depend on their presence. I believe that dynamic linking in the way you describe is OK - just like having an SPI in JVM code that may be provided by some other GPL code at the user's runtime.
My main goal is to preserve current behavior. Right now if someone has, say, MKL on their native lib path for the JVM, and built with this alternate profile, it'd be accelerated. If you're saying that still works, but would not require this separate build profile because of the different loading strategy, that's an improvement.
If nothing that ends up in transitive dependencies can be called statically linked to, say, libgfortran or anything GPL/LGPL, I think this is fine.
Have you by chance tried this integration when OpenBLAS is present to verify it makes use of it?
(One tiny side note, I think your parent POM still references GPLv2 though the license text does not)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827021079
### JDK8:
```
[info] OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.8.0-50-generic
[info] Intel(R) Xeon(R) E-2276G CPU @ 3.80GHz
[info]
[info] f2jBLAS = dev.ludovic.netlib.blas.NetlibF2jBLAS
[info] javaBLAS = dev.ludovic.netlib.blas.Java8BLAS
[info] nativeBLAS = dev.ludovic.netlib.blas.Java8BLAS
[info]
[info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 223 232 8 448.0 2.2 1.0X
[info] java 221 228 7 453.0 2.2 1.0X
[info]
[info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 122 128 4 821.2 1.2 1.0X
[info] java 122 128 4 822.3 1.2 1.0X
[info]
[info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 109 112 2 921.4 1.1 1.0X
[info] java 70 74 3 1423.5 0.7 1.5X
[info]
[info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 96 98 2 1046.1 1.0 1.0X
[info] java 47 49 2 2121.7 0.5 2.0X
[info]
[info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 184 195 8 544.3 1.8 1.0X
[info] java 185 196 7 539.5 1.9 1.0X
[info]
[info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 99 104 4 1011.9 1.0 1.0X
[info] java 99 104 4 1010.4 1.0 1.0X
[info]
[info] dspmv[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 947.2 1.1 1.0X
[info] java 0 0 0 1584.8 0.6 1.7X
[info]
[info] dspr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 867.4 1.2 1.0X
[info] java 1 1 0 865.0 1.2 1.0X
[info]
[info] dsyr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 485.9 2.1 1.0X
[info] java 1 1 0 486.8 2.1 1.0X
[info]
[info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 1843.0 0.5 1.0X
[info] java 0 0 0 2690.6 0.4 1.5X
[info]
[info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 1214.7 0.8 1.0X
[info] java 0 0 0 2536.8 0.4 2.1X
[info]
[info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 1895.9 0.5 1.0X
[info] java 0 0 0 2961.1 0.3 1.6X
[info]
[info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 1223.4 0.8 1.0X
[info] java 0 0 0 3091.4 0.3 2.5X
[info]
[info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 560 575 20 1787.1 0.6 1.0X
[info] java 226 232 5 4432.4 0.2 2.5X
[info]
[info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 570 586 23 1755.2 0.6 1.0X
[info] java 227 232 4 4410.1 0.2 2.5X
[info]
[info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 863 879 17 1158.4 0.9 1.0X
[info] java 227 231 3 4407.9 0.2 3.8X
[info]
[info] dgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1282 1305 23 780.0 1.3 1.0X
[info] java 227 232 4 4413.4 0.2 5.7X
[info]
[info] sgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 538 548 8 1858.6 0.5 1.0X
[info] java 221 226 3 4521.1 0.2 2.4X
[info]
[info] sgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 549 558 10 1819.9 0.5 1.0X
[info] java 222 229 7 4503.5 0.2 2.5X
[info]
[info] sgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 838 852 12 1193.0 0.8 1.0X
[info] java 222 229 5 4500.5 0.2 3.8X
[info]
[info] sgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 905 919 18 1104.8 0.9 1.0X
[info] java 221 228 5 4521.3 0.2 4.1X
```
### JDK11:
```
[info] OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.8.0-50-generic
[info] Intel(R) Xeon(R) E-2276G CPU @ 3.80GHz
[info]
[info] f2jBLAS = dev.ludovic.netlib.blas.NetlibF2jBLAS
[info] javaBLAS = dev.ludovic.netlib.blas.Java11BLAS
[info] nativeBLAS = dev.ludovic.netlib.blas.Java11BLAS
[info]
[info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 195 204 10 512.7 2.0 1.0X
[info] java 195 202 7 512.4 2.0 1.0X
[info]
[info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 108 113 4 923.3 1.1 1.0X
[info] java 102 107 4 984.4 1.0 1.1X
[info]
[info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 107 110 3 938.1 1.1 1.0X
[info] java 69 72 3 1447.1 0.7 1.5X
[info]
[info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 96 98 2 1046.5 1.0 1.0X
[info] java 43 45 2 2317.1 0.4 2.2X
[info]
[info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 155 168 8 644.2 1.6 1.0X
[info] java 158 169 8 632.8 1.6 1.0X
[info]
[info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 85 90 4 1178.1 0.8 1.0X
[info] java 86 90 4 1167.7 0.9 1.0X
[info]
[info] dspmv[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 0 0 0 1182.1 0.8 1.0X
[info] java 0 0 0 1432.1 0.7 1.2X
[info]
[info] dspr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 898.7 1.1 1.0X
[info] java 1 1 0 891.5 1.1 1.0X
[info]
[info] dsyr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 495.4 2.0 1.0X
[info] java 1 1 0 495.7 2.0 1.0X
[info]
[info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 0 0 0 2271.6 0.4 1.0X
[info] java 0 0 0 3648.1 0.3 1.6X
[info]
[info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 1229.3 0.8 1.0X
[info] java 0 0 0 2711.3 0.4 2.2X
[info]
[info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 0 0 0 2677.5 0.4 1.0X
[info] java 0 0 0 3288.2 0.3 1.2X
[info]
[info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 1233.0 0.8 1.0X
[info] java 0 0 0 2766.3 0.4 2.2X
[info]
[info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 520 536 16 1923.6 0.5 1.0X
[info] java 214 221 7 4669.5 0.2 2.4X
[info]
[info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 593 612 17 1686.5 0.6 1.0X
[info] java 215 219 3 4643.3 0.2 2.8X
[info]
[info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 853 870 16 1172.8 0.9 1.0X
[info] java 215 218 3 4659.7 0.2 4.0X
[info]
[info] dgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1350 1370 23 740.8 1.3 1.0X
[info] java 215 219 4 4656.6 0.2 6.3X
[info]
[info] sgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 460 468 6 2173.2 0.5 1.0X
[info] java 210 213 2 4752.7 0.2 2.2X
[info]
[info] sgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 535 544 8 1869.3 0.5 1.0X
[info] java 210 215 5 4761.8 0.2 2.5X
[info]
[info] sgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 843 853 11 1186.8 0.8 1.0X
[info] java 209 214 4 4793.4 0.2 4.0X
[info]
[info] sgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 891 904 15 1122.0 0.9 1.0X
[info] java 209 214 4 4777.2 0.2 4.3X
```
### JDK16:
```
[info] OpenJDK 64-Bit Server VM 16+36 on Linux 5.8.0-50-generic
[info] Intel(R) Xeon(R) E-2276G CPU @ 3.80GHz
[info]
[info] f2jBLAS = dev.ludovic.netlib.blas.NetlibF2jBLAS
[info] javaBLAS = dev.ludovic.netlib.blas.VectorizedBLAS
[info] nativeBLAS = dev.ludovic.netlib.blas.VectorizedBLAS
[info]
[info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 194 199 7 515.7 1.9 1.0X
[info] java 181 186 3 551.1 1.8 1.1X
[info]
[info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 109 115 4 915.0 1.1 1.0X
[info] java 88 92 3 1138.8 0.9 1.2X
[info]
[info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 108 110 2 922.6 1.1 1.0X
[info] java 54 56 2 1839.2 0.5 2.0X
[info]
[info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 96 97 2 1046.1 1.0 1.0X
[info] java 29 30 1 3393.4 0.3 3.2X
[info]
[info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 156 165 5 643.0 1.6 1.0X
[info] java 150 159 5 667.1 1.5 1.0X
[info]
[info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 85 91 6 1171.0 0.9 1.0X
[info] java 75 79 3 1340.6 0.7 1.1X
[info]
[info] dspmv[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 917.0 1.1 1.0X
[info] java 0 0 0 8147.2 0.1 8.9X
[info]
[info] dspr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 859.3 1.2 1.0X
[info] java 1 1 0 859.3 1.2 1.0X
[info]
[info] dsyr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 482.1 2.1 1.0X
[info] java 1 1 0 482.6 2.1 1.0X
[info]
[info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 0 0 0 2214.2 0.5 1.0X
[info] java 0 0 0 7975.8 0.1 3.6X
[info]
[info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 1231.4 0.8 1.0X
[info] java 0 0 0 8680.9 0.1 7.0X
[info]
[info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 0 0 0 2684.3 0.4 1.0X
[info] java 0 0 0 18527.1 0.1 6.9X
[info]
[info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1 1 0 1235.4 0.8 1.0X
[info] java 0 0 0 17347.9 0.1 14.0X
[info]
[info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 530 552 18 1887.5 0.5 1.0X
[info] java 58 64 3 17143.9 0.1 9.1X
[info]
[info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 598 620 17 1671.1 0.6 1.0X
[info] java 58 64 3 17196.6 0.1 10.3X
[info]
[info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 834 847 14 1199.4 0.8 1.0X
[info] java 57 63 4 17486.9 0.1 14.6X
[info]
[info] dgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 1338 1366 22 747.3 1.3 1.0X
[info] java 58 63 3 17356.6 0.1 23.2X
[info]
[info] sgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 489 501 9 2045.5 0.5 1.0X
[info] java 36 38 2 27721.9 0.0 13.6X
[info]
[info] sgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 478 488 9 2094.0 0.5 1.0X
[info] java 36 38 2 27813.2 0.0 13.3X
[info]
[info] sgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 825 837 10 1211.6 0.8 1.0X
[info] java 35 38 2 28433.1 0.0 23.5X
[info]
[info] sgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 900 918 15 1111.6 0.9 1.0X
[info] java 36 38 2 28073.0 0.0 25.3X
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827781911
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138001/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826423757
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/137927/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827699944
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42521/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618549812
##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
</dependencies>
- <profiles>
- <profile>
- <id>netlib-lgpl</id>
Review comment:
First some background - I think the issue with licensing comes from the submodules that are pulled in by `all`, like: https://github.com/fommil/netlib-java/blob/master/native_ref/xbuilds/linux-x86_64/pom.xml netlib-java generates and compiles native code. So things like https://mvnrepository.com/artifact/com.github.fommil.netlib/netlib-native_system-linux-x86_64/1.1 ship a `.so` and I believe it's that code or what it links to that is LGPL. I've honestly forgotten the details but seemed pretty clear (it may be, at least, libgfortran). At least, I recall going over this with a fine-toothed comb with Cloudera's legal department and they agreed, FWIW.
Anyway, that's the reason behind all this bother with profiles. The wrapper in netlib-java is OK to include in distributions and compile against. But adding in the dependencies that bring in native code is not. Hence it's under a profile that downstream users can turn on for their own builds if desired.
So, my question is, where is the native code coming in here to do the acceleration? I understand Java 16+ has it built in, that's the whole point of your change, but we still want to let people flip on dependencies that will build in bindings to OpenBLAS, etc. Where would that be now? because I see you're just depending on netlib-java `core`, which doesn't have that. I'm asking if that's what `blas` is, etc, but not sure that's it either.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826392006
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/137923/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618494333
##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
</dependencies>
- <profiles>
- <profile>
- <id>netlib-lgpl</id>
Review comment:
> I believe we'll need the profile in both places - parent is more about "declaration" and child is more about actually using it.
Ok, let me revert it to how it's in `master`.
> The diff looks OK, but raises the question, what's in arpack, lapack, etc? does that build in native code? I'd have to go check but looks like LAPACK et al are BSD licensed, which is OK.
These only rely on`com.github.fommil.netlib:core` just like `blas`. These do not include any native code, and only provide a wrapper for `com.github.fommil.netlib-java` for now. In the future, I want to provide the same native wrappers as for `blas` but, similarly, without including any actual native code, just the JNI-based wrappers.
> But it looks like nothing references the all artifact from netlib, so not sure what would build in the native code to actually do acceleration, even with the profile? I'm probably missing something.
I am not sure what you don't understand. What do you mean in "__what__ would build in the native code"?
If I understand correctly your question, the native code is wrapped by `com.github.fommil.netlib` for JDK8 and JDK 11. For JDK16, `dev.ludovic.netlib` does provide a wrapper using the Foreign Linker API, and falls back to using `com.github.fommil.netlib` if the Foreign Linker API is not enabled. In the future, I do want to add a JNI-based wrapper in `dev.ludovic.netlib` directly so that I don't have to rely on `com.github.fommil.netlib` itself.
What I am not sure I understand in the licensing issue is what is licensed as GPL/LGPL? Looking at https://mvnrepository.com/artifact/com.github.fommil.netlib/all/1.1.2, it all seems to be licensed as BSD. So is it the Fortran/native code that's shipped with it which is licensed as GPL and which is an issue?
Thank you again for helping me navigating this problem!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826364400
Kubernetes integration test unable to build dist.
exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42443/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617071851
##########
File path: graphx/pom.xml
##########
@@ -60,9 +60,8 @@
<artifactId>guava</artifactId>
</dependency>
<dependency>
- <groupId>com.github.fommil.netlib</groupId>
- <artifactId>core</artifactId>
- <version>${netlib.java.version}</version>
+ <groupId>dev.ludovic.netlib</groupId>
Review comment:
[v1.1.0](https://github.com/luhenry/netlib/tree/v1.1.0) relicenses to MIT.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r616916954
##########
File path: graphx/pom.xml
##########
@@ -60,9 +60,8 @@
<artifactId>guava</artifactId>
</dependency>
<dependency>
- <groupId>com.github.fommil.netlib</groupId>
- <artifactId>core</artifactId>
- <version>${netlib.java.version}</version>
+ <groupId>dev.ludovic.netlib</groupId>
Review comment:
Is your project GPL? that won't fly in an ASF project I'm afraid if so. (The relationship to the netlib code is already quite tricky w.r.t. licensing)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826358919
**[Test build #137923 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137923/testReport)** for PR 32253 at commit [`478d3ad`](https://github.com/apache/spark/commit/478d3ada887f50b0937b87d3a0f7a7c043d25b63).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry edited a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry edited a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826888681
The PytSpark unit tests have passed. I'm running BLASBenchmark locally with JDK8, JDK11, and JDK16, and I'll paste the results here as soon as it's done (~1-2 hours).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619494235
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
As updated in the description, the order in which each implementations are used is the following:
```
| | BLAS.nativeBLAS | BLAS.javaBLAS |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
| with -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+, | 1. dev.ludovic.netlib.blas.VectorizedBLAS |
| | relies on the Foreign Memory API, requires | (JDK16+, relies on the Vector API, requires |
| | `--add-modules=jdk.incubator.foreign | `--add-modules=jdk.incubator.vector` on JDK16) |
| | -Dforeign.restricted=warn`) | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+) |
| | 2. dev.ludovic.netlib.blas.NetlibNativeBLAS, a | 3. dev.ludovic.netlib.blas.JavaBLAS |
| | wrapper for com.github.fommil:all | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a |
| | 3. fails to load, falls back to BLAS.javaBLAS in | wrapper for com.github.fommil:core |
| | org.apache.spark.ml.linalg.BLAS | |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
| without -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+, | 1. dev.ludovic.netlib.blas.VectorizedBLAS |
| | relies on the Foreign Memory API, requires | (JDK16+, relies on the Vector API, requires |
| | `--add-modules=jdk.incubator.foreign | `--add-modules=jdk.incubator.vector` on JDK16) |
| | -Dforeign.restricted=warn`) | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+) |
| | 2. fails to load, falls back to BLAS.javaBLAS in | 3. dev.ludovic.netlib.blas.JavaBLAS |
| | org.apache.spark.ml.linalg.BLAS | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a |
| | | wrapper for com.github.fommil:core |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r618439810
##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
</dependencies>
- <profiles>
- <profile>
- <id>netlib-lgpl</id>
Review comment:
Is it enough to have this profile in `pom.xml` ([like so](https://github.com/apache/spark/pull/32253/files/#diff-9c5fb3d1b7e3b0f54bc5c4182965c4fe1f9023d449017cece3005d3f90e8e4d8R3489-R3499)) or does it need to be in this `mllib-local/pom.xml`?
The results from `dev/test-dependencies.sh --replace-manifest` __without__ `-Pnetlib-lgpl` are at https://gist.github.com/luhenry/ceb64c52ad63c69ed0ef2aa205ddde25. The only added dependencies are the blas, lapack, and arpack packages from `dev.ludovic.netlib`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619357668
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
I see, so `all` is still needed, and needs to be behind the profile. While not required for license reasons, do any other of your dependencies only makes sense when paired with the `all` artifact, and so should be in the profile too?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617003887
##########
File path: graphx/pom.xml
##########
@@ -60,9 +60,8 @@
<artifactId>guava</artifactId>
</dependency>
<dependency>
- <groupId>com.github.fommil.netlib</groupId>
- <artifactId>core</artifactId>
- <version>${netlib.java.version}</version>
+ <groupId>dev.ludovic.netlib</groupId>
Review comment:
It is currently GPLv2 with Classpath Exception, the same as the OpenJDK. I'm open to changing the licensing if it's a blocking issue.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826358919
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826392006
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/137923/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826392271
**[Test build #137927 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137927/testReport)** for PR 32253 at commit [`478d3ad`](https://github.com/apache/spark/commit/478d3ada887f50b0937b87d3a0f7a7c043d25b63).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827844059
Merged to master. Thanks, this is a step forward in better acceleration for sure
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826358919
**[Test build #137923 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137923/testReport)** for PR 32253 at commit [`478d3ad`](https://github.com/apache/spark/commit/478d3ada887f50b0937b87d3a0f7a7c043d25b63).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826392271
**[Test build #137927 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137927/testReport)** for PR 32253 at commit [`478d3ad`](https://github.com/apache/spark/commit/478d3ada887f50b0937b87d3a0f7a7c043d25b63).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826414322
**[Test build #137927 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137927/testReport)** for PR 32253 at commit [`478d3ad`](https://github.com/apache/spark/commit/478d3ada887f50b0937b87d3a0f7a7c043d25b63).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r617071851
##########
File path: graphx/pom.xml
##########
@@ -60,9 +60,8 @@
<artifactId>guava</artifactId>
</dependency>
<dependency>
- <groupId>com.github.fommil.netlib</groupId>
- <artifactId>core</artifactId>
- <version>${netlib.java.version}</version>
+ <groupId>dev.ludovic.netlib</groupId>
Review comment:
[v1.1.0](https://github.com/luhenry/netlib/tree/v1.1.0) relicenses to MIT. I'll update this PR as soon as it's released (benchmarks take ~12 hours to run and I require them for a release).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-826423757
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/137927/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32253:
URL: https://github.com/apache/spark/pull/32253#issuecomment-827699944
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42521/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] luhenry commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619327385
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
+ </dependency>
Review comment:
Today, `dev.ludovic.netlib` does call `com.github.fommil.netlib.BLAS.getInstance()` to probe whether the native libraries are installed. (This call is in `com.github.fommil:core` and is thus all good regarding licensing.) If we remove the content of the `netlib-lgpl` profile and the dependency on `com.github.fommil:all` then we will break the current behavior of probing for native BLAS implementation through `com.github.fommil:native_ref` and `com.github.fommil:native_system` when the user enables `-Pnetlib-lgpl`.
In the future, I do want to add JNI-based prober/wrapper for BLAS, LAPACK, and ARPACK (and which are compliment with licensing as discussed above), but that is not done yet. Then, we will be able to remove the reference to `com.github.fommil:all`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on a change in pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r620373383
##########
File path: dev/deps/spark-deps-hadoop-3.2-hive-2.3
##########
@@ -10,6 +10,7 @@ annotations/17.0.0//annotations-17.0.0.jar
antlr-runtime/3.5.2//antlr-runtime-3.5.2.jar
antlr4-runtime/4.8-1//antlr4-runtime-4.8-1.jar
aopalliance-repackaged/2.6.1//aopalliance-repackaged-2.6.1.jar
+arpack/1.3.2//arpack-1.3.2.jar
Review comment:
There's one more minor thing to do here. Under licenses-binary/ , include a file like `LICENSE-blas.txt` and/or separate ones for arpack, lapack artifacts as you wish (name doesn't matter, doesn't have to be 3 - really it's whatever satisifies you, as long as we have a copy that's connectable to your artifacts) with the text of the MIT license file. In `LICENSE-binary`, you'll see a list of MIT dependencies. Just add them in there too.
I don't think any dependencies ultimately went away so there's nothing to remove, AFAICT.
It's a formality but just ensures the license requirements are buttoned up.
##########
File path: mllib-local/pom.xml
##########
@@ -75,6 +75,11 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>dev.ludovic.netlib</groupId>
+ <artifactId>blas</artifactId>
Review comment:
So graphx and mllib-local would not need ARPACK, LAPACK?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org