Posted to commits@spark.apache.org by hv...@apache.org on 2023/02/16 18:26:27 UTC

[spark] branch branch-3.4 updated: [SPARK-42287][CONNECT][BUILD] Fix shading so that the JVM client jar can include all 3rd-party dependencies in the runtime

This is an automated email from the ASF dual-hosted git repository.

hvanhovell pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
     new 130c82c8c5a [SPARK-42287][CONNECT][BUILD] Fix shading so that the JVM client jar can include all 3rd-party dependencies in the runtime
130c82c8c5a is described below

commit 130c82c8c5aa56bc1ed94553415225cfd25fe4c0
Author: Zhen Li <zh...@users.noreply.github.com>
AuthorDate: Thu Feb 16 14:26:01 2023 -0400

    [SPARK-42287][CONNECT][BUILD] Fix shading so that the JVM client jar can include all 3rd-party dependencies in the runtime
    
    ### What changes were proposed in this pull request?
    Fix the JVM client dependencies and the shading result:
    - The common jar should not be shaded; shading should be done in the client or the server.
    - The common jar should depend on as few dependencies as possible.
    - The client jar should provide all compile-scope dependencies out of the box, including Netty etc.
    - The client sbt and Maven builds should produce the same shading result.
    
    The current client dependency summary:
    ```
    [INFO] --- maven-dependency-plugin:3.3.0:tree (default-cli) @ spark-connect-client-jvm_2.12 ---
    [INFO] org.apache.spark:spark-connect-client-jvm_2.12:jar:3.5.0-SNAPSHOT
    [INFO] +- org.apache.spark:spark-connect-common_2.12:jar:3.5.0-SNAPSHOT:compile
    [INFO] |  +- org.scala-lang:scala-library:jar:2.12.17:compile
    [INFO] |  +- io.grpc:grpc-netty:jar:1.47.0:compile
    [INFO] |  |  +- io.grpc:grpc-core:jar:1.47.0:compile
    [INFO] |  |  |  +- com.google.code.gson:gson:jar:2.9.0:runtime
    [INFO] |  |  |  +- com.google.android:annotations:jar:4.1.1.4:runtime
    [INFO] |  |  |  \- org.codehaus.mojo:animal-sniffer-annotations:jar:1.19:runtime
    [INFO] |  |  \- io.perfmark:perfmark-api:jar:0.25.0:runtime
    [INFO] |  +- io.grpc:grpc-protobuf:jar:1.47.0:compile
    [INFO] |  |  +- io.grpc:grpc-api:jar:1.47.0:compile
    [INFO] |  |  |  \- io.grpc:grpc-context:jar:1.47.0:compile
    [INFO] |  |  +- com.google.api.grpc:proto-google-common-protos:jar:2.0.1:compile
    [INFO] |  |  \- io.grpc:grpc-protobuf-lite:jar:1.47.0:compile
    [INFO] |  +- io.grpc:grpc-services:jar:1.47.0:compile
    [INFO] |  |  \- com.google.protobuf:protobuf-java-util:jar:3.19.2:runtime
    [INFO] |  \- io.grpc:grpc-stub:jar:1.47.0:compile
    [INFO] +- com.google.protobuf:protobuf-java:jar:3.21.12:compile
    [INFO] +- com.google.guava:guava:jar:31.0.1-jre:compile
    [INFO] |  +- com.google.guava:failureaccess:jar:1.0.1:compile
    [INFO] |  +- com.google.guava:listenablefuture:jar:9999.0-empty-to-avoid-conflict-with-guava:compile
    [INFO] |  +- com.google.code.findbugs:jsr305:jar:3.0.0:compile
    [INFO] |  +- org.checkerframework:checker-qual:jar:3.12.0:compile
    [INFO] |  +- com.google.errorprone:error_prone_annotations:jar:2.7.1:compile
    [INFO] |  \- com.google.j2objc:j2objc-annotations:jar:1.3:compile
    [INFO] +- io.netty:netty-codec-http2:jar:4.1.87.Final:compile
    [INFO] |  +- io.netty:netty-common:jar:4.1.87.Final:compile
    [INFO] |  +- io.netty:netty-buffer:jar:4.1.87.Final:compile
    [INFO] |  +- io.netty:netty-transport:jar:4.1.87.Final:compile
    [INFO] |  |  \- io.netty:netty-resolver:jar:4.1.87.Final:compile
    [INFO] |  +- io.netty:netty-codec:jar:4.1.87.Final:compile
    [INFO] |  +- io.netty:netty-handler:jar:4.1.87.Final:compile
    [INFO] |  \- io.netty:netty-codec-http:jar:4.1.87.Final:compile
    [INFO] +- io.netty:netty-handler-proxy:jar:4.1.87.Final:compile
    [INFO] |  \- io.netty:netty-codec-socks:jar:4.1.87.Final:compile
    [INFO] +- io.netty:netty-transport-native-unix-common:jar:4.1.87.Final:compile
    [INFO] +- org.spark-project.spark:unused:jar:1.0.0:compile
    ```
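    The relocation performed by the shade configuration can be illustrated with a small sketch. This is a hypothetical helper (not part of the build) that mimics what the maven-shade-plugin relocation rules in this change effectively do to class names inside the shaded client jar: classes under the relocated roots move beneath the `org.sparkproject.connect.client` prefix, while Spark's own classes are left untouched.

    ```java
    import java.util.List;

    // Hypothetical sketch of the shade-plugin relocation rules in this PR.
    public class RelocationSketch {
        static final String SHADE_PREFIX = "org.sparkproject.connect.client";
        // Roots relocated by the updated pom.xml configuration.
        static final List<String> RELOCATED_ROOTS = List.of(
            "io.grpc", "com.google", "io.netty", "org.checkerframework",
            "javax.annotation", "io.perfmark", "org.codehaus",
            "android.annotation");

        static String relocate(String className) {
            for (String root : RELOCATED_ROOTS) {
                if (className.equals(root) || className.startsWith(root + ".")) {
                    // The shadedPattern is the prefix plus the original pattern,
                    // so the full original name is preserved under the prefix.
                    return SHADE_PREFIX + "." + className;
                }
            }
            return className; // Spark's own classes are not relocated.
        }

        public static void main(String[] args) {
            System.out.println(relocate("io.netty.buffer.ByteBuf"));
            // -> org.sparkproject.connect.client.io.netty.buffer.ByteBuf
            System.out.println(relocate("org.apache.spark.sql.SparkSession"));
            // -> org.apache.spark.sql.SparkSession (unchanged)
        }
    }
    ```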
    
    ### Why are the changes needed?
    
    Fix the packaging of the client jar so that it ships with all required runtime dependencies.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    Existing tests.
    
    Closes #39866 from zhenlineo/fix-jars.
    
    Authored-by: Zhen Li <zh...@users.noreply.github.com>
    Signed-off-by: Herman van Hovell <he...@databricks.com>
    (cherry picked from commit 49af23aa87d7a566e16c81afbcfb49c0e2064536)
    Signed-off-by: Herman van Hovell <he...@databricks.com>
---
 connector/connect/client/jvm/pom.xml | 96 +++++++++++++++++++++++-------------
 connector/connect/common/pom.xml     | 48 ------------------
 project/SparkBuild.scala             | 14 ++++--
 3 files changed, 72 insertions(+), 86 deletions(-)
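The `project/SparkBuild.scala` change below switches the sbt-assembly rename rules from `@0` to `@1` placeholders. In these jarjar-style patterns, `@0` expands to the entire matched class name while `@1` expands to the portion captured by the `**` wildcard, so `"io.grpc.**" -> "prefix.io.grpc.@1"` keeps the full package path under the shaded prefix. A minimal sketch of that expansion (a hypothetical helper, not sbt-assembly code, handling only a single trailing `**`):

```java
// Hypothetical sketch of jarjar-style rename expansion as used by
// sbt-assembly's ShadeRule.rename in this PR.
public class ShadeRuleSketch {
    static String rename(String pattern, String replacement, String name) {
        String root = pattern.replace(".**", ".");       // e.g. "io.grpc."
        if (!name.startsWith(root)) {
            return name;                                 // rule does not apply
        }
        String captured = name.substring(root.length()); // what ** matched
        return replacement.replace("@0", name).replace("@1", captured);
    }

    public static void main(String[] args) {
        // New rule: relocates under the shaded prefix, keeping the full path.
        System.out.println(rename("io.grpc.**",
            "org.sparkproject.connect.client.io.grpc.@1",
            "io.grpc.ManagedChannel"));
        // -> org.sparkproject.connect.client.io.grpc.ManagedChannel
    }
}
```

With the old `@0` form, the entire matched name was appended after the prefix, which produced a different (and inconsistent) layout from the Maven shade configuration; `@1` aligns the sbt output with the pom.xml relocations.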

diff --git a/connector/connect/client/jvm/pom.xml b/connector/connect/client/jvm/pom.xml
index 593819f7108..a3acf046dbc 100644
--- a/connector/connect/client/jvm/pom.xml
+++ b/connector/connect/client/jvm/pom.xml
@@ -33,6 +33,7 @@
   <properties>
     <sbt.project.name>connect-client-jvm</sbt.project.name>
     <guava.version>31.0.1-jre</guava.version>
+    <guava.failureaccess.version>1.0.1</guava.failureaccess.version>
     <mima.version>1.1.0</mima.version>
   </properties>
 
@@ -53,6 +54,13 @@
       <groupId>org.apache.spark</groupId>
       <artifactId>spark-catalyst_${scala.binary.version}</artifactId>
       <version>${project.version}</version>
+      <scope>provided</scope>
+      <exclusions>
+        <exclusion>
+          <groupId>com.google.guava</groupId>
+          <artifactId>guava</artifactId>
+        </exclusion>
+      </exclusions>
     </dependency>
     <dependency>
       <groupId>com.google.protobuf</groupId>
@@ -66,22 +74,26 @@
       <version>${guava.version}</version>
       <scope>compile</scope>
     </dependency>
-    <!--
-      SPARK-42213: Add `repl` as test dependency to solve the issue that
-      `ClientE2ETestSuite` cannot find `repl.Main` when maven test.
-    -->
     <dependency>
-      <groupId>org.apache.spark</groupId>
-      <artifactId>spark-repl_${scala.binary.version}</artifactId>
-      <version>${project.version}</version>
-      <scope>test</scope>
-      <!-- Exclude sql module to resolve compilation conflicts -->
-      <exclusions>
-        <exclusion>
-          <groupId>org.apache.spark</groupId>
-          <artifactId>spark-sql_${scala.binary.version}</artifactId>
-        </exclusion>
-      </exclusions>
+      <groupId>com.google.guava</groupId>
+      <artifactId>failureaccess</artifactId>
+      <version>${guava.failureaccess.version}</version>
+      <scope>compile</scope>
+    </dependency>
+    <dependency>
+      <groupId>io.netty</groupId>
+      <artifactId>netty-codec-http2</artifactId>
+      <version>${netty.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>io.netty</groupId>
+      <artifactId>netty-handler-proxy</artifactId>
+      <version>${netty.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>io.netty</groupId>
+      <artifactId>netty-transport-native-unix-common</artifactId>
+      <version>${netty.version}</version>
     </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>
@@ -117,7 +129,8 @@
   <build>
     <testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>
     <plugins>
-      <!-- Shade all Guava / Protobuf dependencies of this build -->
+      <!-- Shade all Guava / Protobuf / Netty dependencies of this build -->
+      <!-- TODO (SPARK-42449): Ensure shading rules are handled correctly in `native-image.properties` and support GraalVM   -->
       <plugin>
         <groupId>org.apache.maven.plugins</groupId>
         <artifactId>maven-shade-plugin</artifactId>
@@ -125,40 +138,57 @@
           <shadedArtifactAttached>false</shadedArtifactAttached>
           <artifactSet>
             <includes>
+              <include>com.google.android:*</include>
+              <include>com.google.api.grpc:*</include>
+              <include>com.google.code.findbugs:*</include>
+              <include>com.google.code.gson:*</include>
+              <include>com.google.errorprone:*</include>
               <include>com.google.guava:*</include>
-              <include>io.grpc:*</include>
+              <include>com.google.j2objc:*</include>
               <include>com.google.protobuf:*</include>
+              <include>io.grpc:*</include>
+              <include>io.netty:*</include>
+              <include>io.perfmark:*</include>
+              <include>org.codehaus.mojo:*</include>
+              <include>org.checkerframework:*</include>
               <include>org.apache.spark:spark-connect-common_${scala.binary.version}</include>
             </includes>
           </artifactSet>
           <relocations>
             <relocation>
               <pattern>io.grpc</pattern>
-              <shadedPattern>${spark.shade.packageName}.connect.client.grpc</shadedPattern>
+              <shadedPattern>${spark.shade.packageName}.connect.client.io.grpc</shadedPattern>
               <includes>
                 <include>io.grpc.**</include>
               </includes>
             </relocation>
             <relocation>
-              <pattern>com.google.protobuf</pattern>
-              <shadedPattern>${spark.shade.packageName}.connect.protobuf</shadedPattern>
-              <includes>
-                <include>com.google.protobuf.**</include>
-              </includes>
+              <pattern>com.google</pattern>
+              <shadedPattern>${spark.shade.packageName}.connect.client.com.google</shadedPattern>
             </relocation>
             <relocation>
-              <pattern>com.google.common</pattern>
-              <shadedPattern>${spark.shade.packageName}.connect.client.guava</shadedPattern>
-              <includes>
-                <include>com.google.common.**</include>
-              </includes>
+              <pattern>io.netty</pattern>
+              <shadedPattern>${spark.shade.packageName}.connect.client.io.netty</shadedPattern>
             </relocation>
             <relocation>
-              <pattern>com.google.thirdparty</pattern>
-              <shadedPattern>${spark.shade.packageName}.connect.client.guava</shadedPattern>
-              <includes>
-                <include>com.google.thirdparty.**</include>
-              </includes>
+              <pattern>org.checkerframework</pattern>
+              <shadedPattern>${spark.shade.packageName}.connect.client.org.checkerframework</shadedPattern>
+            </relocation>
+            <relocation>
+              <pattern>javax.annotation</pattern>
+              <shadedPattern>${spark.shade.packageName}.connect.client.javax.annotation</shadedPattern>
+            </relocation>
+            <relocation>
+              <pattern>io.perfmark</pattern>
+              <shadedPattern>${spark.shade.packageName}.connect.client.io.perfmark</shadedPattern>
+            </relocation>
+            <relocation>
+              <pattern>org.codehaus</pattern>
+              <shadedPattern>${spark.shade.packageName}.connect.client.org.codehaus</shadedPattern>
+            </relocation>
+            <relocation>
+              <pattern>android.annotation</pattern>
+              <shadedPattern>${spark.shade.packageName}.connect.client.android.annotation</shadedPattern>
             </relocation>
           </relocations>
           <!--SPARK-42228: Add `ServicesResourceTransformer` to relocation class names in META-INF/services for grpc-->
diff --git a/connector/connect/common/pom.xml b/connector/connect/common/pom.xml
index dfa6f656c7c..685bc054409 100644
--- a/connector/connect/common/pom.xml
+++ b/connector/connect/common/pom.xml
@@ -54,17 +54,6 @@
             <groupId>org.scala-lang</groupId>
             <artifactId>scala-library</artifactId>
         </dependency>
-        <dependency>
-            <groupId>com.google.guava</groupId>
-            <artifactId>guava</artifactId>
-            <version>${guava.version}</version>
-            <scope>compile</scope>
-        </dependency>
-        <dependency>
-            <groupId>com.google.guava</groupId>
-            <artifactId>failureaccess</artifactId>
-            <version>${guava.failureaccess.version}</version>
-        </dependency>
         <dependency>
             <groupId>com.google.protobuf</groupId>
             <artifactId>protobuf-java</artifactId>
@@ -168,43 +157,6 @@
                     </execution>
                 </executions>
             </plugin>
-            <!-- Shade all GRPC / Guava / Protobuf dependencies of this build -->
-            <plugin>
-                <groupId>org.apache.maven.plugins</groupId>
-                <artifactId>maven-shade-plugin</artifactId>
-                <configuration>
-                    <shadedArtifactAttached>false</shadedArtifactAttached>
-                    <artifactSet>
-                        <includes>
-                            <include>com.google.guava:*</include>
-                        </includes>
-                    </artifactSet>
-                    <relocations>
-                        <relocation>
-                            <pattern>com.google.common</pattern>
-                            <shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
-                            <includes>
-                                <include>com.google.common.**</include>
-                            </includes>
-                        </relocation>
-                        <relocation>
-                            <pattern>com.google.thirdparty</pattern>
-                            <shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
-                            <includes>
-                                <include>com.google.thirdparty.**</include>
-                            </includes>
-                        </relocation>
-                        <relocation>
-                            <pattern>android.annotation</pattern>
-                            <shadedPattern>${spark.shade.packageName}.connect.android_annotation</shadedPattern>
-                        </relocation>
-                        <relocation>
-                            <pattern>com.google.errorprone.annotations</pattern>
-                            <shadedPattern>${spark.shade.packageName}.connect.errorprone_annotations</shadedPattern>
-                        </relocation>
-                    </relocations>
-                </configuration>
-            </plugin>
         </plugins>
     </build>
     <profiles>
diff --git a/project/SparkBuild.scala b/project/SparkBuild.scala
index a4c8d62dd6e..4b077f593fe 100644
--- a/project/SparkBuild.scala
+++ b/project/SparkBuild.scala
@@ -875,15 +875,19 @@ object SparkConnectClient {
       cp filter { v =>
         val name = v.data.getName
         name.startsWith("pmml-model-") || name.startsWith("scala-collection-compat_") ||
-          name.startsWith("jsr305-") || name.startsWith("netty-") || name == "unused-1.0.0.jar"
+          name.startsWith("jsr305-") || name == "unused-1.0.0.jar"
       }
     },
 
     (assembly / assemblyShadeRules) := Seq(
-      ShadeRule.rename("io.grpc.**" -> "org.sparkproject.connect.client.grpc.@0").inAll,
-      ShadeRule.rename("com.google.protobuf.**" -> "org.sparkproject.connect.protobuf.@1").inAll,
-      ShadeRule.rename("com.google.common.**" -> "org.sparkproject.connect.client.guava.@1").inAll,
-      ShadeRule.rename("com.google.thirdparty.**" -> "org.sparkproject.connect.client.guava.@1").inAll,
+      ShadeRule.rename("io.grpc.**" -> "org.sparkproject.connect.client.io.grpc.@1").inAll,
+      ShadeRule.rename("com.google.**" -> "org.sparkproject.connect.client.com.google.@1").inAll,
+      ShadeRule.rename("io.netty.**" -> "org.sparkproject.connect.client.io.netty.@1").inAll,
+      ShadeRule.rename("org.checkerframework.**" -> "org.sparkproject.connect.client.org.checkerframework.@1").inAll,
+      ShadeRule.rename("javax.annotation.**" -> "org.sparkproject.connect.client.javax.annotation.@1").inAll,
+      ShadeRule.rename("io.perfmark.**" -> "org.sparkproject.connect.client.io.perfmark.@1").inAll,
+      ShadeRule.rename("org.codehaus.**" -> "org.sparkproject.connect.client.org.codehaus.@1").inAll,
+      ShadeRule.rename("android.annotation.**" -> "org.sparkproject.connect.client.android.annotation.@1").inAll
     ),
 
     (assembly / assemblyMergeStrategy) := {


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org