Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/09/24 16:56:11 UTC

[GitHub] [spark] sunchao opened a new pull request #34100: [SPARK-36835][BUILD][hadoop-2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

sunchao opened a new pull request #34100:
URL: https://github.com/apache/spark/pull/34100


   
   ### What changes were proposed in this pull request?
   
   Fix an issue where Maven may get stuck in an infinite loop when building Spark with the Hadoop 2.7 profile.
   
   ### Why are the changes needed?
   
   After re-enabling `createDependencyReducedPom` for `maven-shade-plugin`, the Spark build stopped working for the Hadoop 2.7 profile and gets stuck in an infinite loop, likely due to a Maven shade plugin bug similar to https://issues.apache.org/jira/browse/MSHADE-148. This seems to be caused by the fact that, under the `hadoop-2.7` profile, the variables `hadoop-client-runtime.artifact` and `hadoop-client-api.artifact` are both `hadoop-client`, which triggers the issue. As a workaround, this moves the former into a `hadoop-3.2` profile section.
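
   A simplified sketch of the situation described above, assuming a module declares its Hadoop client dependencies through these two properties. The fragment is abbreviated and illustrative, not copied from Spark's `pom.xml`; the `groupId` and `${hadoop.version}` placeholder are assumptions.

   ```xml
   <!-- Illustrative only: property values as set under the hadoop-2.7 profile. -->
   <properties>
     <hadoop-client-api.artifact>hadoop-client</hadoop-client-api.artifact>
     <hadoop-client-runtime.artifact>hadoop-client</hadoop-client-runtime.artifact>
   </properties>

   <!-- A module that uses both properties ends up with two declarations
        that resolve to the same artifact. -->
   <dependencies>
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>${hadoop-client-api.artifact}</artifactId>     <!-- hadoop-client -->
       <version>${hadoop.version}</version>
     </dependency>
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>${hadoop-client-runtime.artifact}</artifactId> <!-- hadoop-client again -->
       <version>${hadoop.version}</version>
     </dependency>
   </dependencies>
   ```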
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   N/A


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926977150


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48127/
   




[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926890177


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48123/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926977150


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48127/
   




[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926845114


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48120/
   




[GitHub] [spark] sunchao commented on a change in pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
sunchao commented on a change in pull request #34100:
URL: https://github.com/apache/spark/pull/34100#discussion_r715955928



##########
File path: pom.xml
##########
@@ -3273,7 +3273,7 @@
         <curator.version>2.7.1</curator.version>
         <commons-io.version>2.4</commons-io.version>
         <hadoop-client-api.artifact>hadoop-client</hadoop-client-api.artifact>
-        <hadoop-client-runtime.artifact>hadoop-client</hadoop-client-runtime.artifact>
+        <hadoop-client-runtime.artifact>hadoop-yarn-api</hadoop-client-runtime.artifact>

Review comment:
       Thanks for taking a look. Yes, I think it's better to apply the same for `hadoop-client-minicluster.artifact`. Let me try that, and perhaps we won't need the changes in YARN's pom.xml with this.
   
   The side effect of this seems to be that it affects the _distance_ of these dependencies to the root module and thus may make a difference when Maven tries to resolve a dependency with multiple versions (see [here](https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html) for reference). I first tried `hadoop-common` (which carries lots of dependencies) instead of `hadoop-yarn-api`, and the build failed to compile.
   
   Will update PR description and the comment in the above pom.xml.
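
   For reference, the "nearest definition" rule from the linked Maven guide can be illustrated with a hypothetical module. The names `app`, `lib-a`, `lib-b`, `shared`, and the `com.example` group are made up for this sketch and are not Spark or Hadoop artifacts.

   ```xml
   <!-- Hypothetical app/pom.xml. Assume the dependency tree looks like:
          app
          +- lib-a:1.0
          |  \- lib-b:1.0
          |     \- shared:2.0    (distance 3 from app)
          \- shared:1.0          (distance 1 from app)
        Maven picks shared:1.0 because it is nearest to the root, even
        though 2.0 is the newer version. Promoting a transitive dependency
        to a direct one can therefore change which version wins. -->
   <dependencies>
     <dependency>
       <groupId>com.example</groupId>
       <artifactId>lib-a</artifactId>
       <version>1.0</version>
     </dependency>
     <dependency>
       <groupId>com.example</groupId>
       <artifactId>shared</artifactId>
       <version>1.0</version>
     </dependency>
   </dependencies>
   ```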
   
   






[GitHub] [spark] sunchao commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
sunchao commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927342609


   oooh I see, it is because only one active profile is allowed, and thus when `hadoop-provided` is activated, `hadoop-3.2` will not be, and the Hadoop dependencies will not be found.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926920193


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143611/
   




[GitHub] [spark] SparkQA removed a comment on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926840225


   **[Test build #143611 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143611/testReport)** for PR 34100 at commit [`4456fc1`](https://github.com/apache/spark/commit/4456fc150a1ac0da6b8b2501976772311fefdb55).




[GitHub] [spark] AmplabJenkins commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927001227


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48130/
   




[GitHub] [spark] gengliangwang closed pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
gengliangwang closed pull request #34100:
URL: https://github.com/apache/spark/pull/34100


   




[GitHub] [spark] sunchao commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
sunchao commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927342887


   I think adding `-Phadoop-3.2` should make it work, but let me think about how to fix this properly.
   
   ```
   build/mvn clean package -DskipTests -B -Pmesos -Pyarn -Pkubernetes -Psparkr -Pscala-2.12 -Phadoop-provided -Phadoop-3.2
   ```




[GitHub] [spark] AmplabJenkins commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926859313


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48120/
   




[GitHub] [spark] sunchao commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
sunchao commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927342108


   @gengliangwang will take a look - is it caused by this PR?




[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926987611


   **[Test build #143618 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143618/testReport)** for PR 34100 at commit [`0c358b3`](https://github.com/apache/spark/commit/0c358b34a14c59158bff018777388605abf42dc3).




[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][hadoop-2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926792624


   **[Test build #143608 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143608/testReport)** for PR 34100 at commit [`4456fc1`](https://github.com/apache/spark/commit/4456fc150a1ac0da6b8b2501976772311fefdb55).




[GitHub] [spark] gengliangwang commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927337940


   @sunchao Unfortunately, the build without Hadoop failed after this one. 
   ```
   $ ./build/mvn clean package -DskipTests -B -Pmesos -Pyarn -Pkubernetes -Psparkr -Pscala-2.12 -Phadoop-provided
   
   [ERROR] [Error] /opt/spark-rm/output/spark-3.2.0-bin-without-hadoop/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/BaseYarnClusterSuite.scala:30: object MiniYARNCluster is not a member of package org.apache.hadoop.yarn.server
   [ERROR] [Error] /opt/spark-rm/output/spark-3.2.0-bin-without-hadoop/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/BaseYarnClusterSuite.scala:61: not found: type MiniYARNCluster
   [ERROR] [Error] /opt/spark-rm/output/spark-3.2.0-bin-without-hadoop/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/BaseYarnClusterSuite.scala:104: not found: type MiniYARNCluster
   [ERROR] [Error] /opt/spark-rm/output/spark-3.2.0-bin-without-hadoop/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala:37: object resourcemanager is not a member of package org.apache.hadoop.yarn.server
   [ERROR] [Error] /opt/spark-rm/output/spark-3.2.0-bin-without-hadoop/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala:38: object resourcemanager is not a member of package org.apache.hadoop.yarn.server
   [ERROR] [Error] /opt/spark-rm/output/spark-3.2.0-bin-without-hadoop/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala:39: object resourcemanager is not a member of package org.apache.hadoop.yarn.server
   [ERROR] [Error] /opt/spark-rm/output/spark-3.2.0-bin-without-hadoop/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala:40: object resourcemanager is not a member of package org.apache.hadoop.yarn.server
   [ERROR] [Error] /opt/spark-rm/output/spark-3.2.0-bin-without-hadoop/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala:250: not found: type RMContext
   [ERROR] [Error] /opt/spark-rm/output/spark-3.2.0-bin-without-hadoop/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala:252: not found: type RMApp
   [ERROR] [Error] /opt/spark-rm/output/spark-3.2.0-bin-without-hadoop/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala:261: not found: type RMApplicationHistoryWriter
   [ERROR] [Error] /opt/spark-rm/output/spark-3.2.0-bin-without-hadoop/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala:263: not found: type SystemMetricsPublisher
   [ERROR] [Error] /opt/spark-rm/output/spark-3.2.0-bin-without-hadoop/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala:267: not found: type RMAppManager
   [ERROR] [Error] /opt/spark-rm/output/spark-3.2.0-bin-without-hadoop/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala:272: not found: type ClientRMService
   ```
   
   Could you fix it?




[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927003708


   **[Test build #143618 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143618/testReport)** for PR 34100 at commit [`0c358b3`](https://github.com/apache/spark/commit/0c358b34a14c59158bff018777388605abf42dc3).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926840225


   **[Test build #143611 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143611/testReport)** for PR 34100 at commit [`4456fc1`](https://github.com/apache/spark/commit/4456fc150a1ac0da6b8b2501976772311fefdb55).




[GitHub] [spark] AmplabJenkins commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926987299


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143615/
   




[GitHub] [spark] JoshRosen commented on a change in pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
JoshRosen commented on a change in pull request #34100:
URL: https://github.com/apache/spark/pull/34100#discussion_r715946127



##########
File path: pom.xml
##########
@@ -3273,7 +3273,7 @@
         <curator.version>2.7.1</curator.version>
         <commons-io.version>2.4</commons-io.version>
         <hadoop-client-api.artifact>hadoop-client</hadoop-client-api.artifact>
-        <hadoop-client-runtime.artifact>hadoop-client</hadoop-client-runtime.artifact>
+        <hadoop-client-runtime.artifact>hadoop-yarn-api</hadoop-client-runtime.artifact>

Review comment:
       Ahhh, this is a clever fix:
   
   Instead of the `hadoop-2.7` profile resulting in a duplicate direct dependency on `hadoop-client`, we now just declare an explicit dependency on one of `hadoop-client`'s transitive dependencies (`hadoop-yarn-api` in this case). Anything which depends on `hadoop-client-runtime.artifact` must also depend on `hadoop-client-api.artifact`, so this doesn't end up changing the set of dependencies pulled in.
   
   It looks like we didn't need to do that for `hadoop-client-minicluster.artifact` because that's only used in the `resource-managers/yarn` POM and that's already using Maven profiles to control the dependency selection (so the other workaround is less invasive in that context). In principle, though, I guess we could have changed that to some other transitive dep.
   
   ---
   
   Could you maybe add a one- or two-line comment above these Hadoop 2.7 lines to explain what's going on? And maybe edit the comment at https://github.com/apache/spark/blob/d73562ed3635bb3454ac67029ca6541b30ae0c02/pom.xml#L251-L255 to reflect this change? This fix is clever but a little subtle, so I think a comment calling it out (and maybe mentioning SPARK-36835) might help future readers.
   
   **Edit:** could you also update the PR description to reflect this final fix? 
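
   A sketch of the two property values after this change, taken from the diff above. The comment wording is only a suggestion of the kind of note being requested, not text from the PR.

   ```xml
   <!-- SPARK-36835: under hadoop-2.7, point the runtime artifact at one of
        hadoop-client's transitive dependencies so that the api and runtime
        properties do not both resolve to hadoop-client. -->
   <hadoop-client-api.artifact>hadoop-client</hadoop-client-api.artifact>
   <hadoop-client-runtime.artifact>hadoop-yarn-api</hadoop-client-runtime.artifact>
   ```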






[GitHub] [spark] SparkQA removed a comment on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926987611


   **[Test build #143618 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143618/testReport)** for PR 34100 at commit [`0c358b3`](https://github.com/apache/spark/commit/0c358b34a14c59158bff018777388605abf42dc3).




[GitHub] [spark] JoshRosen commented on a change in pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
JoshRosen commented on a change in pull request #34100:
URL: https://github.com/apache/spark/pull/34100#discussion_r715946127



##########
File path: pom.xml
##########
@@ -3273,7 +3273,7 @@
         <curator.version>2.7.1</curator.version>
         <commons-io.version>2.4</commons-io.version>
         <hadoop-client-api.artifact>hadoop-client</hadoop-client-api.artifact>
-        <hadoop-client-runtime.artifact>hadoop-client</hadoop-client-runtime.artifact>
+        <hadoop-client-runtime.artifact>hadoop-yarn-api</hadoop-client-runtime.artifact>

Review comment:
       Ahhh, this is a clever fix:
   
   Instead of the `hadoop-2.7` profile resulting in a duplicate direct dependency on `hadoop-client`, we now just declare an explicit dependency on one of `hadoop-client`'s transitive dependencies (`hadoop-yarn-api` in this case). Anything which depends on `hadoop-client-runtime.artifact` must also depend on `hadoop-client-api.artifact`, so this doesn't end up changing the set of dependencies pulled in.
   
   It looks like we didn't need to do that for `hadoop-client-minicluster.artifact` because that's only used in the `resource-managers/yarn` POM and that's already using Maven profiles to control the dependency selection (so the other workaround is less invasive in that context). In principle, though, I guess we could have changed that to some other transitive dep.
   
   ---
   
   Could you maybe add a one or two line comment above these Hadoop 2.7 lines to explain what's going on? And maybe edit the comment at https://github.com/apache/spark/blob/d73562ed3635bb3454ac67029ca6541b30ae0c02/pom.xml#L251-L255 to reflect this change? This fix is clever but a little subtle, so I think a comment calling it out (and maybe mentioning SPARK-36835) might help future readers.
   
   **Edit:** could you also update the PR description to reflect this final fix? 
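
To make the argument in this review comment concrete, here is a minimal Python sketch on a toy dependency graph. The edges in `TOY_GRAPH` and the helper `resolved_set` are illustrative assumptions, not taken from the real Hadoop 2.7 POMs. The sketch only shows why replacing a duplicate direct dependency with one of its transitive dependencies can leave the resolved set unchanged.

```python
# Toy dependency graph. The edges are illustrative assumptions,
# not the actual contents of the Hadoop 2.7 POMs.
TOY_GRAPH = {
    "hadoop-client": ["hadoop-common", "hadoop-yarn-api"],
    "hadoop-common": [],
    "hadoop-yarn-api": [],
}


def resolved_set(direct_deps, graph):
    """Return every artifact reachable from the given direct dependencies."""
    seen = set()
    stack = list(direct_deps)
    while stack:
        artifact = stack.pop()
        if artifact in seen:
            continue
        seen.add(artifact)
        stack.extend(graph.get(artifact, []))
    return seen


# Before the fix: both properties resolve to hadoop-client (a duplicate declaration).
before = resolved_set(["hadoop-client", "hadoop-client"], TOY_GRAPH)
# After the fix: the runtime property points at a transitive dependency instead.
after = resolved_set(["hadoop-client", "hadoop-yarn-api"], TOY_GRAPH)

print(before == after)
```

The sketch ignores versions and scopes; it models set membership only, which is the property the comment relies on.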






[GitHub] [spark] gengliangwang commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927042322


   I tried `build/mvn -DskipTests -Phadoop-2.7 clean install` and it works now. Shall we merge this and start RC5? 




[GitHub] [spark] gengliangwang commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926836526


   Thank you for the fix @sunchao 




[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][hadoop-2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926819781


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48120/
   




[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926868111


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48123/
   




[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926965463


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48127/
   




[GitHub] [spark] dongjoon-hyun commented on a change in pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][hadoop-2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #34100:
URL: https://github.com/apache/spark/pull/34100#discussion_r715808531



##########
File path: launcher/pom.xml
##########
@@ -85,14 +85,33 @@
       <version>${hadoop.version}</version>
       <scope>test</scope>
     </dependency>
-    <dependency>
-      <groupId>org.apache.hadoop</groupId>
-      <artifactId>${hadoop-client-runtime.artifact}</artifactId>
-      <version>${hadoop.version}</version>
-      <scope>test</scope>
-    </dependency>
   </dependencies>
 
+  <profiles>
+    <profile>
+      <id>hadoop-2.7</id>
+    </profile>
+    <profile>
+      <id>hadoop-3.2</id>
+      <activation>
+        <activeByDefault>true</activeByDefault>
+      </activation>
+      <!--
+        Only declare for Hadoop 3.2 profile. Otherwise maven-shade-plugin may stuck in an infinite
+        loop building the dependency-reduced pom, perhaps due to a maven bug:
+          https://issues.apache.org/jira/browse/MSHADE-148

Review comment:
       Thank you for the pointer. According to the discussion on that issue, it's marked as resolved, but not fixed yet?






[GitHub] [spark] sunchao commented on a change in pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][hadoop-2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
sunchao commented on a change in pull request #34100:
URL: https://github.com/apache/spark/pull/34100#discussion_r715809142



##########
File path: launcher/pom.xml
##########
@@ -85,14 +85,33 @@
       <version>${hadoop.version}</version>
       <scope>test</scope>
     </dependency>
-    <dependency>
-      <groupId>org.apache.hadoop</groupId>
-      <artifactId>${hadoop-client-runtime.artifact}</artifactId>
-      <version>${hadoop.version}</version>
-      <scope>test</scope>
-    </dependency>
   </dependencies>
 
+  <profiles>
+    <profile>
+      <id>hadoop-2.7</id>
+    </profile>
+    <profile>
+      <id>hadoop-3.2</id>
+      <activation>
+        <activeByDefault>true</activeByDefault>
+      </activation>
+      <!--
+        Only declare for Hadoop 3.2 profile. Otherwise maven-shade-plugin may stuck in an infinite
+        loop building the dependency-reduced pom, perhaps due to a maven bug:
+          https://issues.apache.org/jira/browse/MSHADE-148

Review comment:
       Right, it's not fixed yet. See the last comment in the JIRA.






[GitHub] [spark] SparkQA removed a comment on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926792624


   **[Test build #143608 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143608/testReport)** for PR 34100 at commit [`4456fc1`](https://github.com/apache/spark/commit/4456fc150a1ac0da6b8b2501976772311fefdb55).




[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926975265


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48127/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926890689








[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926993508


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48130/
   




[GitHub] [spark] JoshRosen commented on a change in pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
JoshRosen commented on a change in pull request #34100:
URL: https://github.com/apache/spark/pull/34100#discussion_r715946127



##########
File path: pom.xml
##########
@@ -3273,7 +3273,7 @@
         <curator.version>2.7.1</curator.version>
         <commons-io.version>2.4</commons-io.version>
         <hadoop-client-api.artifact>hadoop-client</hadoop-client-api.artifact>
-        <hadoop-client-runtime.artifact>hadoop-client</hadoop-client-runtime.artifact>
+        <hadoop-client-runtime.artifact>hadoop-yarn-api</hadoop-client-runtime.artifact>

Review comment:
       Ahhh, this is a clever fix:
   
   Instead of the `hadoop-2.7` profile resulting in a duplicate direct dependency on `hadoop-client`, we now just declare an explicit dependency on one of `hadoop-client`'s transitive dependencies (`hadoop-yarn-api` in this case). Anything which depends on `hadoop-client-runtime.artifact` must also depend on `hadoop-client-api.artifact`, so this doesn't end up changing the set of dependencies pulled in.
   
   It looks like we didn't need to do that for `hadoop-client-minicluster.artifact` because that's only used in the `resource-managers/yarn` POM and that's already using Maven profiles to control the dependency selection (so the other workaround is less invasive in that context). In principle, though, I guess we could have changed that to some other transitive dep.
   
   ---
   
   Could you maybe add a one or two line comment above these Hadoop 2.7 lines to explain what's going on? And maybe edit the comment at https://github.com/apache/spark/blob/d73562ed3635bb3454ac67029ca6541b30ae0c02/pom.xml#L251-L255 to reflect this change? This fix is clever but a little subtle, so I think a comment calling it out (and maybe mentioning SPARK-36835) might help future readers.






[GitHub] [spark] sunchao commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
sunchao commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926892252


   Thanks @JoshRosen! It's interesting that this error is not reported in the CI jobs. So the enforcer rule is only executed in the `install` phase but not in `package`? 
   
   Let me double-check locally and add the changes to all the necessary modules.




[GitHub] [spark] AmplabJenkins commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927003864


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143618/
   




[GitHub] [spark] SparkQA removed a comment on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926953096


   **[Test build #143615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143615/testReport)** for PR 34100 at commit [`d73562e`](https://github.com/apache/spark/commit/d73562ed3635bb3454ac67029ca6541b30ae0c02).




[GitHub] [spark] LuciferYang commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927224824


   branch-3.2 also seems to need this fix
   
   




[GitHub] [spark] gengliangwang commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927499719


   @sunchao yes let's try option 1




[GitHub] [spark] sunchao commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][hadoop-2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
sunchao commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926781659


   let me test both Hadoop profiles here




[GitHub] [spark] sunchao commented on a change in pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
sunchao commented on a change in pull request #34100:
URL: https://github.com/apache/spark/pull/34100#discussion_r715955928



##########
File path: pom.xml
##########
@@ -3273,7 +3273,7 @@
         <curator.version>2.7.1</curator.version>
         <commons-io.version>2.4</commons-io.version>
         <hadoop-client-api.artifact>hadoop-client</hadoop-client-api.artifact>
-        <hadoop-client-runtime.artifact>hadoop-client</hadoop-client-runtime.artifact>
+        <hadoop-client-runtime.artifact>hadoop-yarn-api</hadoop-client-runtime.artifact>

Review comment:
       Thanks for taking a look. Yes, I think it's better to apply the same approach to `hadoop-client-minicluster.artifact`. Let me try that, and perhaps we won't need the changes in YARN's pom.xml with this.
   
   The side effect of this seems to be that it changes the _distance_ of these dependencies from the root module, and thus may make a difference when Maven resolves a dependency that appears with multiple versions (see [here](https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html) for reference). I first tried `hadoop-common` (which carries lots of dependencies) instead of `hadoop-yarn-api`, and the build was not able to compile.
   
   Will update PR description and the comment in the above pom.xml.
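
The distance effect mentioned above can be sketched in a few lines of Python. This is a simplified model of "nearest wins" mediation, assuming a breadth-first walk in declaration order. The artifact names (`wrapper`, `mid`, `other`, `shared`) and the function `mediate` are hypothetical. Real Maven resolution also handles scopes, exclusions, and `dependencyManagement`, which are ignored here.

```python
from collections import deque


def mediate(root_deps, graph):
    """Pick one version per artifact: the declaration nearest the root wins,
    and on a tie the first declaration encountered wins."""
    chosen = {}
    queue = deque((name, version, 1) for name, version in root_deps)
    while queue:
        name, version, depth = queue.popleft()
        if name in chosen:
            continue  # a nearer (or earlier) declaration already won
        chosen[name] = (version, depth)
        for child_name, child_version in graph.get((name, version), []):
            queue.append((child_name, child_version, depth + 1))
    return chosen


# Hypothetical graph: (artifact, version) -> list of (artifact, version).
GRAPH = {
    ("wrapper", "1.0"): [("mid", "1.0")],
    ("mid", "1.0"): [("shared", "2.0")],
    ("other", "1.0"): [("shared", "1.0")],
}

# "shared" 2.0 sits at depth 3 (via wrapper -> mid), "shared" 1.0 at depth 2.
before = mediate([("wrapper", "1.0"), ("other", "1.0")], GRAPH)

# Declaring "mid" directly moves "shared" 2.0 up to depth 2, ahead of "other".
after = mediate([("wrapper", "1.0"), ("mid", "1.0"), ("other", "1.0")], GRAPH)

print(before["shared"], after["shared"])
```

In this toy graph, promoting a transitive dependency to a direct one changes which version of `shared` is selected, which is the kind of shift that could explain a compile failure after swapping in an artifact with many dependencies.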
   
   






[GitHub] [spark] sunchao commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
sunchao commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926952762


   Updated the PR to use a different name for `hadoop-client-runtime.artifact`, which is probably a simpler approach. Verified locally with:
   ```
   build/mvn clean install -DskipTests -Phadoop-2.7 -Phive-2.3 -Pmesos -Phive-thriftserver -Pyarn -Pspark-ganglia-lgpl -Pkinesis-asl -Pkubernetes -Phadoop-cloud -Phive
   ```
   and the build is successful.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927003864


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143618/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927001227


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48130/
   




[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926986994


   **[Test build #143615 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143615/testReport)** for PR 34100 at commit [`d73562e`](https://github.com/apache/spark/commit/d73562ed3635bb3454ac67029ca6541b30ae0c02).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926876944


   **[Test build #143608 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143608/testReport)** for PR 34100 at commit [`4456fc1`](https://github.com/apache/spark/commit/4456fc150a1ac0da6b8b2501976772311fefdb55).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926859313


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48120/
   




[GitHub] [spark] gengliangwang commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926836114


   retest this please




[GitHub] [spark] sunchao commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
sunchao commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927345684


   There are two ways to fix this:
   1. Move `spark.yarn.isHadoopProvided` to the Spark parent pom, so that `-Phadoop-3.2` can become the default profile in the YARN module's pom. I don't see any side effects from this. Ideally this property would be more general, such as `spark.isHadoopProvided`.
   2. Move `hadoop-client-runtime.artifact` out of the `-Phadoop-3.2` profile. This should fix the build issue, but anyone using `-Phadoop-provided` to test Hadoop 3.2 could still hit failures.
   
   I'm inclined toward option 1 here, but let me know if you have any thoughts on this.
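   
   As a rough sketch of what option 1 could look like: Maven deactivates an `activeByDefault` profile as soon as any other profile in the same POM is activated explicitly, so a `hadoop-provided` profile living in the YARN POM would switch off a default `hadoop-3.2` profile there. Declaring the property in the parent POM avoids that. The layout below is an assumption for illustration only; the default value `false` and the exact placement are hypothetical, and only the property and profile names come from the discussion.

   ```xml
   <!-- Parent pom.xml (hypothetical layout, not the actual Spark POM) -->
   <properties>
     <!-- Assumed default; the real default may differ -->
     <spark.yarn.isHadoopProvided>false</spark.yarn.isHadoopProvided>
   </properties>

   <profiles>
     <profile>
       <id>hadoop-provided</id>
       <properties>
         <!-- Overridden here, so the YARN POM no longer needs its own
              hadoop-provided profile just to set this property -->
         <spark.yarn.isHadoopProvided>true</spark.yarn.isHadoopProvided>
       </properties>
     </profile>
   </profiles>
   ```

   With the property resolved from the parent, the YARN POM would be free to mark its `hadoop-3.2` profile as active by default without `-Phadoop-provided` turning it off.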




[GitHub] [spark] gengliangwang commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927236621


   Merging to master/3.2. Thanks all!




[GitHub] [spark] dongjoon-hyun commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927198290


   Has your previous opinion changed, @JoshRosen?
   - https://github.com/apache/spark/pull/34100#discussion_r715946127




[GitHub] [spark] dongjoon-hyun commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927237411


   Thank you!




[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926911899


   **[Test build #143611 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143611/testReport)** for PR 34100 at commit [`4456fc1`](https://github.com/apache/spark/commit/4456fc150a1ac0da6b8b2501976772311fefdb55).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] dongjoon-hyun commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][hadoop-2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926827734


   It looks reasonable to me. I believe we need @gengliangwang's sign-off with Hadoop 2.7 testing.




[GitHub] [spark] gengliangwang commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927344060


   @sunchao Yes, it is caused by this one. Again, thanks for looking into this.




[GitHub] [spark] AmplabJenkins commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926890689








[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926998644


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48130/
   




[GitHub] [spark] JoshRosen commented on a change in pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
JoshRosen commented on a change in pull request #34100:
URL: https://github.com/apache/spark/pull/34100#discussion_r715946127



##########
File path: pom.xml
##########
@@ -3273,7 +3273,7 @@
         <curator.version>2.7.1</curator.version>
         <commons-io.version>2.4</commons-io.version>
         <hadoop-client-api.artifact>hadoop-client</hadoop-client-api.artifact>
-        <hadoop-client-runtime.artifact>hadoop-client</hadoop-client-runtime.artifact>
+        <hadoop-client-runtime.artifact>hadoop-yarn-api</hadoop-client-runtime.artifact>

Review comment:
       Ahhh, this is a clever fix:
   
   Instead of the `hadoop-2.7` profile resulting in a duplicate direct dependency on `hadoop-client`, we now just declare an explicit dependency on one of `hadoop-client`'s transitive dependencies (`hadoop-yarn-api` in this case). Anything which depends on `hadoop-client-runtime.artifact` must also depend on `hadoop-client-api.artifact`, so this doesn't end up changing the set of dependencies pulled in.
   
   It looks like we didn't need to do that for `hadoop-client-minicluster.artifact` because that's only used in the `resource-managers/yarn` POM and that's already using Maven profiles to control the dependency selection (so the other workaround is fairly non-invasive in that context). In principle, though, I guess we could have changed that to some other transitive dep.
   
   ---
   
   Could you maybe add a one- or two-line comment above these Hadoop 2.7 lines to explain what's going on? And maybe edit the comment at https://github.com/apache/spark/blob/d73562ed3635bb3454ac67029ca6541b30ae0c02/pom.xml#L251-L255 to reflect this change? This fix is clever but a little subtle, so I think a comment calling it out (and maybe mentioning SPARK-36835) might help future readers.
   
   **Edit:** could you also update the PR description to reflect this final fix? 
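   
   For readers following along, the duplicate described above can be sketched roughly as follows. This is an illustrative fragment, not the actual Spark module POM: the `<dependency>` entries and the use of `${hadoop.version}` are assumptions about how a module consumes the two properties, and only the property names and values come from the diff.

   ```xml
   <!-- hadoop-2.7 profile in the parent pom.xml (values taken from the diff) -->
   <properties>
     <hadoop-client-api.artifact>hadoop-client</hadoop-client-api.artifact>
     <!-- Previously also hadoop-client, which made the two entries below identical -->
     <hadoop-client-runtime.artifact>hadoop-yarn-api</hadoop-client-runtime.artifact>
   </properties>

   <!-- Hypothetical module POM that declares both artifacts -->
   <dependencies>
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>${hadoop-client-api.artifact}</artifactId>
       <version>${hadoop.version}</version>
     </dependency>
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>${hadoop-client-runtime.artifact}</artifactId>
       <version>${hadoop.version}</version>
     </dependency>
   </dependencies>
   ```

   Under this sketch, the old value resolved both entries to `org.apache.hadoop:hadoop-client`, a duplicate direct dependency. With the new value, the second entry resolves to `hadoop-yarn-api`, which `hadoop-client` already brings in transitively, so the resolved dependency set stays the same.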






[GitHub] [spark] SparkQA commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926953096


   **[Test build #143615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143615/testReport)** for PR 34100 at commit [`d73562e`](https://github.com/apache/spark/commit/d73562ed3635bb3454ac67029ca6541b30ae0c02).




[GitHub] [spark] sunchao commented on a change in pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
sunchao commented on a change in pull request #34100:
URL: https://github.com/apache/spark/pull/34100#discussion_r715966900



##########
File path: pom.xml
##########
@@ -3273,7 +3273,7 @@
         <curator.version>2.7.1</curator.version>
         <commons-io.version>2.4</commons-io.version>
         <hadoop-client-api.artifact>hadoop-client</hadoop-client-api.artifact>
-        <hadoop-client-runtime.artifact>hadoop-client</hadoop-client-runtime.artifact>
+        <hadoop-client-runtime.artifact>hadoop-yarn-api</hadoop-client-runtime.artifact>

Review comment:
       Actually, it may not be that useful to change `hadoop-client-minicluster.artifact`, since it is test scope while the other two are compile scope by default. For some reason, setting it to something like `hadoop-mapreduce-client-jobclient` also changes `dev/deps/spark-deps-hadoop-2.7-hive-2.3`.






[GitHub] [spark] AmplabJenkins commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926920193


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143611/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-926987299


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143615/
   

