Posted to reviews@spark.apache.org by dongjoon-hyun <gi...@git.apache.org> on 2018/08/05 22:04:38 UTC

[GitHub] spark pull request #22003: [SPARK-25019][BUILD] Fix orc dependency to use th...

GitHub user dongjoon-hyun opened a pull request:

    https://github.com/apache/spark/pull/22003

    [SPARK-25019][BUILD] Fix orc dependency to use the same exclusion rules

    ## What changes were proposed in this pull request?
    
    While upgrading Apache ORC to 1.5.2 ([SPARK-24576](https://issues.apache.org/jira/browse/SPARK-24576)), the `sql/core` module overrode the exclusion rules of the parent pom file, which causes the published `spark-sql_2.1X` artifacts to have incomplete exclusion rules ([SPARK-25019](https://issues.apache.org/jira/browse/SPARK-25019)). This PR fixes it by moving the newly added exclusion rule to the parent pom. This also fixes the sbt build hack introduced at that time.
    
    ## How was this patch tested?
    
    Pass the existing dependency check and the tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjoon-hyun/spark SPARK-25019

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22003.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22003
    
----
commit a801498f249d7526b64fcb9fe8144325ebb3d4e4
Author: Dongjoon Hyun <do...@...>
Date:   2018-08-05T21:34:25Z

    [SPARK-25019][BUILD] Fix orc dependency to use the same exclusion rules

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22003: [SPARK-25019][BUILD] Fix orc dependency to use th...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22003#discussion_r207985833
  
    --- Diff: sql/core/pom.xml ---
    @@ -90,39 +90,11 @@
           <groupId>org.apache.orc</groupId>
           <artifactId>orc-core</artifactId>
           <classifier>${orc.classifier}</classifier>
    -      <exclusions>
    -        <exclusion>
    -          <groupId>org.apache.hadoop</groupId>
    -          <artifactId>hadoop-hdfs</artifactId>
    -        </exclusion>
    -        <!--
    -          orc-core:nohive doesn't have this dependency, but we adds this to prevent
    -          sbt from getting confused.
    -        -->
    -        <exclusion>
    -          <groupId>org.apache.hive</groupId>
    -          <artifactId>hive-storage-api</artifactId>
    -        </exclusion>
    -      </exclusions>
         </dependency>
         <dependency>
           <groupId>org.apache.orc</groupId>
           <artifactId>orc-mapreduce</artifactId>
           <classifier>${orc.classifier}</classifier>
    -      <exclusions>
    -        <exclusion>
    -          <groupId>org.apache.hadoop</groupId>
    -          <artifactId>hadoop-hdfs</artifactId>
    -        </exclusion>
    -        <!--
    -          orc-core:nohive doesn't have this dependency, but we adds this to prevent
    -          sbt from getting confused.
    -        -->
    -        <exclusion>
    -          <groupId>org.apache.hive</groupId>
    -          <artifactId>hive-storage-api</artifactId>
    -        </exclusion>
    -      </exclusions>
    --- End diff --
    
    That was because the exclusions declared in `sql/core/pom.xml` overrode the parent's exclusions, so the parent's rules were simply ignored. To publish locally, you can use Maven install, e.g. `build/mvn -T C1 -DskipTests clean install`.
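    
    For illustration, here is a rough sketch of the pattern, not the actual Spark pom contents. It assumes the parent pom manages `orc-core` under `dependencyManagement`; the exclusion list is copied from the published `spark-sql_2.11` pom quoted in this thread, and the `${orc.version}` property name is an assumption. As far as I understand Maven's behavior, a child dependency that declares its own `<exclusions>` replaces the managed list instead of merging with it, so the child should declare none.
    
    ```xml
    <!-- Parent pom.xml (sketch): exclusions are declared once, under dependencyManagement. -->
    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>org.apache.orc</groupId>
          <artifactId>orc-core</artifactId>
          <!-- Assumed property name; the thread only shows the resolved version 1.5.2. -->
          <version>${orc.version}</version>
          <classifier>${orc.classifier}</classifier>
          <exclusions>
            <exclusion>
              <groupId>org.apache.hadoop</groupId>
              <artifactId>hadoop-common</artifactId>
            </exclusion>
            <exclusion>
              <groupId>org.apache.hadoop</groupId>
              <artifactId>hadoop-hdfs</artifactId>
            </exclusion>
            <exclusion>
              <groupId>org.apache.hive</groupId>
              <artifactId>hive-storage-api</artifactId>
            </exclusion>
          </exclusions>
        </dependency>
      </dependencies>
    </dependencyManagement>
    
    <!-- sql/core/pom.xml (sketch): no <exclusions> element, so the managed ones above apply. -->
    <dependencies>
      <dependency>
        <groupId>org.apache.orc</groupId>
        <artifactId>orc-core</artifactId>
        <classifier>${orc.classifier}</classifier>
      </dependency>
    </dependencies>
    ```
    
    With this layout, the pom installed by the command above should list all three exclusions for `orc-core`, which can be checked by opening the installed `spark-sql` pom under `~/.m2/repository`.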


---



[GitHub] spark issue #22003: [SPARK-25019][BUILD] Fix orc dependency to use the same ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22003
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94254/


---



[GitHub] spark pull request #22003: [SPARK-25019][BUILD] Fix orc dependency to use th...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22003#discussion_r207888608
  
    --- Diff: sql/core/pom.xml ---
    @@ -90,39 +90,11 @@
           <groupId>org.apache.orc</groupId>
           <artifactId>orc-core</artifactId>
           <classifier>${orc.classifier}</classifier>
    -      <exclusions>
    -        <exclusion>
    -          <groupId>org.apache.hadoop</groupId>
    -          <artifactId>hadoop-hdfs</artifactId>
    -        </exclusion>
    -        <!--
    -          orc-core:nohive doesn't have this dependency, but we adds this to prevent
    -          sbt from getting confused.
    -        -->
    -        <exclusion>
    -          <groupId>org.apache.hive</groupId>
    -          <artifactId>hive-storage-api</artifactId>
    -        </exclusion>
    -      </exclusions>
         </dependency>
         <dependency>
           <groupId>org.apache.orc</groupId>
           <artifactId>orc-mapreduce</artifactId>
           <classifier>${orc.classifier}</classifier>
    -      <exclusions>
    -        <exclusion>
    -          <groupId>org.apache.hadoop</groupId>
    -          <artifactId>hadoop-hdfs</artifactId>
    -        </exclusion>
    -        <!--
    -          orc-core:nohive doesn't have this dependency, but we adds this to prevent
    -          sbt from getting confused.
    -        -->
    -        <exclusion>
    -          <groupId>org.apache.hive</groupId>
    -          <artifactId>hive-storage-api</artifactId>
    -        </exclusion>
    -      </exclusions>
    --- End diff --
    
    @dongjoon-hyun when we publish snapshot artifacts or releases, will the pom for Spark SQL get all of the exclusions defined in the parent pom?


---



[GitHub] spark issue #22003: [SPARK-25019][BUILD] Fix orc dependency to use the same ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22003
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1824/


---



[GitHub] spark issue #22003: [SPARK-25019][BUILD] Fix orc dependency to use the same ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/22003
  
    Hi, @yhuai. Could you review this PR? The following is the new pom file for `spark-sql_2.11`.
    ```
        <dependency>
          <groupId>org.apache.orc</groupId>
          <artifactId>orc-core</artifactId>
          <version>1.5.2</version>
          <classifier>nohive</classifier>
          <scope>compile</scope>
          <exclusions>
            <exclusion>
              <artifactId>hadoop-common</artifactId>
              <groupId>org.apache.hadoop</groupId>
            </exclusion>
            <exclusion>
              <artifactId>hadoop-hdfs</artifactId>
              <groupId>org.apache.hadoop</groupId>
            </exclusion>
            <exclusion>
              <artifactId>hive-storage-api</artifactId>
              <groupId>org.apache.hive</groupId>
            </exclusion>
          </exclusions>
        </dependency>
        <dependency>
          <groupId>org.apache.orc</groupId>
          <artifactId>orc-mapreduce</artifactId>
          <version>1.5.2</version>
          <classifier>nohive</classifier>
          <scope>compile</scope>
          <exclusions>
            <exclusion>
              <artifactId>hadoop-common</artifactId>
              <groupId>org.apache.hadoop</groupId>
            </exclusion>
            <exclusion>
              <artifactId>hadoop-mapreduce-client-core</artifactId>
              <groupId>org.apache.hadoop</groupId>
            </exclusion>
            <exclusion>
              <artifactId>orc-core</artifactId>
              <groupId>org.apache.orc</groupId>
            </exclusion>
            <exclusion>
              <artifactId>hive-storage-api</artifactId>
              <groupId>org.apache.hive</groupId>
            </exclusion>
            <exclusion>
              <artifactId>guava</artifactId>
              <groupId>com.google.guava</groupId>
            </exclusion>
          </exclusions>
        </dependency>
    ```


---



[GitHub] spark issue #22003: [SPARK-25019][BUILD] Fix orc dependency to use the same ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22003
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22003: [SPARK-25019][BUILD] Fix orc dependency to use the same ...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on the issue:

    https://github.com/apache/spark/pull/22003
  
    @dongjoon-hyun  no problem. Thank you!


---



[GitHub] spark issue #22003: [SPARK-25019][BUILD] Fix orc dependency to use the same ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22003
  
    **[Test build #94254 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94254/testReport)** for PR 22003 at commit [`a801498`](https://github.com/apache/spark/commit/a801498f249d7526b64fcb9fe8144325ebb3d4e4).


---



[GitHub] spark issue #22003: [SPARK-25019][BUILD] Fix orc dependency to use the same ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22003
  
    **[Test build #94254 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94254/testReport)** for PR 22003 at commit [`a801498`](https://github.com/apache/spark/commit/a801498f249d7526b64fcb9fe8144325ebb3d4e4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #22003: [SPARK-25019][BUILD] Fix orc dependency to use th...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22003#discussion_r207962501
  
    --- Diff: sql/core/pom.xml ---
    @@ -90,39 +90,11 @@
           <groupId>org.apache.orc</groupId>
           <artifactId>orc-core</artifactId>
           <classifier>${orc.classifier}</classifier>
    -      <exclusions>
    -        <exclusion>
    -          <groupId>org.apache.hadoop</groupId>
    -          <artifactId>hadoop-hdfs</artifactId>
    -        </exclusion>
    -        <!--
    -          orc-core:nohive doesn't have this dependency, but we adds this to prevent
    -          sbt from getting confused.
    -        -->
    -        <exclusion>
    -          <groupId>org.apache.hive</groupId>
    -          <artifactId>hive-storage-api</artifactId>
    -        </exclusion>
    -      </exclusions>
         </dependency>
         <dependency>
           <groupId>org.apache.orc</groupId>
           <artifactId>orc-mapreduce</artifactId>
           <classifier>${orc.classifier}</classifier>
    -      <exclusions>
    -        <exclusion>
    -          <groupId>org.apache.hadoop</groupId>
    -          <artifactId>hadoop-hdfs</artifactId>
    -        </exclusion>
    -        <!--
    -          orc-core:nohive doesn't have this dependency, but we adds this to prevent
    -          sbt from getting confused.
    -        -->
    -        <exclusion>
    -          <groupId>org.apache.hive</groupId>
    -          <artifactId>hive-storage-api</artifactId>
    -        </exclusion>
    -      </exclusions>
    --- End diff --
    
    Thank you. Just so I can understand it better: do you know why defining exclusions in this pom file messed up the pom?
    
    Also, how should I try it out myself? What is the right command to publish locally?


---



[GitHub] spark issue #22003: [SPARK-25019][BUILD] Fix orc dependency to use the same ...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on the issue:

    https://github.com/apache/spark/pull/22003
  
    lgtm. Merging to master.


---



[GitHub] spark issue #22003: [SPARK-25019][BUILD] Fix orc dependency to use the same ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/22003
  
    Thank you, @yhuai !


---



[GitHub] spark issue #22003: [SPARK-25019][BUILD] Fix orc dependency to use the same ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22003
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #22003: [SPARK-25019][BUILD] Fix orc dependency to use th...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22003


---



[GitHub] spark pull request #22003: [SPARK-25019][BUILD] Fix orc dependency to use th...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22003#discussion_r207986831
  
    --- Diff: sql/core/pom.xml ---
    @@ -90,39 +90,11 @@
           <groupId>org.apache.orc</groupId>
           <artifactId>orc-core</artifactId>
           <classifier>${orc.classifier}</classifier>
    -      <exclusions>
    -        <exclusion>
    -          <groupId>org.apache.hadoop</groupId>
    -          <artifactId>hadoop-hdfs</artifactId>
    -        </exclusion>
    -        <!--
    -          orc-core:nohive doesn't have this dependency, but we adds this to prevent
    -          sbt from getting confused.
    -        -->
    -        <exclusion>
    -          <groupId>org.apache.hive</groupId>
    -          <artifactId>hive-storage-api</artifactId>
    -        </exclusion>
    -      </exclusions>
         </dependency>
         <dependency>
           <groupId>org.apache.orc</groupId>
           <artifactId>orc-mapreduce</artifactId>
           <classifier>${orc.classifier}</classifier>
    -      <exclusions>
    -        <exclusion>
    -          <groupId>org.apache.hadoop</groupId>
    -          <artifactId>hadoop-hdfs</artifactId>
    -        </exclusion>
    -        <!--
    -          orc-core:nohive doesn't have this dependency, but we adds this to prevent
    -          sbt from getting confused.
    -        -->
    -        <exclusion>
    -          <groupId>org.apache.hive</groupId>
    -          <artifactId>hive-storage-api</artifactId>
    -        </exclusion>
    -      </exclusions>
    --- End diff --
    
    got it. Thank you.


---



[GitHub] spark pull request #22003: [SPARK-25019][BUILD] Fix orc dependency to use th...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22003#discussion_r207960480
  
    --- Diff: sql/core/pom.xml ---
    @@ -90,39 +90,11 @@
           <groupId>org.apache.orc</groupId>
           <artifactId>orc-core</artifactId>
           <classifier>${orc.classifier}</classifier>
    -      <exclusions>
    -        <exclusion>
    -          <groupId>org.apache.hadoop</groupId>
    -          <artifactId>hadoop-hdfs</artifactId>
    -        </exclusion>
    -        <!--
    -          orc-core:nohive doesn't have this dependency, but we adds this to prevent
    -          sbt from getting confused.
    -        -->
    -        <exclusion>
    -          <groupId>org.apache.hive</groupId>
    -          <artifactId>hive-storage-api</artifactId>
    -        </exclusion>
    -      </exclusions>
         </dependency>
         <dependency>
           <groupId>org.apache.orc</groupId>
           <artifactId>orc-mapreduce</artifactId>
           <classifier>${orc.classifier}</classifier>
    -      <exclusions>
    -        <exclusion>
    -          <groupId>org.apache.hadoop</groupId>
    -          <artifactId>hadoop-hdfs</artifactId>
    -        </exclusion>
    -        <!--
    -          orc-core:nohive doesn't have this dependency, but we adds this to prevent
    -          sbt from getting confused.
    -        -->
    -        <exclusion>
    -          <groupId>org.apache.hive</groupId>
    -          <artifactId>hive-storage-api</artifactId>
    -        </exclusion>
    -      </exclusions>
    --- End diff --
    
    Yes. The above snippet comes from the pom file published locally on my laptop: `~/.m2/repository/org/apache/spark/spark-sql_2.11/2.4.0-SNAPSHOT/spark-sql_2.11-2.4.0-SNAPSHOT.pom`. This works the same way as before.


---
