Posted to reviews@spark.apache.org by JoshRosen <gi...@git.apache.org> on 2014/08/27 22:29:17 UTC

[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/2165

    [SPARK-3061] Fix Maven build under Windows

    The Maven build was failing on Windows because it tried to call the Unix `unzip` utility to extract the Py4J files into core's build directory. I've fixed this by using the `maven-antrun-plugin` to perform the unzipping.
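    
    A minimal sketch of what such a plugin block might look like, assuming a recent `maven-antrun-plugin` release in which `<target>` takes the place of the older `<tasks>` element. The execution id and the `dest` directory are hypothetical names chosen for illustration and are not taken from the patch:
    
    ```xml
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-antrun-plugin</artifactId>
      <executions>
        <execution>
          <!-- Hypothetical execution id, chosen for this example -->
          <id>unzip-py4j</id>
          <phase>generate-resources</phase>
          <goals>
            <goal>run</goal>
          </goals>
          <configuration>
            <target>
              <!-- Ant's built-in unzip task runs inside the JVM,
                   so no external unzip binary has to be on the PATH -->
              <!-- The dest directory below is an assumption for illustration -->
              <unzip src="${project.basedir}/../python/lib/py4j-0.8.2.1-src.zip"
                     dest="${project.build.directory}/py4j" />
            </target>
          </configuration>
        </execution>
      </executions>
    </plugin>
    ```
    
    Because the extraction is done by Ant rather than by a platform-specific executable, the same configuration should behave the same way on Windows, Linux, and OS X.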
    
    I also fixed an issue that prevented tests from running under Windows:
    
    In the Maven ScalaTest plugin, the filename listed in <filereports> is placed under the <reportsDirectory>; the current code places it in a subdirectory of reportsDirectory, e.g.
    
    ```
    ${project.build.directory}/surefire-reports/${project.build.directory}/SparkTestSuite.txt
    ```
    
    This caused problems under Windows because it would try to create a subdirectory named "c:\\".
    
    Note that the tests still fail under Windows (for other reasons); this PR just allows them to run and fail rather than crash when trying to create the test reports directory.
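    
    The `<filereports>` fix described above can be sketched as follows. This is an illustration of the idea rather than a copy of the patch: the value of `<filereports>` is a bare file name, because the plugin already resolves it against `<reportsDirectory>`.
    
    ```xml
    <plugin>
      <groupId>org.scalatest</groupId>
      <artifactId>scalatest-maven-plugin</artifactId>
      <configuration>
        <reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
        <!-- Resolved relative to reportsDirectory, so no directory prefix here.
             Prefixing it with ${project.build.directory} would nest an absolute
             path (including a drive letter on Windows) under reportsDirectory. -->
        <filereports>SparkTestSuite.txt</filereports>
      </configuration>
    </plugin>
    ```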


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark windows-support

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2165.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2165
    
----
commit 4994af1a168656f910a236a8b95dd02ae17b2e44
Author: Josh Rosen <jo...@apache.org>
Date:   2014-08-26T00:10:51Z

    [SPARK-3061] Use maven-antrun-plugin to unzip Py4J.
    
    This fixes the Maven build on Windows.

commit e347668f5ae522b267966fd98a097eb9ca6e302a
Author: Josh Rosen <jo...@apache.org>
Date:   2014-08-26T05:41:28Z

    Fix Maven scalatest filereports path:
    
    The filename listed in <filereports> is placed under the <reportsDirectory>;
    the current code places it in a subdirectory of reportsDirectory, e.g.
    
    ${project.build.directory}/surefire-reports/${project.build.directory}/SparkTestSuite.txt
    
    This caused problems under Windows because it would try to create
    a subdirectory with a colon in its name.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2165#discussion_r16803715
  
    --- Diff: core/pom.xml ---
    @@ -306,26 +306,20 @@
           </plugin>
           <!-- Unzip py4j so we can include its files in the jar -->
           <plugin>
    -        <groupId>org.codehaus.mojo</groupId>
    -        <artifactId>exec-maven-plugin</artifactId>
    -        <version>1.2.1</version>
    +        <groupId>org.apache.maven.plugins</groupId>
    +        <artifactId>maven-antrun-plugin</artifactId>
             <executions>
               <execution>
                 <phase>generate-resources</phase>
                 <goals>
    -              <goal>exec</goal>
    +              <goal>run</goal>
                 </goals>
               </execution>
             </executions>
             <configuration>
    -          <executable>unzip</executable>
    -          <workingDirectory>../python</workingDirectory>
    -          <arguments>
    -            <argument>-o</argument>
    -            <argument>lib/py4j*.zip</argument>
    -            <argument>-d</argument>
    -            <argument>build</argument>
    -          </arguments>
    +          <tasks>
    +              <unzip src="../python/lib/py4j-0.8.2.1-src.zip" dest="build" />
    --- End diff --
    
    nit: 2 tabs instead of 4


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-53636500
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19334/consoleFull) for   PR 2165 at commit [`e347668`](https://github.com/apache/spark/commit/e347668f5ae522b267966fd98a097eb9ca6e302a).
     * This patch merges cleanly.


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-53639832
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19344/consoleFull) for   PR 2165 at commit [`fbf3e61`](https://github.com/apache/spark/commit/fbf3e61cad67eee54dd1eccaac8983fdfc9a6be8).
     * This patch merges cleanly.


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-53819810
  
    @andrewor14 Unless the new unzipping code is somehow corrupting the files or unzipping them to a different location, won't it be sufficient to check that the same files are present in the assembly JAR without having to run an actual YARN cluster?


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/2165


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-53819971
  
    I suppose so. I am extra cautious about this because the whole PySpark on YARN thing is somewhat fickle. On a related note, I was able to verify that the Python files were actually there.


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-53644802
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19334/consoleFull) for   PR 2165 at commit [`e347668`](https://github.com/apache/spark/commit/e347668f5ae522b267966fd98a097eb9ca6e302a).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class MutableLiteral(var value: Any, dataType: DataType, nullable: Boolean = true) `



[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-53655072
  
    test this please


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-53810408
  
    Jenkins, retest this please.


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-53819044
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19433/consoleFull) for   PR 2165 at commit [`651d210`](https://github.com/apache/spark/commit/651d210a1b3bd0535078b4e2018d7a0d012d4268).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-54189498
  
    OK, I will add a reminder for myself to backport this after the release. Thanks, Josh.


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-53810682
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19433/consoleFull) for   PR 2165 at commit [`651d210`](https://github.com/apache/spark/commit/651d210a1b3bd0535078b4e2018d7a0d012d4268).
     * This patch merges cleanly.


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-53819474
  
    I tested this on Windows and it works there. I still have to test PySpark on YARN with this PR because it might break things there.


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-53819995
  
    Re: can this break something -- we have seen in recent memory how zip behavior can be inconsistent with archives of >= 65536 files, which comes up with Spark assembly jars. I assume that's not true of the .zip file being unzipped here though. In that case this does seem like something that will work everywhere if it works anywhere.


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-54119183
  
    Hey @andrewor14, you can merge this into master, leave the JIRA unresolved, set targetVersion 1.1.1 and fixVersion 1.2.0, and make it a blocker. That way it will be clear that it needs to be backported.


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-54101063
  
    LGTM. I tested this extensively locally on Windows and OS X, and remotely on an HDP cluster and on a standalone cluster. PySpark works as expected in all environments.


[GitHub] spark pull request: [SPARK-3061] Fix Maven build under Windows

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2165#issuecomment-53645973
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19344/consoleFull) for   PR 2165 at commit [`fbf3e61`](https://github.com/apache/spark/commit/fbf3e61cad67eee54dd1eccaac8983fdfc9a6be8).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.

