Posted to reviews@spark.apache.org by srowen <gi...@git.apache.org> on 2014/08/31 19:27:45 UTC

[GitHub] spark pull request: SPARK-3330 [BUILD] Successive test runs with d...

GitHub user srowen opened a pull request:

    https://github.com/apache/spark/pull/2221

    SPARK-3330 [BUILD] Successive test runs with different profiles fail SparkSubmitSuite

    Maven-based Jenkins builds have been failing for a while:
    https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-Master-Maven-with-YARN/480/HADOOP_PROFILE=hadoop-2.4,label=centos/console
    
    One common cause is that on the second and subsequent runs of "mvn clean test", at least two assembly JARs will exist in assembly/target. Because assembly is not a submodule of parent, "mvn clean" is not invoked for assembly. The presence of two assembly jars causes spark-submit to fail.
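
A minimal sketch of a check that would surface this condition before the tests run. It assumes the assembly JARs match `spark-assembly*.jar`, the naming used elsewhere in this thread; the function names are hypothetical and are not part of Spark's scripts.

```shell
#!/bin/sh
# Hypothetical helpers, not Spark code. The file-name pattern is an
# assumption based on the JAR name that appears in this thread.

# Print the number of assembly JARs found directly inside directory $1.
count_assembly_jars() {
    dir="$1"
    count=0
    for jar in "$dir"/spark-assembly*.jar; do
        # If nothing matches, the glob stays literal, so test for existence.
        if [ -e "$jar" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Return 0 when exactly one assembly JAR exists, non-zero otherwise.
check_single_assembly() {
    n=$(count_assembly_jars "$1")
    if [ "$n" -ne 1 ]; then
        echo "expected exactly 1 assembly JAR in $1, found $n" >&2
        return 1
    fi
    return 0
}
```

For example, `check_single_assembly assembly/target` could run before the test step so that leftover JARs from an earlier profile are reported directly instead of showing up as a SparkSubmitSuite failure.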

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srowen/spark SPARK-3330

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2221.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2221
    
----
commit 6d189ea166656135ad5803c59a4ddfc973f79c60
Author: Sean Owen <so...@cloudera.com>
Date:   2014-08-31T17:23:18Z

    Clean assembly/target from parent clean target since assembly is not a submodule. Otherwise multiple assemblies can accumulate during, say, Jenkins tests and fail unit tests

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3330 [BUILD] Successive test runs with d...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2221#issuecomment-54037958
  
    Yes, PS, I did verify that this was the cause by changing the code to print the stderr from the command that fails in SparkSubmitSuite. The failure was due to multiple assemblies, and I could reproduce the multiple assemblies locally by running what Jenkins does. I can't think of a reason that a top-level "mvn clean" *shouldn't* remove assemblies. So yeah, 99% sure this is the fix.




[GitHub] spark pull request: SPARK-3330 [BUILD] Successive test runs with d...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/2221#issuecomment-54214751
  
    Hm - I guess if I run `mvn clean test` it actually doesn't work. This is really surprising; I thought that `mvn clean test` just runs the entire clean target first.
    
    ```
    $ mkdir assembly/target && touch assembly/target/spark-assembly-1.0.2-hadoop1.0.4.jar
    $ mvn clean test
    <CTRL-C>
    $  ls assembly/target/
    spark-assembly-1.0.2-hadoop1.0.4.jar
    ```




[GitHub] spark pull request: SPARK-3330 [BUILD] Successive test runs with d...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2221#issuecomment-53995057
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19542/consoleFull) for   PR 2221 at commit [`6d189ea`](https://github.com/apache/spark/commit/6d189ea166656135ad5803c59a4ddfc973f79c60).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3330 [BUILD] Successive test runs with d...

Posted by srowen <gi...@git.apache.org>.
Github user srowen closed the pull request at:

    https://github.com/apache/spark/pull/2221




[GitHub] spark pull request: SPARK-3330 [BUILD] Successive test runs with d...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2221#issuecomment-54216391
  
    @pwendell Ah, of course. It's just that it runs "clean, test" for each module, not "clean" for all modules, then "test" for all modules. So when a subsequent "clean package" kicks off, it doesn't get to cleaning the assembly until after tests like core have already failed.
    
    So, this PR is kind of a hack, which also forcibly deletes assembly/target right at the start in the parent's clean lifecycle. A different, perhaps better answer is to "mvn clean && mvn ... package" in Jenkins.
    
    WDYT?
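
The ordering described in this comment can be illustrated with a small simulation. This is only a sketch and does not invoke Maven: the two module names, the stale JAR names, and the rule that core's tests fail when more than one assembly JAR is present are simplifying assumptions taken from the discussion above.

```shell
#!/bin/sh
# Simulation only; all function names here are hypothetical.

# Create a fake checkout in which earlier runs left two assembly JARs behind.
make_stale_tree() {
    root=$(mktemp -d)
    mkdir -p "$root/core/target" "$root/assembly/target"
    touch "$root/assembly/target/spark-assembly-hadoop1.jar"
    touch "$root/assembly/target/spark-assembly-hadoop2.jar"
    echo "$root"
}

# "clean" for one module ($2) under root ($1) removes its target directory.
clean_module() {
    rm -rf "$1/$2/target"
}

# Assumed rule: core's tests fail when more than one assembly JAR exists.
test_core() {
    count=0
    for jar in "$1"/assembly/target/spark-assembly*.jar; do
        if [ -e "$jar" ]; then
            count=$((count + 1))
        fi
    done
    [ "$count" -le 1 ]
}

# One invocation with both phases: clean and test run module by module,
# so core is tested before assembly has been cleaned.
interleaved_run() {
    clean_module "$1" core
    test_core "$1" || { echo "core tests FAILED"; return 1; }
    clean_module "$1" assembly
    echo "all tests passed"
}

# Two invocations: every module is cleaned first, then tests run.
two_pass_run() {
    clean_module "$1" core
    clean_module "$1" assembly
    test_core "$1" || { echo "core tests FAILED"; return 1; }
    echo "all tests passed"
}
```

Running `interleaved_run "$(make_stale_tree)"` reports the core failure and leaves the stale JARs in place, while `two_pass_run "$(make_stale_tree)"` removes them before any test runs, which is the effect the separate "mvn clean" step is meant to achieve.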




[GitHub] spark pull request: SPARK-3330 [BUILD] Successive test runs with d...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2221#issuecomment-53997191
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19542/consoleFull) for   PR 2221 at commit [`6d189ea`](https://github.com/apache/spark/commit/6d189ea166656135ad5803c59a4ddfc973f79c60).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: SPARK-3330 [BUILD] Successive test runs with d...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2221#issuecomment-54215117
  
    @pwendell Huh! yeah I see the same thing. Jenkins is running "clean package". Let me see if I can dig out why the behavior is different. The difference must be that assembly's clean is not run with "mvn clean" but is when other lifecycles are included. That would explain the behaviors seen so far. But I'd like to figure out why.




[GitHub] spark pull request: SPARK-3330 [BUILD] Successive test runs with d...

Posted by ScrapCodes <gi...@git.apache.org>.
Github user ScrapCodes commented on the pull request:

    https://github.com/apache/spark/pull/2221#issuecomment-54022209
  
    Hey @srowen Thanks for fixing this. I feel your argument is plausible, so I am not verifying it. The change looks reasonable too.
    
    Looks good to me.




[GitHub] spark pull request: SPARK-3330 [BUILD] Successive test runs with d...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/2221#issuecomment-54213979
  
    Hey @srowen - I'm having trouble reproducing the issue. If I run "mvn clean" it seems to always delete the entire `assembly/target` directory, regardless of which profiles are enabled.
    
    ```
    $ mkdir assembly/target && touch assembly/target/spark-assembly-1.0.2-hadoop1.0.4.jar
    $ mvn clean
    $  ls assembly/target/
    ls: cannot access assembly/target/: No such file or directory
    ```
    
    One thing - I did update our test harness so that it does a full git force-clean before the tests run (we recently did this to the pull request builder). So maybe this will separately fix the issue.

