Posted to reviews@spark.apache.org by liorregev <gi...@git.apache.org> on 2017/05/15 13:59:20 UTC

[GitHub] spark pull request #17986: [SPARK-20741][Spark Submit] Added cleanup of JARs...

GitHub user liorregev opened a pull request:

    https://github.com/apache/spark/pull/17986

    [SPARK-20741][Spark Submit] Added cleanup of JARs archive generated by SparkSubmit

    ## What changes were proposed in this pull request?
    
    Deleted generated JARs archive after distribution to HDFS
    
    ## How was this patch tested?
    
    Please review http://spark.apache.org/contributing.html before opening a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liorregev/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17986.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17986
    
----
commit cb03d8a891d1c1010df40abb8aa6f6221977d97f
Author: Lior Regev <li...@gmail.com>
Date:   2017-05-15T09:04:07Z

    [SPARK-20741] Added cleanup of JARs archive generated by SparkSubmit

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17986: [SPARK-20741][Spark Submit] Added cleanup of JARs archiv...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17986
  
    **[Test build #3751 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3751/testReport)** for PR 17986 at commit [`cb03d8a`](https://github.com/apache/spark/commit/cb03d8a891d1c1010df40abb8aa6f6221977d97f).




[GitHub] spark issue #17986: [SPARK-20741][Spark Submit] Added cleanup of JARs archiv...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/17986
  
    That seems OK to me. It might be a good time to address similar issues elsewhere. For instance, look at `Client.createConfArchive`. The one place it's called, I think the file can be deleted after it's uploaded. There are a few other potential situations like this we could clean up.
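
The "delete after upload" pattern suggested above can be sketched as follows. This is a minimal illustration and not the code in Spark's `Client`: the `upload` helper is a hypothetical stand-in that copies to a local directory instead of HDFS, and the class and method names are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ArchiveCleanupSketch {

    // Hypothetical stand-in for the upload step; real code would copy to HDFS.
    static Path upload(Path localArchive, Path remoteDir) throws IOException {
        Path target = remoteDir.resolve(localArchive.getFileName());
        return Files.copy(localArchive, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Creates a local archive, uploads it, and always deletes the local copy,
    // even if the upload throws.
    static Path createUploadAndClean(Path remoteDir) throws IOException {
        Path archive = Files.createTempFile("archive_sketch", ".zip");
        try {
            Files.write(archive, new byte[] {1, 2, 3}); // placeholder contents
            return upload(archive, remoteDir);
        } finally {
            Files.deleteIfExists(archive);
        }
    }

    public static void main(String[] args) throws IOException {
        Path remoteDir = Files.createTempDirectory("fake_remote");
        Path uploaded = createUploadAndClean(remoteDir);
        System.out.println("uploaded copy exists: " + Files.exists(uploaded));
    }
}
```

Putting the delete in `finally` means the local file does not outlive the call, whether or not the upload succeeds.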




[GitHub] spark pull request #17986: [SPARK-20741][Spark Submit] Added cleanup of JARs...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/17986




[GitHub] spark issue #17986: [SPARK-20741][Spark Submit] Added cleanup of JARs archiv...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/17986
  
    I don't think this is really necessary. These files are created in `Utils.getLocalDir`, which on the launcher side is a temporary directory (see `Utils.getOrCreateLocalRootDirsImpl`), so they are deleted as soon as the launcher exits.
    
    If you really want to fix this instance, it may be better to follow Sean's suggestion and fix all instances, creating an explicit temporary directory where the files are stored. All this is going to do, though, is to delete the files earlier - they'd still be deleted when the process exits.
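
The explicit temporary directory mentioned above could look roughly like the sketch below. It is an illustration under stated assumptions, not Spark's implementation: `ScopedTempDir` and its methods are hypothetical names, and the directory is removed when the scope closes instead of waiting for the process to exit.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: owns one temporary directory and deletes it on close().
public class ScopedTempDir implements AutoCloseable {
    private final Path dir;

    public ScopedTempDir(String prefix) throws IOException {
        this.dir = Files.createTempDirectory(prefix);
    }

    public Path path() {
        return dir;
    }

    // Every generated file lives under the owned directory.
    public Path newFile(String prefix, String suffix) throws IOException {
        return Files.createTempFile(dir, prefix, suffix);
    }

    @Override
    public void close() throws IOException {
        if (!Files.exists(dir)) {
            return; // already cleaned up
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order so children are deleted before their parent directory.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.deleteIfExists(p);
        }
    }

    public static void main(String[] args) throws IOException {
        Path archive;
        try (ScopedTempDir tmp = new ScopedTempDir("submit_sketch")) {
            archive = tmp.newFile("jars_archive", ".zip");
            // ... upload the archive here ...
        }
        System.out.println("archive still exists: " + Files.exists(archive));
    }
}
```

Scoping cleanup to the submission in this way matters mostly for long-lived processes, where "deleted when the process exits" may effectively mean never.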




[GitHub] spark issue #17986: [SPARK-20741][Spark Submit] Added cleanup of JARs archiv...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/17986
  
    > I just figured it would have been a better solution.
    
    It might be a good idea to do it, but then you can't just add this one line; you have to look at all the temp files that `Client.scala` generates.




[GitHub] spark issue #17986: [SPARK-20741][Spark Submit] Added cleanup of JARs archiv...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/17986
  
    @liorregev if you'll take care of a couple other cases like this here, it looks OK to merge. Proactively cleaning up seems reasonable.




[GitHub] spark issue #17986: [SPARK-20741][Spark Submit] Added cleanup of JARs archiv...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17986
  
    **[Test build #3753 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3753/testReport)** for PR 17986 at commit [`cb03d8a`](https://github.com/apache/spark/commit/cb03d8a891d1c1010df40abb8aa6f6221977d97f).




[GitHub] spark issue #17986: [SPARK-20741][Spark Submit] Added cleanup of JARs archiv...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/17986
  
    Merged to master/2.2. It's a win and on second look it wasn't obvious that there's another instance of this that can safely be cleaned up.




[GitHub] spark issue #17986: [SPARK-20741][Spark Submit] Added cleanup of JARs archiv...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17986
  
    **[Test build #3751 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3751/testReport)** for PR 17986 at commit [`cb03d8a`](https://github.com/apache/spark/commit/cb03d8a891d1c1010df40abb8aa6f6221977d97f).
     * This patch passes all tests.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.




[GitHub] spark issue #17986: [SPARK-20741][Spark Submit] Added cleanup of JARs archiv...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17986
  
    **[Test build #3753 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3753/testReport)** for PR 17986 at commit [`cb03d8a`](https://github.com/apache/spark/commit/cb03d8a891d1c1010df40abb8aa6f6221977d97f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #17986: [SPARK-20741][Spark Submit] Added cleanup of JARs archiv...

Posted by liorregev <gi...@git.apache.org>.
Github user liorregev commented on the issue:

    https://github.com/apache/spark/pull/17986
  
    Actually, I ran into a problem with this not getting cleaned up.
    After your explanation I can understand why it wasn't deleted.
    I am running Spark on EMR, and the easiest way to submit applications to the cluster programmatically was to create an HTTP service that accepts the application details and calls `SparkSubmit.main`, so the process never really exits.
    I managed to solve this with spark.yarn. 




[GitHub] spark issue #17986: [SPARK-20741][Spark Submit] Added cleanup of JARs archiv...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17986
  
    Can one of the admins verify this patch?

