Posted to reviews@spark.apache.org by ChenjunZou <gi...@git.apache.org> on 2017/10/11 03:38:48 UTC

[GitHub] spark pull request #19469: [SPARK-22243][DStreams]spark.yarn.jars reload fro...

GitHub user ChenjunZou opened a pull request:

    https://github.com/apache/spark/pull/19469

    [SPARK-22243][DStreams]spark.yarn.jars reload from config when Checkpoint recovery

    Add spark.yarn.jars to the list of configs that are reloaded on checkpoint recovery.
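
As a rough illustration of what reloading on recovery means here, the sketch below models a configuration as a plain dictionary. All names in it (`PROPERTIES_TO_RELOAD`, `recover_conf`) and the sample values are hypothetical and are not taken from Spark's source. The idea it assumes is that most settings come back from the checkpoint, while the listed properties take their values from the configuration of the restarted application.

```python
# Hypothetical model of the idea, not Spark's actual code.
# Keys listed here are taken from the current (restarted) configuration
# instead of the values stored in the checkpoint.
PROPERTIES_TO_RELOAD = [
    "spark.master",
    "spark.yarn.jars",  # the property this pull request adds
]


def recover_conf(checkpointed, current, reload_keys=PROPERTIES_TO_RELOAD):
    """Return the configuration to use after recovering from a checkpoint."""
    recovered = dict(checkpointed)  # start from the checkpointed values
    for key in reload_keys:
        if key in current:  # only override keys the new run actually sets
            recovered[key] = current[key]
    return recovered


# Illustrative values only.
checkpointed = {"spark.app.name": "demo", "spark.yarn.jars": "hdfs:///old/jars/*"}
current = {"spark.app.name": "renamed", "spark.yarn.jars": "hdfs:///new/jars/*"}
recovered = recover_conf(checkpointed, current)
```

In this sketch `spark.yarn.jars` follows the new submission while `spark.app.name`, which is not in the list, keeps its checkpointed value.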

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ChenjunZou/spark checkpoint-yarn-jars

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19469.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19469
    
----
commit 7a03073d0cdca3e65ada1e710a96cfadfa332d05
Author: ZouChenjun <zo...@youzan.com>
Date:   2017-10-10T12:34:07Z

    set spark.yarn.jars reload from config

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    Sorry for the delay. I don't think it's worth designing a new feature that's only for DStream. Instead, I would encourage people to use Structured Streaming in the new Spark versions.
    
    I'm okay with just merging this PR and future minor PRs (I don't think there will be many) for similar issues. @ChenjunZou could you reopen this PR, please?


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    FYI, looks like @ChenjunZou opened a new PR #19637 rather than reopening this. Since the content is the same, I just merged #19637.


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    @ssaavedra, yes, I think so. With k8s support being pulled in, I would guess more configurations will need to be added to the exclusion rule. With the current solution, fixing these one PR at a time doesn't make much sense. We should either figure out a general solution or refactor this part.
    
    Besides, as we have moved to Structured Streaming, do we need to put more effort into these issues? @zsxwing


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    **[Test build #82621 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82621/testReport)** for PR 19469 at commit [`7a03073`](https://github.com/apache/spark/commit/7a03073d0cdca3e65ada1e710a96cfadfa332d05).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    LGTM. cc @jerryshao to take a look.


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    **[Test build #82621 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82621/testReport)** for PR 19469 at commit [`7a03073`](https://github.com/apache/spark/commit/7a03073d0cdca3e65ada1e710a96cfadfa332d05).


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    ok to test


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by ssaavedra <gi...@git.apache.org>.
Github user ssaavedra commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    I think that may be a good idea. I'd say this can depend on the scheduler. Should that be discussed under a different JIRA number?


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82621/
    Test PASSed.


---



[GitHub] spark pull request #19469: [SPARK-22243][DStreams]spark.yarn.jars reload fro...

Posted by ChenjunZou <gi...@git.apache.org>.
Github user ChenjunZou closed the pull request at:

    https://github.com/apache/spark/pull/19469


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    > sure, @zsxwing please be aware of apache-spark-on-k8s#516 and #19427
    
    Yeah. I'm aware of them. I will review #19427.
    



---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    Ah, I didn't realize there was a change in that PR.
    I agree we need a better solution.
    
    



---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    Sure, @zsxwing please be aware of https://github.com/apache-spark-on-k8s/spark/pull/516 and https://github.com/apache/spark/pull/19427


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by ssaavedra <gi...@git.apache.org>.
Github user ssaavedra commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    The one I submitted here https://issues.apache.org/jira/browse/SPARK-22294 is not Kubernetes-related as such, although it does affect deployments in Kubernetes. It should affect any spark-submit done with a custom bindAddress.
    
    But yes, there might be a k8s-related config at some point, just like there is specific configuration for YARN already.
    
    But the currently proposed PRs are just bugs caught at some point by other PRs. Maybe they should be merged and then a more general architecture can be proposed? This way we'll have all the properties known to need reloading already in the code before refactoring.
    
    Is there anything we should know about Structured Streaming with regard to checkpoints?


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    @jerryshao I think it's a good idea to have a design that works with different resource managers.
    
    As of now, though, there is only one additional config needed for k8s, and one for YARN from this PR.
    What's next?


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    There's a similar PR, #19427. I was wondering if we can provide a general solution for such issues, such as a configuration that specifies all the confs which need to be reloaded, e.g. spark.streaming.confsToReload = spark.yarn.jars,spark.xx.xx, so that we don't need to fix related issues again and again. What do you think?
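
The proposal in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: `spark.streaming.confsToReload` is the setting name suggested in the comment, not an existing Spark option, the helper names `parse_confs_to_reload` and `recover_with_configured_keys` are hypothetical, and configurations are modelled as plain dictionaries. The sketch also assumes the list is read from the configuration of the restarted application.

```python
# Hypothetical sketch of a user-configurable reload list.
# "spark.streaming.confsToReload" is the name proposed in the comment above;
# it is assumed here, not an existing Spark setting.
RELOAD_LIST_KEY = "spark.streaming.confsToReload"


def parse_confs_to_reload(conf):
    """Split the comma-separated setting into a clean list of keys."""
    raw = conf.get(RELOAD_LIST_KEY, "")
    return [key.strip() for key in raw.split(",") if key.strip()]


def recover_with_configured_keys(checkpointed, current):
    """Start from the checkpointed conf and override the user-listed keys."""
    recovered = dict(checkpointed)
    for key in parse_confs_to_reload(current):
        if key in current:  # keys absent from the new run keep the old value
            recovered[key] = current[key]
    return recovered


# Illustrative values only; spark.xx.xx is the placeholder from the comment.
checkpointed = {"spark.yarn.jars": "hdfs:///old/jars/*", "spark.xx.xx": "old"}
current = {
    RELOAD_LIST_KEY: "spark.yarn.jars, spark.xx.xx",
    "spark.yarn.jars": "hdfs:///new/jars/*",
}
recovered = recover_with_configured_keys(checkpointed, current)
```

With a setting like this, a new property that needs reloading would become a configuration change on the user's side rather than another one-line PR, which is the trade-off the comment is asking about.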


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    @felixcheung As you can see, there are a bunch of configurations that need to be added in https://github.com/apache-spark-on-k8s/spark/pull/516; that's why I'm asking for a general solution to this kind of issue.
    
    I'm OK with merging this PR. But I suspect similar PRs will still be created in the future: these issues are quite scenario-specific, and users with different scenarios can run into different issues of this kind. So I'm just wondering if we could have a better solution for this.


---



[GitHub] spark issue #19469: [SPARK-22243][DStreams]spark.yarn.jars reload from confi...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/19469
  
    @ChenjunZou did you get a chance to look at the comment I left?


---
