Posted to reviews@spark.apache.org by zsxwing <gi...@git.apache.org> on 2017/05/01 18:01:38 UTC

[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

GitHub user zsxwing opened a pull request:

    https://github.com/apache/spark/pull/17821

    [SPARK-20529][Core]Allow worker and master work with a proxy server

    ## What changes were proposed in this pull request?
    
    In the current code, when a worker connects to the master, the master sends its address to the worker. The worker saves this address and uses it to reconnect after a failure. However, this address is sometimes wrong: if there is a proxy between the master and the worker, the address the master sends is not the address of the proxy.
    
    In this PR, the worker sends the master address it used to the master, and the master simply replies with the same address. The worker then uses this address to reconnect after a failure. In other words, where possible the worker uses the master address configured on the worker side rather than the one configured on the master side.
    
    There is still one potential issue. When a master is restarted or takes over leadership, the worker will use the address sent from the master to connect. If there is a proxy between the master and the worker, that address may be wrong, and the worker has no way to detect this on its own.
    
    ## How was this patch tested?
    
    The newly added unit test.
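
    The handshake described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code in this PR: the case classes are simplified stand-ins for Spark's `RpcAddress`, `RegisterWorker`, and `RegisteredWorker`, and the `Master` object, the `masterSelfAddress` field, and the host names are hypothetical.

    ```scala
    // Hypothetical, simplified stand-ins for Spark's RPC and deploy message types.
    final case class RpcAddress(host: String, port: Int)

    // Sent by the worker; carries the address the worker itself used to reach the master.
    final case class RegisterWorker(workerId: String, masterAddress: RpcAddress)

    // Sent by the master; echoes the worker-supplied address back unchanged.
    final case class RegisteredWorker(masterSelfAddress: RpcAddress, masterAddress: RpcAddress)

    object Master {
      // The address the master believes it has (assumed unreachable from behind the proxy).
      val selfAddress: RpcAddress = RpcAddress("10.0.0.5", 7077)

      // Reply with the master's own address plus the address the worker reported using.
      def handle(msg: RegisterWorker): RegisteredWorker =
        RegisteredWorker(selfAddress, msg.masterAddress)
    }

    object HandshakeSketch {
      def main(args: Array[String]): Unit = {
        val proxy = RpcAddress("proxy.example.com", 7077)
        val reply = Master.handle(RegisterWorker("worker-1", proxy))
        // The worker keeps the echoed address for reconnects.
        println(s"reconnect to ${reply.masterAddress}, not ${reply.masterSelfAddress}")
      }
    }
    ```

    Because the master only echoes the value, the worker ends up reconnecting through whatever address it was configured with, which is the proxy in this sketch.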

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zsxwing/spark SPARK-20529

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17821.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17821
    
----
commit 8ded9b197cc7ef3cdb32858da385cf9f900deb7d
Author: Shixiong Zhu <sh...@databricks.com>
Date:   2017-04-28T23:04:48Z

    Fix SPARK-20529

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17821
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76392/
    Test FAILed.




[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17821#discussion_r114419550
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala ---
    @@ -80,8 +91,16 @@ private[deploy] object DeployMessages {
     
       sealed trait RegisterWorkerResponse
     
    -  case class RegisteredWorker(master: RpcEndpointRef, masterWebUiUrl: String) extends DeployMessage
    -    with RegisterWorkerResponse
    +  /**
    +   * @param master the master ref
    +   * @param masterWebUiUrl the master Web UI address
    +   * @param masterAddress the master address used by the worker to connect. It should be
    +   *                      [[RegisterWorker.masterAddress]].
    +   */
    +  case class RegisteredWorker(
    +      master: RpcEndpointRef,
    +      masterWebUiUrl: String,
    +      masterAddress: RpcAddress) extends DeployMessage with RegisterWorkerResponse
    --- End diff --
    
    I checked the current code. Unfortunately, we cannot remove this extra field: `master.address` and `masterAddress` are different.




[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17821
  
    **[Test build #76392 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76392/testReport)** for PR 17821 at commit [`f4699ad`](https://github.com/apache/spark/commit/f4699add54bb3fbe40d489aeabbf8f192e6ecb1f).




[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17821#discussion_r114205566
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala ---
    @@ -80,8 +91,16 @@ private[deploy] object DeployMessages {
     
       sealed trait RegisterWorkerResponse
     
    -  case class RegisteredWorker(master: RpcEndpointRef, masterWebUiUrl: String) extends DeployMessage
    -    with RegisterWorkerResponse
    +  /**
    +   * @param master the master ref
    +   * @param masterWebUiUrl the master Web UI address
    +   * @param masterAddress the master address used by the worker to connect. It should be
    +   *                      [[RegisterWorker.masterAddress]].
    +   */
    +  case class RegisteredWorker(
    +      master: RpcEndpointRef,
    +      masterWebUiUrl: String,
    +      masterAddress: RpcAddress) extends DeployMessage with RegisterWorkerResponse
    --- End diff --
    
    Can we avoid adding an extra field here? Perhaps just put the `masterAddress` in the `master` field.




[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17821
  
    **[Test build #3684 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3684/testReport)** for PR 17821 at commit [`f4699ad`](https://github.com/apache/spark/commit/f4699add54bb3fbe40d489aeabbf8f192e6ecb1f).




[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17821
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17821
  
    **[Test build #76351 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76351/testReport)** for PR 17821 at commit [`8ded9b1`](https://github.com/apache/spark/commit/8ded9b197cc7ef3cdb32858da385cf9f900deb7d).




[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17821
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17821#discussion_r114446352
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
    @@ -266,7 +289,8 @@ private[deploy] class Worker(
                 if (registerMasterFutures != null) {
                   registerMasterFutures.foreach(_.cancel(true))
                 }
    -            val masterAddress = masterRef.address
    +            val masterAddress =
    +              if (preferConfiguredMasterAddress) masterAddressToConnect.get else masterRef.address
    --- End diff --
    
    Right now `masterRef` and `masterAddressToConnect` are set at the same time, so `masterAddressToConnect` cannot be empty here unless a future change breaks that invariant. It is better to fail than to hide such a breaking change.
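
    To make the trade-off concrete, here is a small standalone sketch of the two behaviours being compared. It is not the `Worker` code: the object and method names are hypothetical, and plain strings stand in for `RpcAddress` values.

    ```scala
    import scala.util.Try

    object ReconnectAddressSketch {
      // Fail-fast variant: Option.get throws NoSuchElementException when the
      // configured address is missing, so a broken invariant surfaces immediately.
      def failFast(prefer: Boolean, configured: Option[String], fromMaster: String): String =
        if (prefer) configured.get else fromMaster

      // Fallback variant: silently uses the master-supplied address instead,
      // which hides the fact that the configured address was never set.
      def fallback(prefer: Boolean, configured: Option[String], fromMaster: String): String =
        configured match {
          case Some(address) if prefer => address
          case _ => fromMaster
        }

      def main(args: Array[String]): Unit = {
        println(Try(failFast(prefer = true, None, "master:7077")))
        println(fallback(prefer = true, None, "master:7077"))
      }
    }
    ```

    With a missing configured address, the first variant fails while the second quietly returns the master-supplied address, which is the masking effect described above.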




[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/17821




[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17821#discussion_r114444771
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala ---
    @@ -80,8 +91,16 @@ private[deploy] object DeployMessages {
     
       sealed trait RegisterWorkerResponse
     
    -  case class RegisteredWorker(master: RpcEndpointRef, masterWebUiUrl: String) extends DeployMessage
    -    with RegisterWorkerResponse
    +  /**
    +   * @param master the master ref
    +   * @param masterWebUiUrl the master Web UI address
    +   * @param masterAddress the master address used by the worker to connect. It should be
    +   *                      [[RegisterWorker.masterAddress]].
    +   */
    +  case class RegisteredWorker(
    +      master: RpcEndpointRef,
    +      masterWebUiUrl: String,
    +      masterAddress: RpcAddress) extends DeployMessage with RegisterWorkerResponse
    --- End diff --
    
    Alright, that sounds good.




[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17821
  
    **[Test build #76351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76351/testReport)** for PR 17821 at commit [`8ded9b1`](https://github.com/apache/spark/commit/8ded9b197cc7ef3cdb32858da385cf9f900deb7d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `  case class RegisteredWorker(`




[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/17821
  
    cc @sameeragarwal 




[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17821#discussion_r114419583
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
    @@ -266,7 +282,7 @@ private[deploy] class Worker(
                 if (registerMasterFutures != null) {
                   registerMasterFutures.foreach(_.cancel(true))
                 }
    -            val masterAddress = masterRef.address
    +            val masterAddress = masterAddressToConnect.get
    --- End diff --
    
    Added a new conf




[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17821
  
    **[Test build #76392 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76392/testReport)** for PR 17821 at commit [`f4699ad`](https://github.com/apache/spark/commit/f4699add54bb3fbe40d489aeabbf8f192e6ecb1f).
     * This patch **fails from timeout after a configured wait of \`250m\`**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17821
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76351/
    Test PASSed.




[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17821#discussion_r114445132
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
    @@ -266,7 +289,8 @@ private[deploy] class Worker(
                 if (registerMasterFutures != null) {
                   registerMasterFutures.foreach(_.cancel(true))
                 }
    -            val masterAddress = masterRef.address
    +            val masterAddress =
    +              if (preferConfiguredMasterAddress) masterAddressToConnect.get else masterRef.address
    --- End diff --
    
    Perhaps it isn't an issue but do you think we should fall back to `masterRef.address` in case `masterAddressToConnect` isn't set (instead of throwing a generic scala exception)? Something along the lines of:
    
    ```scala
    val masterAddress = masterAddressToConnect match {
      case Some(master) if preferConfiguredMasterAddress => master
      case _ => masterRef.address
    }
    ```




[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17821
  
    **[Test build #3684 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3684/testReport)** for PR 17821 at commit [`f4699ad`](https://github.com/apache/spark/commit/f4699add54bb3fbe40d489aeabbf8f192e6ecb1f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/17821
  
    Thanks! Merging to master and 2.2.




[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17821#discussion_r114206001
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
    @@ -266,7 +282,7 @@ private[deploy] class Worker(
                 if (registerMasterFutures != null) {
                   registerMasterFutures.foreach(_.cancel(true))
                 }
    -            val masterAddress = masterRef.address
    +            val masterAddress = masterAddressToConnect.get
    --- End diff --
    
    How about we protect this change with a conf (with a default that still uses `masterRef`)? If we can merge `master` and `masterAddress` as I suggested above, we can just add a conf on the master, and the worker code can stay largely unaffected.
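
    A conf-guarded choice with a backward-compatible default might look roughly like the sketch below. This is only an illustration under stated assumptions: the key name is hypothetical, a plain `Map` stands in for Spark's configuration object, and strings stand in for `RpcAddress` values.

    ```scala
    object ConfGuardSketch {
      // Hypothetical key name, used only for illustration.
      val PreferConfiguredKey = "example.worker.preferConfiguredMasterAddress"

      // Defaults to false when the key is absent, preserving the old behaviour.
      def preferConfigured(conf: Map[String, String]): Boolean =
        conf.get(PreferConfiguredKey).exists(_.toBoolean)

      // Picks the worker-side configured address only when the flag is enabled.
      def reconnectAddress(
          conf: Map[String, String],
          configured: String,
          fromMasterRef: String): String =
        if (preferConfigured(conf)) configured else fromMasterRef
    }
    ```

    With an empty configuration the master-supplied address is used, so existing deployments would see no change unless they opt in.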

