You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by zsxwing <gi...@git.apache.org> on 2015/12/11 03:05:48 UTC

[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

GitHub user zsxwing opened a pull request:

    https://github.com/apache/spark/pull/10261

    [SPARK-12267][Core]Store the remote RpcEnv address to send the correct disconnetion message

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zsxwing/spark SPARK-12267

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10261.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10261
    
----
commit f84e3254c56a8617765d68086cee1220f4a22dfa
Author: Shixiong Zhu <sh...@databricks.com>
Date:   2015-12-11T02:05:17Z

    Store the remote RpcEnv address to send the correct disconnetion message

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-163855821
  
    **[Test build #47561 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47561/consoleFull)** for PR 10261 at commit [`f84e325`](https://github.com/apache/spark/commit/f84e3254c56a8617765d68086cee1220f4a22dfa).
     * This patch **fails from timeout after a configured wait of \`250m\`**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-163855911
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47561/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-164058531
  
    **[Test build #47588 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47588/consoleFull)** for PR 10261 at commit [`3b37070`](https://github.com/apache/spark/commit/3b370702fc0be2de7dcbee1fa8c1c3ac3cc60cb5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-164231271
  
    @vanzin I created [SPARK-12308](https://issues.apache.org/jira/browse/SPARK-12308) to remind us of your approach


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-163819194
  
    hey guys, I know this is still in flight, but we are pretty behind on the release.  I'm going to cut RC2 and we can always make a judgement call about doing an RC3 once this is fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-164036553
  
    **[Test build #47588 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47588/consoleFull)** for PR 10261 at commit [`3b37070`](https://github.com/apache/spark/commit/3b370702fc0be2de7dcbee1fa8c1c3ac3cc60cb5).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-164059097
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-164177137
  
    BTW I'm fine with this change for 1.6; we can revisit this later to avoid having to keep extra state in the RPC layer.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-164107825
  
    > I saw your following comment...But I don't get it.
    
    That's because we took opposite approaches. Your PR's approach is "if this connection says the sender address is a listening socket, then also consider that address when sending events about the remote process".
    
    My PR takes the opposite approach: the address of the remote process is always the address of the socket used to connect, regardless of whether its also listening in another socket.
    
    I think my approach is in the end more correct, but requires more code to fix existing code. In my view, `RpcCallContext.senderAddress` should be the address of the socket that sent the message, not the address the remote process is listening on.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-163810210
  
    CC @vanzin  could you take a look? Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/10261


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-163855910
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-164037328
  
    @vanzin 
    
    I saw your following comment in https://github.com/vanzin/spark/commit/3848bf5e5bee4aa132aa001baab41dc58d39e5c5#diff-acd05d6d379b6ef6ccf36bd3db5614f6R69
    ```
    Because the messages used to register workers and applications were one-way
    messages, though, "receive" was used, and that information was not available.
    So now those messages are send using "ask", which looks a bit awkward but is
    simpler than changing the RpcEnv API so that the client address is available
    in the "receive" method.
    ```
    But I don't get it.
    
    The communication in master-worker, worker-driver and  master-driver are all in non-client mode. So for one-way messages, `RequestMessage.senderAddress` is the RpcEnv listening address. If we maintain the address relation like this PR, RpcEndpoint doesn't need to worry about the address issue even for `one-way` messages. Right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-163816221
  
    **[Test build #47561 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47561/consoleFull)** for PR 10261 at commit [`f84e325`](https://github.com/apache/spark/commit/f84e3254c56a8617765d68086cee1220f4a22dfa).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-164177066
  
    Akka doesn't have this problem as far as I understand the code; also because the akka rpc backend never runs in "client mode" (that's a netty rpc-only thing).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-164110300
  
    I see. However, for 1.6, we still need to support Akka RPC. Is it possible to get the client address from `ActorRef`? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-163816892
  
    So, I don't think this is gonna work, and that's why my change is taking longer than I hoped to finish... while this might fix the immediate problem listed in the bug, there are other problems related to this that need to be fixed.
    
    Basically, standalone master HA is currently broken, and because there are no unit tests at all for that part of the code, nobody caught it...
    
    I'm trying to fix it, I think I almost got it, have one last exception to get rid of.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10261#discussion_r47314731
  
    --- Diff: core/src/main/scala/org/apache/spark/rpc/netty/NettyRpcEnv.scala ---
    @@ -580,6 +583,12 @@ private[netty] class NettyRpcHandler(
           // Create a new message with the socket address of the client as the sender.
           RequestMessage(clientAddr, requestMessage.receiver, requestMessage.content)
         } else {
    +      // The remote RpcEnv listens to some port, we should also fire a RemoteProcessConnected for
    +      // the listening address
    +      val remoteEnvAddress = requestMessage.senderAddress
    +      if (remoteAddresses.putIfAbsent(clientAddr, remoteEnvAddress) == null) {
    +        dispatcher.postToAll(RemoteProcessConnected(remoteEnvAddress))
    --- End diff --
    
    We don't need to concern about multiple network messages with different addresses since the codes already handle the bad addresses.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-164059104
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47588/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-164229994
  
    > Akka doesn't have this problem as far as I understand the code; also because the akka rpc backend never runs in "client mode" (that's a netty rpc-only thing).
    
    I didn't mean that. I just want to mention that if we want to change the semantics of `senderAddress`, we need to update the Akka RPC as well.
    
    > BTW I'm fine with this change for 1.6; we can revisit this later to avoid having to keep extra state in the RPC layer.
    
    Great. I'm going to merge this one to master and 1.6. Let's revisit this later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12267][Core]Store the remote RpcEnv add...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the pull request:

    https://github.com/apache/spark/pull/10261#issuecomment-163817812
  
    Cool. I will close this PR.
    
    On Thu, Dec 10, 2015 at 6:28 PM Marcelo Vanzin <no...@github.com>
    wrote:
    
    > So, I don't think this is gonna work, and that's why my change is taking
    > longer than I hoped to finish... while this might fix the immediate problem
    > listed in the bug, there are other problems related to this that need to be
    > fixed.
    >
    > Basically, standalone master HA is currently broken, and because there are
    > no unit tests at all for that part of the code, nobody caught it...
    >
    > I'm trying to fix it, I think I almost got it, have one last exception to
    > get rid of.
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/spark/pull/10261#issuecomment-163816892>.
    >



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org