Posted to issues@spark.apache.org by "hustfxj (JIRA)" <ji...@apache.org> on 2017/03/06 05:16:32 UTC

[jira] [Created] (SPARK-19831) Sending the heartbeat to master maybe blocked by other rpc messages

hustfxj created SPARK-19831:
-------------------------------

             Summary: Sending the heartbeat to master maybe blocked by other rpc messages
                 Key: SPARK-19831
                 URL: https://issues.apache.org/jira/browse/SPARK-19831
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.2.0
            Reporter: hustfxj


Cleaning up an application on the worker can take a long time. Because the worker extends *ThreadSafeRpcEndpoint*, which processes messages one at a time, a slow cleanup blocks the worker from handling other RPC messages, including the one that sends heartbeats to the master. The master then concludes that the worker is dead. If the worker is running a driver, the master will schedule that driver again. I think this is a bug in Spark. I suggest the following fixes:

1. Run the application cleanup on a dedicated asynchronous thread, such as 'cleanupThreadExecutor', so that it does not block other RPC messages like SendHeartbeat.

2. Do not send the heartbeat to the master through the RPC channel, because any other RPC message may block that channel. Instead, send the heartbeat from a separate asynchronous timer thread.
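
As a rough illustration of the blocking problem and of both suggestions, here is a small Python model. It is not Spark code: `run_worker`, `heartbeats_during_busy_inbox`, the message names and the 0.5 s cleanup duration are all hypothetical, and the single inbox thread only approximates the one-message-at-a-time behaviour of a thread-safe endpoint. A real fix in the worker might look quite different.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def run_worker(messages, offload_cleanup):
    """Handle messages one at a time on a single inbox thread; return the event log."""
    inbox = queue.Queue()
    events = []  # appended to from several threads; list.append is thread-safe in CPython
    heartbeat_sent = threading.Event()
    # Hypothetical stand-in for the 'cleanupThreadExecutor' named in suggestion 1.
    cleanup_pool = ThreadPoolExecutor(max_workers=1)

    def cleanup():
        events.append("cleanup started")
        # Stand-in for slow directory deletion: keeps going until a heartbeat
        # has gone out, or gives up after 0.5 s if none can.
        heartbeat_sent.wait(timeout=0.5)
        events.append("cleanup finished")

    def handle(msg):
        if msg == "cleanup":
            if offload_cleanup:
                cleanup_pool.submit(cleanup)  # returns immediately
            else:
                cleanup()  # runs inline and stalls the inbox
        elif msg == "heartbeat":
            events.append("heartbeat sent")
            heartbeat_sent.set()

    def inbox_loop():
        while True:
            msg = inbox.get()
            if msg is None:  # sentinel: stop the loop
                return
            handle(msg)

    thread = threading.Thread(target=inbox_loop)
    thread.start()
    for msg in messages:
        inbox.put(msg)
    inbox.put(None)
    thread.join()
    cleanup_pool.shutdown(wait=True)  # let any offloaded cleanup finish
    return events


def heartbeats_during_busy_inbox(wanted=3, interval=0.01):
    """Suggestion 2: a dedicated timer thread keeps sending while the inbox is busy."""
    count = 0
    stop = threading.Event()
    enough = threading.Event()

    def beat():
        nonlocal count
        while not stop.wait(interval):
            count += 1  # stands in for sending one heartbeat to the master
            if count >= wanted:
                enough.set()

    timer = threading.Thread(target=beat)
    timer.start()
    enough.wait(timeout=5)  # the "inbox" (this thread) is busy the whole time
    stop.set()
    timer.join()
    return count


blocked = run_worker(["cleanup", "heartbeat"], offload_cleanup=False)
offloaded = run_worker(["cleanup", "heartbeat"], offload_cleanup=True)
print(blocked)
print(offloaded)
print(heartbeats_during_busy_inbox())
```

With `offload_cleanup=False` the heartbeat is only handled after the cleanup has finished, which is the situation described above. With `offload_cleanup=True` the heartbeat goes out while the cleanup is still running. The second function shows that a timer thread of its own does not depend on the inbox at all.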



