You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:16:27 UTC
[jira] [Resolved] (SPARK-19001) Worker will submit multiply
CleanWorkDir and SendHeartbeat task with each RegisterWorker response
[ https://issues.apache.org/jira/browse/SPARK-19001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-19001.
----------------------------------
Resolution: Incomplete
> Worker will submit multiply CleanWorkDir and SendHeartbeat task with each RegisterWorker response
> -------------------------------------------------------------------------------------------------
>
> Key: SPARK-19001
> URL: https://issues.apache.org/jira/browse/SPARK-19001
> Project: Spark
> Issue Type: Bug
> Components: Deploy
> Affects Versions: 1.6.1
> Reporter: liujianhui
> Priority: Minor
> Labels: bulk-closed
>
> h2. scene
> worker will submit multiply task of SendHeartbeat and CleanWorkDir if the worker register itself again, this code as follow
> {code}
> private def handleRegisterResponse(msg: RegisterWorkerResponse): Unit = synchronized {
> msg match {
> case RegisteredWorker(masterRef, masterWebUiUrl) =>
> logInfo("Successfully registered with master " + masterRef.address.toSparkURL)
> registered = true
> changeMaster(masterRef, masterWebUiUrl)
> forwordMessageScheduler.scheduleAtFixedRate(new Runnable {
> override def run(): Unit = Utils.tryLogNonFatalError {
> self.send(SendHeartbeat)
> }
> }, 0, HEARTBEAT_MILLIS, TimeUnit.MILLISECONDS)
> if (CLEANUP_ENABLED) {
> logInfo(
> s"Worker cleanup enabled; old application directories will be deleted in: $workDir")
> forwordMessageScheduler.scheduleAtFixedRate(new Runnable {
> override def run(): Unit = Utils.tryLogNonFatalError {
> self.send(WorkDirCleanup)
> }
> }, CLEANUP_INTERVAL_MILLIS, CLEANUP_INTERVAL_MILLIS, TimeUnit.MILLISECONDS)
> }
> {code}
> log as follow
> {code}
> 2016-12-20 20:23:30,030 | Successfully registered with master spark://polaris-1:8090|org.apache.spark.Logging$class.logInfo(Logging.scala:58)
> 2016-12-26 20:41:58,058 | Successfully registered with master spark://polaris-1:8090|org.apache.spark.Logging$class.logInfo(Logging.scala:58)
> {code}
> if the worker Send RegisterWorker event multiply many times when the master found the heartbeat of the worker expired, then it will submit task multiply times
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org