Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2016/04/29 08:31:12 UTC

[jira] [Assigned] (SPARK-14958) Failed task hangs if error is encountered when getting task result

     [ https://issues.apache.org/jira/browse/SPARK-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-14958:
------------------------------------

    Assignee:     (was: Apache Spark)

> Failed task hangs if error is encountered when getting task result
> ------------------------------------------------------------------
>
>                 Key: SPARK-14958
>                 URL: https://issues.apache.org/jira/browse/SPARK-14958
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Rui Li
>
> In {{TaskResultGetter}}, if an error is thrown while deserializing the {{TaskEndReason}}, {{TaskScheduler}} never gets a chance to handle the failed task, and the task just hangs.
> {code}
>   def enqueueFailedTask(taskSetManager: TaskSetManager, tid: Long, taskState: TaskState,
>     serializedData: ByteBuffer) {
>     var reason : TaskEndReason = UnknownReason
>     try {
>       getTaskResultExecutor.execute(new Runnable {
>         override def run(): Unit = Utils.logUncaughtExceptions {
>           val loader = Utils.getContextOrSparkClassLoader
>           try {
>             if (serializedData != null && serializedData.limit() > 0) {
>               reason = serializer.get().deserialize[TaskEndReason](
>                 serializedData, loader)
>             }
>           } catch {
>             case cnd: ClassNotFoundException =>
>               // Log an error but keep going here -- the task failed, so not catastrophic
>               // if we can't deserialize the reason.
>               logError(
>                 "Could not deserialize TaskEndReason: ClassNotFound with classloader " + loader)
>             case ex: Exception => {}
>           }
>           scheduler.handleFailedTask(taskSetManager, tid, taskState, reason)
>         }
>       })
>     } catch {
>       case e: RejectedExecutionException if sparkEnv.isStopped =>
>         // ignore it
>     }
>   }
> {code}
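> In my specific case, I got a {{NoClassDefFoundError}}. It is an {{Error}}, not an {{Exception}}, so neither catch clause handles it, {{scheduler.handleFailedTask}} is never called, and the failed task hangs forever.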
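
The failure mode described above can be reproduced outside Spark. The following is a minimal, self-contained sketch, not Spark code: RecordingScheduler, deserializeReason, handleAfterTry and handleInFinally are hypothetical stand-ins introduced only for illustration. It assumes the deserializer throws a NoClassDefFoundError, and it contrasts calling the scheduler after the try/catch (as in the quoted method) with calling it from a finally block, which is one possible way to make sure the scheduler is always notified.

```scala
// Self-contained sketch; every name here is hypothetical and only stands in for Spark internals.
object FailedTaskSketch {
  // Stand-in for scheduler.handleFailedTask: records the reason it was given.
  final class RecordingScheduler {
    var handled: Option[String] = None
    def handleFailedTask(reason: String): Unit = {
      handled = Some(reason)
    }
  }

  // Stand-in for a deserializer that fails with an Error rather than an Exception.
  def deserializeReason(): String =
    throw new NoClassDefFoundError("com/example/MissingClass")

  // Mirrors the structure of the quoted method: the scheduler call follows the try/catch.
  def handleAfterTry(scheduler: RecordingScheduler): Unit = {
    var reason = "UnknownReason"
    try {
      reason = deserializeReason()
    } catch {
      case _: Exception => () // NoClassDefFoundError is not an Exception, so it is not caught
    }
    scheduler.handleFailedTask(reason) // never reached when an Error propagates
  }

  // Variant: the scheduler call sits in a finally block, so it runs even if an Error propagates.
  def handleInFinally(scheduler: RecordingScheduler): Unit = {
    var reason = "UnknownReason"
    try {
      reason = deserializeReason()
    } catch {
      case _: Exception => () // same handling as above
    } finally {
      scheduler.handleFailedTask(reason)
    }
  }

  def main(args: Array[String]): Unit = {
    val before = new RecordingScheduler
    try handleAfterTry(before) catch { case _: NoClassDefFoundError => () }
    println(s"call after try/catch: handled = ${before.handled}")

    val after = new RecordingScheduler
    try handleInFinally(after) catch { case _: NoClassDefFoundError => () }
    println(s"call in finally:      handled = ${after.handled}")
  }
}
```

In the first variant the scheduler is never told about the failure, which corresponds to the hang reported here. In the second it still receives the default reason. The Error is rethrown in both cases, so the caller can still log it.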


