Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2022/08/16 14:17:00 UTC

[jira] [Commented] (SPARK-40106) Task failure handlers should always run if the task failed

    [ https://issues.apache.org/jira/browse/SPARK-40106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580339#comment-17580339 ] 

Apache Spark commented on SPARK-40106:
--------------------------------------

User 'ryan-johnson-databricks' has created a pull request for this issue:
https://github.com/apache/spark/pull/37531

> Task failure handlers should always run if the task failed
> ----------------------------------------------------------
>
>                 Key: SPARK-40106
>                 URL: https://issues.apache.org/jira/browse/SPARK-40106
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.3.0
>            Reporter: Ryan Johnson
>            Priority: Major
>
> Today, if a task body succeeds but a task completion listener fails, task failure listeners are not called -- even though the task has indeed failed at that point.
> If a completion listener fails, and failure listeners were not previously invoked, we should invoke them before running the remaining completion listeners.
> Such a change would increase the utility of task listeners, especially ones intended to assist with task cleanup. 
> To give one arbitrary example, code like the following appears in several places (taken from the {{executeTask}} method of FileFormatWriter.scala):
> {code:java}
>     try {
>       Utils.tryWithSafeFinallyAndFailureCallbacks(block = {
>         // Execute the task to write rows out and commit the task.
>         dataWriter.writeWithIterator(iterator)
>         dataWriter.commit()
>       })(catchBlock = {
>         // If there is an error, abort the task
>         dataWriter.abort()
>         logError(s"Job $jobId aborted.")
>       }, finallyBlock = {
>         dataWriter.close()
>       })
>     } catch {
>       case e: FetchFailedException =>
>         throw e
>       case f: FileAlreadyExistsException if SQLConf.get.fastFailFileFormatOutput =>
>         // If any output file to write already exists, it does not make sense to re-run this task.
>         // We throw the exception and let Executor throw ExceptionFailure to abort the job.
>         throw new TaskOutputFileAlreadyExistException(f)
>       case t: Throwable =>
>         throw QueryExecutionErrors.taskFailedWhileWritingRowsError(t)
>     }{code}
> If failure listeners were reliably called, the above idiom could potentially be factored out as two failure listeners plus a completion listener, and reused rather than duplicated.
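
To illustrate the proposed semantics, here is a minimal, self-contained sketch. Note that `ToyTaskContext` is a hypothetical stand-in, not Spark's actual `TaskContextImpl`; it only models the behavior the issue asks for: a completion listener that throws marks the task as failed, so failure listeners run (once) before the remaining completion listeners.

```scala
import scala.collection.mutable.ArrayBuffer

// Toy model of a task context -- NOT Spark's actual TaskContextImpl --
// used only to illustrate the proposed listener semantics.
class ToyTaskContext {
  private val completionListeners = ArrayBuffer.empty[() => Unit]
  private val failureListeners = ArrayBuffer.empty[Throwable => Unit]
  private var failureListenersInvoked = false

  def addCompletionListener(listener: () => Unit): Unit =
    completionListeners += listener

  def addFailureListener(listener: Throwable => Unit): Unit =
    failureListeners += listener

  // Failure listeners run at most once, no matter how many errors occur.
  def markTaskFailed(error: Throwable): Unit = {
    if (!failureListenersInvoked) {
      failureListenersInvoked = true
      failureListeners.foreach(_(error))
    }
  }

  // Completion listeners run in reverse registration order, as in Spark.
  // Unlike today's behavior, a failing completion listener triggers
  // markTaskFailed before the remaining completion listeners run.
  // (Real Spark would also collect and rethrow the listener errors
  // afterwards; that is omitted here for brevity.)
  def markTaskCompleted(): Unit = {
    completionListeners.reverse.foreach { listener =>
      try listener()
      catch { case t: Throwable => markTaskFailed(t) }
    }
  }
}
```

Under this model, the try/catch idiom above could be expressed as a failure listener that calls `dataWriter.abort()` plus a completion listener that calls `dataWriter.close()`, registered once instead of duplicated at each call site.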



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org