Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/02/09 16:21:42 UTC

[jira] [Commented] (FLINK-5759) Set an UncaughtExceptionHandler for all Thread Pools in JobManager

    [ https://issues.apache.org/jira/browse/FLINK-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15859733#comment-15859733 ] 

ASF GitHub Bot commented on FLINK-5759:
---------------------------------------

GitHub user StephanEwen opened a pull request:

    https://github.com/apache/flink/pull/3290

    [FLINK-5759] [jobmanager] Set UncaughtExceptionHandlers for JobManager's Future and I/O thread pools

    Currently, the thread pools of the `JobManager` do not have any `UncaughtExceptionHandler`.
    
    Uncaught exceptions are rare, since Flink handles exceptions aggressively in most places. But when one does slip through in these threads (which execute future responses and delayed actions), the `JobManager` may be left in an inconsistent state and no longer function properly.
    
    This pull request adds a handler that kills the process when an uncaught exception occurs. Letting the respective cluster framework restart the `JobManager` is the only guaranteed way to be safe.
    
    This also unifies the `ExecutorThreadFactory` and `NamedThreadFactory`.
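
To illustrate the approach described above, here is a minimal sketch of a named thread factory that installs a process-killing `UncaughtExceptionHandler` on every thread it creates. It is not the code from this pull request: the class names `FatalExitHandler` and `HandlerThreadFactory`, the exit code, the injectable exit action, and the choice of `Runtime.halt` over `System.exit` are all assumptions made for illustration.

```java
import java.lang.Thread.UncaughtExceptionHandler;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

/** Hypothetical handler: reports the error, then runs an exit action. */
final class FatalExitHandler implements UncaughtExceptionHandler {

    /** Arbitrary exit code chosen for this sketch (an assumption). */
    static final int EXIT_CODE = -17;

    private final IntConsumer exitAction;

    /** Default: halt the JVM immediately, without running shutdown hooks. */
    FatalExitHandler() {
        this(code -> Runtime.getRuntime().halt(code));
    }

    /** The exit action is injectable so the handler can be tested without killing the JVM. */
    FatalExitHandler(IntConsumer exitAction) {
        this.exitAction = exitAction;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        try {
            System.err.println("FATAL: thread '" + t.getName() + "' terminated with " + e);
        } finally {
            // Run the exit action even if reporting the error itself fails.
            exitAction.accept(EXIT_CODE);
        }
    }
}

/** Hypothetical factory: names its threads and attaches the given handler. */
final class HandlerThreadFactory implements ThreadFactory {

    private final String prefix;
    private final UncaughtExceptionHandler handler;
    private final AtomicInteger counter = new AtomicInteger(1);

    HandlerThreadFactory(String prefix, UncaughtExceptionHandler handler) {
        this.prefix = prefix;
        this.handler = handler;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + "-thread-" + counter.getAndIncrement());
        t.setDaemon(true);
        t.setUncaughtExceptionHandler(handler);
        return t;
    }
}
```

A pool could then be created with, for example, `Executors.newFixedThreadPool(4, new HandlerThreadFactory("jobmanager-io", new FatalExitHandler()))`. Note that the handler only sees exceptions from tasks started with `execute(...)`. Tasks started with `submit(...)` have their exceptions captured in the returned `Future` instead.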

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink uncaught_handlers

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3290.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3290
    
----
commit 3602631353dbdf230044db7fba1890600e648101
Author: Stephan Ewen <se...@apache.org>
Date:   2017-02-09T13:04:17Z

    [FLINK-5759] [jobmanager] Set UncaughtExceptionHandlers for JobManager's Future and I/O thread pools

----


> Set an UncaughtExceptionHandler for all Thread Pools in JobManager
> ------------------------------------------------------------------
>
>                 Key: FLINK-5759
>                 URL: https://issues.apache.org/jira/browse/FLINK-5759
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 1.2.0
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 1.3.0
>
>
> Currently, the thread pools of the {{JobManager}} do not have any {{UncaughtExceptionHandler}}.
> Uncaught exceptions are rare, since Flink handles exceptions aggressively in most places. But when one does slip through in these threads (which execute future responses and delayed actions), the {{JobManager}} may be left in an inconsistent state and no longer function properly.
> We should add a handler that kills the process when an uncaught exception occurs. Letting the respective cluster framework restart the {{JobManager}} is the only guaranteed way to be safe.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)