Posted to issues@flink.apache.org by "Hwanju Kim (Jira)" <ji...@apache.org> on 2020/10/11 15:37:00 UTC

[jira] [Commented] (FLINK-15156) Warn user if System.exit() is called in user code

    [ https://issues.apache.org/jira/browse/FLINK-15156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211949#comment-17211949 ] 

Hwanju Kim commented on FLINK-15156:
------------------------------------

FWIW, the following is what we have done:
 * A Flink user security manager is added for general sandbox checks on user code; currently only JVM exit is checked (other checks can be added here later).
 * It forwards every check except the ones it overrides to the previously installed security manager, if any (like a decorator).
 * The security manager is set when the JM and TM start (if configured, as described in the last bullet point).
 * The exit check can be enabled and disabled through a method so that it affects only user code, since the Flink runtime itself needs to exit in some cases (e.g., on a fatal error).
 ** Once enabled, any thread spawned from the main thread inherits the enable flag.
 * Coverage of the enabled exit check is currently best-effort and does not include every place where user code is involved. The main places are:
 ** main() in JM (currently for invokeInteractiveModeForExecution)
 ** StreamTask.invoke, triggerCheckpoint, cancel.
 * A new exception, UserSystemExitException, is defined and thrown when user code attempts to exit the JVM. It carries a default message that warns the user.
 ** In main(), it's wrapped into ProgramInvocationException.
 ** In a UDF, it fails the task that attempted to exit, which ships the exception to the JM and triggers fail-over.
 * This security manager is installed only if the corresponding configuration (under the security section) in flink-conf.yaml is enabled (it is disabled by default). The configuration is per check (currently only disallow-system-exit is available).
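
To make the design above easier to discuss, here is a minimal sketch of the idea, not the actual patch. The class name UserExitGuard, the enableExitCheck/disableExitCheck methods, and the exception message are hypothetical; only UserSystemExitException is a name from the description above, and its shape here is assumed. Forwarding is simplified to checkExit and checkPermission, which the default SecurityManager check methods call into. The demo invokes the check directly rather than installing the manager, since recent JDKs deprecate and restrict System.setSecurityManager.

```java
import java.security.Permission;

/** Hypothetical sketch: blocks JVM exit only while the exit check is enabled. */
@SuppressWarnings("removal") // SecurityManager is deprecated for removal since Java 17
class UserExitGuard extends SecurityManager {

    /** Assumed shape of the exception; the real one may differ. */
    static class UserSystemExitException extends SecurityException {
        UserSystemExitException() {
            super("User code attempted to exit the JVM; throw an exception to fail the job instead.");
        }
    }

    // Inheritable flag: a thread created while the flag is set starts with it set.
    private static final InheritableThreadLocal<Boolean> EXIT_CHECK_ENABLED =
            new InheritableThreadLocal<Boolean>() {
                @Override
                protected Boolean initialValue() {
                    return Boolean.FALSE;
                }
            };

    // Previously installed security manager, or null if there was none.
    private final SecurityManager previous;

    UserExitGuard(SecurityManager previous) {
        this.previous = previous;
    }

    /** Call before entering user code. */
    static void enableExitCheck() {
        EXIT_CHECK_ENABLED.set(Boolean.TRUE);
    }

    /** Call after leaving user code, so the runtime can still exit on fatal errors. */
    static void disableExitCheck() {
        EXIT_CHECK_ENABLED.set(Boolean.FALSE);
    }

    @Override
    public void checkExit(int status) {
        if (EXIT_CHECK_ENABLED.get()) {
            throw new UserSystemExitException();
        }
        if (previous != null) {
            previous.checkExit(status);
        }
    }

    // Everything that is not overridden ends up here and is forwarded (decorator).
    @Override
    public void checkPermission(Permission perm) {
        if (previous != null) {
            previous.checkPermission(perm);
        }
    }

    @Override
    public void checkPermission(Permission perm, Object context) {
        if (previous != null) {
            previous.checkPermission(perm, context);
        }
    }

    public static void main(String[] args) {
        UserExitGuard guard = new UserExitGuard(null);
        guard.checkExit(0); // check disabled: runtime code may exit
        enableExitCheck();
        try {
            guard.checkExit(0); // simulates System.exit(0) from user code
        } catch (UserSystemExitException e) {
            System.out.println("blocked: " + e.getMessage());
        } finally {
            disableExitCheck();
        }
    }
}
```

In a real setup the guard would wrap System.getSecurityManager() and be installed once at JM/TM startup, with the enable/disable calls placed around the user-code entry points listed above.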

Please let me know if anyone wants to review the patch, or would just like to discuss anything that does not make sense.

> Warn user if System.exit() is called in user code
> -------------------------------------------------
>
>                 Key: FLINK-15156
>                 URL: https://issues.apache.org/jira/browse/FLINK-15156
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>            Reporter: Robert Metzger
>            Priority: Minor
>              Labels: starter
>
> It would make debugging Flink errors easier if we would intercept and log calls to System.exit() through the SecurityManager.
> A user recently had an error where the JobManager was shutting down because of a System.exit() in the user code: https://lists.apache.org/thread.html/b28dabcf3068d489f38399c456c80d48569fcdf74b15f8bb95d532d0%40%3Cuser.flink.apache.org%3E
> If I remember correctly, we had such issues before.
> I put this ticket into the "Runtime / Coordination" component, as it is mostly about improving the usability / debuggability in that area.


