Posted to issues@flink.apache.org by "Hwanju Kim (Jira)" <ji...@apache.org> on 2020/10/11 15:37:00 UTC
[jira] [Commented] (FLINK-15156) Warn user if System.exit() is called in user code
[ https://issues.apache.org/jira/browse/FLINK-15156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211949#comment-17211949 ]
Hwanju Kim commented on FLINK-15156:
------------------------------------
FWIW, the following is what we have done:
* A Flink user security manager is added for general user-code sandbox checks; currently it checks only JVM exit (other checks can be added here later).
* The new security manager forwards every check except the ones it overrides to the previous security manager, if any (like a decorator).
* The security manager is installed when the JM and TM start (if configured, as described in the last bullet point).
* The exit check can be enabled and disabled through a method so that it affects only user code, since the Flink runtime itself needs to exit in some cases (e.g., on a fatal error).
** Once the check is enabled, any thread spawned from the main thread inherits the enabled flag.
* Coverage of the enabled exit check is currently best-effort and does not include every place where user code runs. The main places covered are:
** main() in JM (currently for invokeInteractiveModeForExecution)
** StreamTask.invoke, triggerCheckpoint, cancel.
* A new exception, UserSystemExitException, is thrown when user code attempts to exit the JVM. It carries a default message that warns the user.
** In main(), it's wrapped into ProgramInvocationException.
** In a UDF, it fails the task that attempted to exit, so the exception is shipped to the JM and triggers failover.
* This security manager is added only if the corresponding configuration (under the security section of flink-conf.yaml) is enabled; it is disabled by default. There is one option per check (currently only disallow-system-exit is available).
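To make the design above easier to discuss, here is a minimal sketch of the idea, not the actual patch. UserSystemExitException is the name used above; ExitGuardSecurityManager, ExitGuardDemo, enableUserExitCheck, disableUserExitCheck and runUserCode are hypothetical names, and having the exception extend SecurityException is an assumption. The sketch forwards only checkPermission to the previous manager for brevity, and the demo calls checkExit directly rather than installing the manager, because newer JDKs restrict System.setSecurityManager.

```java
import java.security.Permission;

/** Thrown instead of letting user code terminate the JVM (superclass is an assumption). */
class UserSystemExitException extends SecurityException {
    UserSystemExitException() {
        super("User code attempted to exit the JVM; this is disallowed by configuration.");
    }
}

/** Hypothetical decorator: handles the exit check itself, forwards the rest. */
@SuppressWarnings({"deprecation", "removal"})
class ExitGuardSecurityManager extends SecurityManager {
    // Inheritable, so a thread created while the check is enabled starts with it enabled.
    private static final InheritableThreadLocal<Boolean> EXIT_CHECK_ENABLED =
            new InheritableThreadLocal<Boolean>() {
                @Override
                protected Boolean initialValue() {
                    return Boolean.FALSE;
                }
            };

    private final SecurityManager previous; // may be null if none was installed

    ExitGuardSecurityManager(SecurityManager previous) {
        this.previous = previous;
    }

    static void enableUserExitCheck() {
        EXIT_CHECK_ENABLED.set(Boolean.TRUE);
    }

    static void disableUserExitCheck() {
        EXIT_CHECK_ENABLED.set(Boolean.FALSE);
    }

    @Override
    public void checkExit(int status) {
        if (EXIT_CHECK_ENABLED.get()) {
            throw new UserSystemExitException();
        }
        if (previous != null) {
            previous.checkExit(status);
        }
    }

    @Override
    public void checkPermission(Permission perm) {
        // Not overridden for our own purposes: just forward to the previous manager.
        if (previous != null) {
            previous.checkPermission(perm);
        }
    }

    @Override
    public void checkPermission(Permission perm, Object context) {
        if (previous != null) {
            previous.checkPermission(perm, context);
        }
    }
}

class ExitGuardDemo {
    /** Runs user code with the exit check enabled, then disables it again for runtime code. */
    static void runUserCode(Runnable userCode) {
        ExitGuardSecurityManager.enableUserExitCheck();
        try {
            userCode.run();
        } finally {
            ExitGuardSecurityManager.disableUserExitCheck();
        }
    }

    public static void main(String[] args) {
        ExitGuardSecurityManager manager = new ExitGuardSecurityManager(null);
        try {
            // Stands in for System.exit(1), which would call checkExit(1) on an installed manager.
            runUserCode(() -> manager.checkExit(1));
        } catch (UserSystemExitException e) {
            System.out.println("Blocked: " + e.getMessage());
        }
        manager.checkExit(1); // check disabled again: runtime code is allowed to exit
    }
}
```

In this sketch, a wrapper such as runUserCode would sit around the user-code entry points listed above (main() in the JM and the StreamTask methods), so the flag is only set while user code is running.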
Please let me know if anyone wants to review the patch, or would just like to discuss anything that does not make sense.
> Warn user if System.exit() is called in user code
> -------------------------------------------------
>
> Key: FLINK-15156
> URL: https://issues.apache.org/jira/browse/FLINK-15156
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Reporter: Robert Metzger
> Priority: Minor
> Labels: starter
>
> It would make debugging Flink errors easier if we would intercept and log calls to System.exit() through the SecurityManager.
> A user recently had an error where the JobManager was shutting down because of a System.exit() in the user code: https://lists.apache.org/thread.html/b28dabcf3068d489f38399c456c80d48569fcdf74b15f8bb95d532d0%40%3Cuser.flink.apache.org%3E
> If I remember correctly, we had such issues before.
> I put this ticket into the "Runtime / Coordination" component, as it is mostly about improving the usability / debuggability in that area.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)