You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Guowei Ma (Jira)" <ji...@apache.org> on 2020/01/21 05:59:00 UTC

[jira] [Commented] (FLINK-2646) User functions should be able to differentiate between successful close and erroneous close

    [ https://issues.apache.org/jira/browse/FLINK-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019866#comment-17019866 ] 

Guowei Ma commented on FLINK-2646:
----------------------------------

I want to understand what specific scenarios the jira wants to resolve. After reading some mail/doc[1][2] and some scenarios I encountered. I summary it as below(correct me if I miss something):

Currently, the scenarios that _close_ interface could not be satisfied
 # The user wants to accelerate the failover process. Currently, some users implement the _close_ interface to flush the memory data to the external system because the job would deal with the bounded stream sometimes. However, it slows down the failover process because when canceling the task the Flink would also call the close method which might do some heavy i/o processing.
 # The user wants the exactly once semantics for the bounded stream. If the user implements the _close_ interface which commits the results some results would be committed multi-times because when failover occurs some messages would be replayed. If the user implements the _close_ interface which does not commit the result some results would be lost.

Because many users implement the _close_ interface to release the resources so we could not break this semantics that whenever a task is terminated the _close_ method should always be called.

If Flink provides an interface such as `_closeAtEndofStream_` I think we could resolve the second problem in most situations. However I think this also needs some other efforts such as dedupe the commit at the _close_ or using the _finalizeOnMaster_ callback.

 

[1] https://lists.apache.org/thread.html/4cf28a9fa3732dfdd9e673da6233c5288ca80b20d58cee130bf1c141%40%3Cuser.flink.apache.org%3E

[2] https://docs.google.com/document/d/1SXfhmeiJfWqi2ITYgCgAoSDUv5PNq1T8Zu01nR5Ebog/edit#

> User functions should be able to differentiate between successful close and erroneous close
> -------------------------------------------------------------------------------------------
>
>                 Key: FLINK-2646
>                 URL: https://issues.apache.org/jira/browse/FLINK-2646
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / DataStream
>    Affects Versions: 0.10.0
>            Reporter: Stephan Ewen
>            Assignee: Kostas Kloudas
>            Priority: Major
>              Labels: usability
>
> Right now, the {{close()}} method of rich functions is invoked in case of proper completion, and in case of canceling in case of error (to allow for cleanup).
> In certain cases, the user function needs to know why it is closed, whether the task completed in a regular fashion, or was canceled/failed.
> I suggest to add a method {{closeAfterFailure()}} to the {{RichFunction}}. By default, this method calls {{close()}}. The runtime is the changed to call {{close()}} as part of the regular execution and {{closeAfterFailure()}} in case of an irregular exit.
> Because by default all cases call {{close()}} the change would not be API breaking.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)