Posted to issues@flink.apache.org by "Zhenqiu Huang (Jira)" <ji...@apache.org> on 2021/01/15 08:08:00 UTC

[jira] [Comment Edited] (FLINK-20833) Expose pluggable interface for exception analysis and metrics reporting in Execution Graph

    [ https://issues.apache.org/jira/browse/FLINK-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17265795#comment-17265795 ] 

Zhenqiu Huang edited comment on FLINK-20833 at 1/15/21, 8:07 AM:
-----------------------------------------------------------------

[~rmetzger]
Thanks for these suggestions. 
1) I think the name ExceptionListener is more reasonable.
2) Yes, the implementation can be loaded through a service provider. As long as the implementation is on Flink's classpath, it can be loaded.
3) I prefer to use Flink's metrics system.
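
As a rough illustration of points 1) and 2), and not the code in the PoC commit, a listener could be discovered with java.util.ServiceLoader. Only the name ExceptionListener comes from this thread. The method onException, the classes CountingExceptionListener and ExceptionListeners, and their methods are hypothetical names for this sketch, and a plain AtomicLong stands in for a counter from Flink's metrics system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical listener contract; only the name ExceptionListener comes from this thread.
interface ExceptionListener {
    void onException(Throwable cause);
}

// Illustrative listener: an AtomicLong stands in for a Flink metrics counter.
final class CountingExceptionListener implements ExceptionListener {
    private final AtomicLong failures = new AtomicLong();

    @Override
    public void onException(Throwable cause) {
        failures.incrementAndGet();
    }

    long failureCount() {
        return failures.get();
    }
}

final class ExceptionListeners {
    private ExceptionListeners() {}

    // Collects every implementation registered in a provider-configuration file
    // under META-INF/services/ that is named after the listener interface.
    static List<ExceptionListener> loadFromClasspath() {
        List<ExceptionListener> listeners = new ArrayList<>();
        for (ExceptionListener listener : ServiceLoader.load(ExceptionListener.class)) {
            listeners.add(listener);
        }
        return listeners;
    }

    // Forwards one failure cause to every listener in order.
    static void notifyListeners(List<ExceptionListener> listeners, Throwable cause) {
        for (ExceptionListener listener : listeners) {
            listener.onException(cause);
        }
    }
}
```

With this shape, an implementation only needs to be packaged in a jar on the classpath together with its provider-configuration file. No listener is registered in the sketch itself, so loadFromClasspath() returns an empty list until one is added.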

I did a PoC based on what we agreed on. Please review it. If we agree on the basic interface, I will add test cases to round out the PR.
https://github.com/HuangZhenQiu/flink/commit/903c7746217c0cb91a2eff15a72de873ad48a5e7


> Expose pluggable interface for exception analysis and metrics reporting in Execution Graph
> -------------------------------------------------------------------------------------------
>
>                 Key: FLINK-20833
>                 URL: https://issues.apache.org/jira/browse/FLINK-20833
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.12.0
>            Reporter: Zhenqiu Huang
>            Priority: Minor
>
> Platform users of Apache Flink usually want to classify the failure reason (for example user code, networking, or dependencies) of Flink jobs and emit metrics for the classified results, so that the platform can report an accurate value for system reliability by distinguishing failures caused by user logic from failures caused by system issues.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)