Posted to issues@flink.apache.org by "陈梓立 (JIRA)" <ji...@apache.org> on 2018/07/09 06:57:00 UTC

[jira] [Commented] (FLINK-9778) Remove SlotRequest timeout

    [ https://issues.apache.org/jira/browse/FLINK-9778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16536589#comment-16536589 ] 

陈梓立 commented on FLINK-9778:
----------------------------

I will open a PR a little later. I wonder whether we can remove the timeout mechanism cleanly.

Removing it would reduce the number of failovers, but I am not sure that every SlotRequest exception is still handled once the timeout mechanism is gone.

> Remove SlotRequest timeout
> --------------------------
>
>                 Key: FLINK-9778
>                 URL: https://issues.apache.org/jira/browse/FLINK-9778
>             Project: Flink
>          Issue Type: Improvement
>          Components: JobManager, ResourceManager, TaskManager
>            Reporter: 陈梓立
>            Assignee: 陈梓立
>            Priority: Major
>             Fix For: 1.5.1
>
>
> Currently, when SlotPool (JobMaster) runs requestSlotsFromResourceManager, it checks for a timeout: if the RM does not respond within 5 minutes, the JM fails the request and requests again. This does little good and makes Flink's resource requests less exact.
> I propose removing this timeout mechanism, that is, a SlotRequest would no longer time out. Our current fault-tolerance mechanism would then handle SlotRequest exceptions.
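
To make the concern in the quoted description concrete, here is a minimal Java sketch of the two behaviors. It is not Flink's SlotPool or ResourceManager code: SlotRequestSketch, SlowResourceManager, requestWithTimeout, requestWithoutTimeout and the attemptsLeft bound are all hypothetical names and assumptions made for illustration, and it assumes Java 9+ for orTimeout, delayedExecutor and failedFuture.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only: hypothetical names, not Flink's SlotPool or ResourceManager code. */
final class SlotRequestSketch {

    /** Simulated ResourceManager that answers every request after a fixed delay. */
    static final class SlowResourceManager {
        final AtomicInteger requestsReceived = new AtomicInteger();
        private final long delayMillis;

        SlowResourceManager(long delayMillis) {
            this.delayMillis = delayMillis;
        }

        CompletableFuture<String> requestSlot() {
            int n = requestsReceived.incrementAndGet();
            return CompletableFuture.supplyAsync(
                    () -> "slot-" + n,
                    CompletableFuture.delayedExecutor(delayMillis, TimeUnit.MILLISECONDS));
        }
    }

    /** Described current behavior: fail the request on timeout, then request again. */
    static CompletableFuture<String> requestWithTimeout(
            SlowResourceManager rm, long timeoutMillis, int attemptsLeft) {
        CompletableFuture<String> attempt =
                rm.requestSlot().orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
        return attempt
                .<CompletableFuture<String>>handle((slot, err) -> {
                    if (err == null) {
                        return CompletableFuture.completedFuture(slot);
                    }
                    Throwable cause =
                            (err instanceof CompletionException && err.getCause() != null)
                                    ? err.getCause() : err;
                    if (cause instanceof TimeoutException && attemptsLeft > 1) {
                        // In this sketch the earlier request is not withdrawn, so the
                        // simulated ResourceManager now sees more than one request.
                        return requestWithTimeout(rm, timeoutMillis, attemptsLeft - 1);
                    }
                    return CompletableFuture.failedFuture(cause);
                })
                .thenCompose(f -> f);
    }

    /** Proposed behavior: send one request and wait for the answer or a real failure. */
    static CompletableFuture<String> requestWithoutTimeout(SlowResourceManager rm) {
        return rm.requestSlot();
    }
}
```

In this sketch, a timeout shorter than the simulated response delay makes requestWithTimeout send one request per attempt for a single needed slot, while requestWithoutTimeout sends exactly one. Failures other than a timeout still surface through the returned future, which is where the existing fault-tolerance handling would have to take over.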



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)