Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/05/03 09:28:01 UTC

[jira] [Commented] (FLINK-9293) SlotPool should check slot id when accepting a slot offer with existing allocation id

    [ https://issues.apache.org/jira/browse/FLINK-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462182#comment-16462182 ] 

ASF GitHub Bot commented on FLINK-9293:
---------------------------------------

GitHub user shuai-xu opened a pull request:

    https://github.com/apache/flink/pull/5951

    [FLINK-9293] [runtime] SlotPool should check slot id when accepting a slot offer with existing allocation id

    
    ## What is the purpose of the change
    
    *This pull request fixes an issue where the job master accepts multiple slot offers with the same allocation id, causing the later-offered slots to leak.*
    
    ## Verifying this change
    
    This change adds tests and can be verified as follows:
    
      - *Run the SlotPoolTest*
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (no)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
      - The S3 file system connector: (no)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (yes / no)
      - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shuai-xu/flink jira-9293

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5951.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5951
    
----
commit 22ae227e2a19ddf15890dcce779536687328d7ac
Author: shuai.xus <sh...@...>
Date:   2018-05-03T09:13:08Z

    [FLINK-9293] [runtime] SlotPool should check slot id when accepting a slot offer with existing allocation id

----


> SlotPool should check slot id when accepting a slot offer with existing allocation id
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-9293
>                 URL: https://issues.apache.org/jira/browse/FLINK-9293
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 1.5.0
>            Reporter: shuai.xu
>            Assignee: shuai.xu
>            Priority: Major
>              Labels: flip-6
>
> For flip-6, two or more slots may be assigned to the same slot allocation. For example, taskExecutor1 registers, and the RM assigns allocationID1 to its slot1. From taskExecutor1's side, however, the registration times out, so it registers again. The RM then fails allocationID1 and assigns slot2 on taskExecutor2 to it, but taskExecutor1 has already accepted allocationID1.
> So taskExecutor1 and taskExecutor2 both offer a slot to the job master with allocationID1. Currently the slot pool simply accepts every slot offer, which may cause one slot to leak.
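
The check proposed in the issue title can be illustrated with a minimal, self-contained sketch. This is not Flink's actual SlotPool code: the class SlotOfferSketch, the method offerSlot, and the use of plain strings for allocation ids and slot ids are all assumptions made for illustration. The idea is to remember which slot id was accepted for each allocation id and to reject a later offer that reuses the allocation id with a different slot id, so that the offering TaskExecutor could free the slot instead of leaking it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of the slot-offer check; not Flink's real SlotPool. */
class SlotOfferSketch {

    // Maps each allocation id to the slot id that was accepted for it.
    private final Map<String, String> acceptedSlots = new HashMap<>();

    /**
     * Returns true if the offer is accepted (or is a repeat of an already
     * accepted offer), false if another slot already holds this allocation id.
     */
    boolean offerSlot(String allocationId, String slotId) {
        String existing = acceptedSlots.get(allocationId);
        if (existing == null) {
            // First offer for this allocation id: accept and remember the slot.
            acceptedSlots.put(allocationId, slotId);
            return true;
        }
        // Same slot offered again (for example a retried message): acknowledge it.
        // Different slot under the same allocation id: reject it.
        return existing.equals(slotId);
    }

    public static void main(String[] args) {
        SlotOfferSketch pool = new SlotOfferSketch();
        System.out.println(pool.offerSlot("allocationID1", "taskExecutor1/slot1")); // true
        System.out.println(pool.offerSlot("allocationID1", "taskExecutor1/slot1")); // true
        System.out.println(pool.offerSlot("allocationID1", "taskExecutor2/slot2")); // false
    }
}
```

In this sketch, whichever offer arrives first wins; which of the two slots ought to be kept in the scenario described above is a design decision the sketch does not settle.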



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)