Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/08/30 10:24:00 UTC

[jira] [Commented] (FLINK-7294) mesos.resourcemanager.framework.role not working

    [ https://issues.apache.org/jira/browse/FLINK-7294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147030#comment-16147030 ] 

ASF GitHub Bot commented on FLINK-7294:
---------------------------------------

GitHub user bbayani opened a pull request:

    https://github.com/apache/flink/pull/4622

    [FLINK-7294]:[flink-mesos] mesos.resourcemanager.framework.role not working

    Jira Issue: FLINK-7294
    
    ## What is the purpose of the change
    This pull request takes the role set in mesos.resourcemanager.framework.role and applies it to the resources the framework requests (CPU, memory, ports). As a result, the framework accepts resource offers from Mesos agents with the specified role and can launch TaskManagers on agents that run with a specific role rather than role *.
    
    ## Brief change log
      - Updated Utils.java to take in role information for constructing scalar / ranges resource values. 
      - Updated LaunchableMesosWorker to use framework role set in config.
      - Updated tests in LaunchCoordinatorTest.scala to pass role argument.
    
    ## Verifying this change
    Part of the change is already covered by existing tests, such as LaunchCoordinatorTest.scala.
    The change was also verified manually by running a Flink cluster on Mesos with 1 JobManager and 3 TaskManagers. Flink was deployed on a Mesos cluster whose agents were running with a specific role rather than role '*'.
    
    ## Does this pull request potentially affect one of the following parts:
      - Dependencies (does it add or upgrade a dependency):  no
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`:  no
      - The serializers: no 
      - The runtime per-record code paths (performance sensitive): no 
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: yes, it affects deployment on Mesos.
    
    ## Documentation
      - Does this pull request introduce a new feature? no
      - If yes, how is the feature documented? not documented
    
    @EronWright : PTAL. Thanks!

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bbayani/flink mesos_role_issue

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4622.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4622
    
----
commit 883c3866302b8073b60403b65c3aac85759b891c
Author: bbayani <bb...@cisco.com>
Date:   2017-08-30T09:54:54Z

    [FLINK-7294]:mesos.resourcemanager.framework.role not working

----


> mesos.resourcemanager.framework.role not working
> ------------------------------------------------
>
>                 Key: FLINK-7294
>                 URL: https://issues.apache.org/jira/browse/FLINK-7294
>             Project: Flink
>          Issue Type: Bug
>          Components: Mesos
>    Affects Versions: 1.3.1
>            Reporter: Bhumika Bayani
>            Assignee: Eron Wright 
>            Priority: Critical
>
> I am using the setting named in the title in flink-conf.yaml,
> e.g.
> mesos.resourcemanager.framework.role: mesos_role_tasks
> I see a Flink scheduler registered in the Mesos frameworks tab with that role.
> But the scheduler fails to launch any tasks in spite of receiving resource offers from Mesos agents with the correct role.
> The error seen is:
> {code}
> 2017-07-28 13:23:00,683 INFO  org.apache.flink.mesos.runtime.clusterframework.MesosFlinkResourceManager  - Mesos task taskmanager-03768 failed, with a TaskManager in launch or registration. State: TASK_ERROR Reason: REASON_TASK_INVALID (Task uses more resources cpus(*):1; mem(*):1024; ports(*):[4006-4007] than available cpus(mesos_role_tasks):7.4; mem(mesos_role_tasks):45876; ports(mesos_role_tasks):[4002-4129, 4131-4380, 4382-4809, 4811-4957, 4959-4966, 4968-4979, 4981-5049, 31000-31196, 31198-31431, 31433-31607, 31609-32000]; ephemeral_storage(mesos_role_tasks):37662; efs_storage(mesos_role_tasks):8.79609e+12; disk(mesos_role_tasks):5115)
> {code}
> The request is made for resources with the * role. We do not have Mesos running anywhere with the * role, so the TaskManagers never come up.
> Am I missing any configuration?
> I am using flink version 1.3.1
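
The mismatch in the error above can be illustrated with a small, simplified model. This is only a sketch: `Scalar`, `fits`, and `task_resources` are hypothetical names invented for illustration, not part of the Mesos or Flink APIs, and port ranges are left out for brevity. The one assumption it encodes, which is consistent with the REASON_TASK_INVALID message, is that a task's resource is only covered by an offered resource with the same name and the same role.

```python
from dataclasses import dataclass


# Hypothetical, simplified stand-in for a Mesos scalar resource.
# Not the Mesos protobuf type and not Flink's Utils.java.
@dataclass(frozen=True)
class Scalar:
    name: str
    value: float
    role: str = "*"  # "*" is the default role


def fits(task, offer):
    """Return True if every task resource is covered by offered resources
    that have the same name AND the same role (assumed matching rule)."""
    available = {}
    for r in offer:
        key = (r.name, r.role)
        available[key] = available.get(key, 0.0) + r.value
    needed = {}
    for r in task:
        key = (r.name, r.role)
        needed[key] = needed.get(key, 0.0) + r.value
    return all(available.get(k, 0.0) >= v for k, v in needed.items())


def task_resources(cpus, mem, role="*"):
    """Build the resources a TaskManager task asks for, tagged with a role."""
    return [Scalar("cpus", cpus, role), Scalar("mem", mem, role)]


# Offer values taken from the log line above.
offer = [
    Scalar("cpus", 7.4, "mesos_role_tasks"),
    Scalar("mem", 45876, "mesos_role_tasks"),
]

before_fix = task_resources(1, 1024)                           # role "*"
after_fix = task_resources(1, 1024, role="mesos_role_tasks")   # configured role

print(fits(before_fix, offer))  # False: nothing is offered under role "*"
print(fits(after_fix, offer))   # True: same role, enough cpus and mem
```

In this model the request for 1 CPU and 1024 MB fails only because of the role tag, which matches the behavior the pull request addresses by passing the configured framework role when the resource values are constructed.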



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)