Posted to issues@mesos.apache.org by "Yan Xu (JIRA)" <ji...@apache.org> on 2016/10/25 22:46:58 UTC

[jira] [Updated] (MESOS-6482) Master check failure when marking an agent unreachable

     [ https://issues.apache.org/jira/browse/MESOS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Xu updated MESOS-6482:
--------------------------
    Description: 
{noformat:title=}
I1025 16:34:55.423038 44118 master.cpp:6006] Marked agent 8e219f7a-06c1-4009-9440-1a33b3e39be5-S473 (x.y.z.com) unreachable: health check timed out
F1025 16:34:55.423632 44118 master.cpp:6036] Check failed: frameworks.recovered.contains(frameworkId) 
{noformat}

Both the master and the agent are on 1.1.

{code:title=the context}
  foreachkey (const FrameworkID& frameworkId, utils::copy(slave->tasks)) {
    Framework* framework = getFramework(frameworkId);

    // If the framework has not yet re-registered after master failover,
    // its FrameworkInfo will be in the `recovered` collection. Note that
    // if the master knows about a task, its FrameworkInfo must appear in
    // either the `registered` or `recovered` collections.
    FrameworkInfo frameworkInfo;

    if (framework == nullptr) {
      CHECK(frameworks.recovered.contains(frameworkId));
      frameworkInfo = frameworks.recovered[frameworkId];
    } else {
      frameworkInfo = framework->info;
    }

    ...
{code}
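
The invariant described in the comment above can be sketched outside Mesos. This is only an illustration under stated assumptions, not the master's code: `FrameworkInfo`, `Frameworks`, and `findFrameworkInfo` below are hypothetical stand-ins, and plain `std::map`s keyed by string replace the master's own collections. The sketch makes the third case visible, the one the CHECK assumes cannot happen: a framework ID that appears in neither `registered` nor `recovered`.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the real FrameworkInfo protobuf.
struct FrameworkInfo {
  std::string name;
};

// Hypothetical stand-in for the master's framework bookkeeping.
struct Frameworks {
  std::map<std::string, FrameworkInfo> registered;
  std::map<std::string, FrameworkInfo> recovered;
};

// Looks up a framework ID in `registered` first, then in `recovered`.
// Returns nullptr when the ID is in neither collection, which is the
// situation in which the CHECK in the snippet above would abort.
const FrameworkInfo* findFrameworkInfo(
    const Frameworks& frameworks,
    const std::string& frameworkId) {
  auto it = frameworks.registered.find(frameworkId);
  if (it != frameworks.registered.end()) {
    return &it->second;
  }

  it = frameworks.recovered.find(frameworkId);
  if (it != frameworks.recovered.end()) {
    return &it->second;
  }

  return nullptr;
}
```

A caller that iterates over an agent's tasks could test the result for `nullptr` and log the framework ID instead of aborting. Whether that is appropriate for the master is not something this sketch settles.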

  was:
{noformat:title=}
I1025 16:34:55.423038 44118 master.cpp:6006] Marked agent 8e219f7a-06c1-4009-9440-1a33b3e39be5-S473 (x.y.z.com) unreachable: health check timed out
F1025 16:34:55.423632 44118 master.cpp:6036] Check failed: frameworks.recovered.contains(frameworkId) 
{noformat}

Both the master and the agent are on 1.1.


> Master check failure when marking an agent unreachable
> ------------------------------------------------------
>
>                 Key: MESOS-6482
>                 URL: https://issues.apache.org/jira/browse/MESOS-6482
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.1.0
>            Reporter: Yan Xu
>            Priority: Blocker
>
> {noformat:title=}
> I1025 16:34:55.423038 44118 master.cpp:6006] Marked agent 8e219f7a-06c1-4009-9440-1a33b3e39be5-S473 (x.y.z.com) unreachable: health check timed out
> F1025 16:34:55.423632 44118 master.cpp:6036] Check failed: frameworks.recovered.contains(frameworkId) 
> {noformat}
> Both the master and the agent are on 1.1.
> {code:title=the context}
>   foreachkey (const FrameworkID& frameworkId, utils::copy(slave->tasks)) {
>     Framework* framework = getFramework(frameworkId);
>     // If the framework has not yet re-registered after master failover,
>     // its FrameworkInfo will be in the `recovered` collection. Note that
>     // if the master knows about a task, its FrameworkInfo must appear in
>     // either the `registered` or `recovered` collections.
>     FrameworkInfo frameworkInfo;
>     if (framework == nullptr) {
>       CHECK(frameworks.recovered.contains(frameworkId));
>       frameworkInfo = frameworks.recovered[frameworkId];
>     } else {
>       frameworkInfo = framework->info;
>     }
>     ...
> {code}


