Posted to issues@mesos.apache.org by "Chun-Hung Hsiao (JIRA)" <ji...@apache.org> on 2019/05/03 03:47:00 UTC

[jira] [Commented] (MESOS-9667) Check failure when executor for task using resource provider resources subscribes before agent is registered

    [ https://issues.apache.org/jira/browse/MESOS-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832192#comment-16832192 ] 

Chun-Hung Hsiao commented on MESOS-9667:
----------------------------------------

Sorry, I missed the reply. It seems to me that this patch is simple enough to backport to 1.7.x, and we haven't changed the related code path in {{slave.cpp}} since 1.7.x.

> Check failure when executor for task using resource provider resources subscribes before agent is registered
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-9667
>                 URL: https://issues.apache.org/jira/browse/MESOS-9667
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent
>    Affects Versions: 1.8.0
>            Reporter: Benjamin Bannier
>            Assignee: Benjamin Bannier
>            Priority: Blocker
>              Labels: foundations, mesosphere, mesosphere-dss-ga
>             Fix For: 1.8.0, 1.9.0
>
>
> When an executor for a task using resource provider resources subscribes before the agent has registered with the master, we trigger a fatal assertion:
> {code:java}
> Mar 21 13:42:47 agent1 mesos-agent[17277]: F0321 13:42:46.845535 17295 slave.cpp:8834] Check failed: 'resourceProviderManager.get()' Must be non NULL
> Mar 21 13:42:47 agent1 mesos-agent[17277]: *** Check failure stack trace: *{code}
> The reason for this failure is that we attempt to publish resources to the resource provider via the resource provider manager, but the resource provider manager is only created once the agent has registered with the master.
> As a workaround one can terminate the executors and their tasks, and let the framework relaunch the tasks (provided it supports that).
> A possible fix would be to prevent such executors from subscribing until the resource provider manager is available.
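
The idea of holding back executor subscriptions until the resource provider manager exists can be illustrated with a minimal, self-contained C++ sketch. This is not Mesos code and not the actual patch: ToyAgent, ToyResourceProviderManager, subscribeExecutor, registered and the resource strings are all hypothetical names chosen for illustration. The sketch assumes the manager is created on registration, queues any subscription that arrives earlier, and replays the queue once the manager is available.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the resource provider manager (not the Mesos class).
struct ToyResourceProviderManager {
  std::vector<std::string> published;

  void publish(const std::string& resources) { published.push_back(resources); }
};

// Hypothetical toy agent (not Mesos code). It defers executor subscriptions
// that arrive before registration instead of dereferencing a null manager.
class ToyAgent {
public:
  // Returns true if the resources were published immediately, false if the
  // subscription was queued because the manager does not exist yet.
  bool subscribeExecutor(const std::string& executorId,
                         const std::string& resources) {
    if (manager_ == nullptr) {
      pending_.emplace_back(executorId, resources);
      return false;
    }
    publish(resources);
    return true;
  }

  // Assumption: the manager is created when the agent registers. Queued
  // subscriptions are replayed in arrival order afterwards.
  void registered() {
    if (manager_ == nullptr) {
      manager_ = std::make_unique<ToyResourceProviderManager>();
    }
    for (const auto& entry : pending_) {
      publish(entry.second);
    }
    pending_.clear();
  }

  std::size_t pendingCount() const { return pending_.size(); }

  std::size_t publishedCount() const {
    return manager_ ? manager_->published.size() : 0;
  }

private:
  void publish(const std::string& resources) {
    // Plays the role of the failing check in the report, but is only reached
    // once the manager is known to exist.
    assert(manager_ != nullptr);
    manager_->publish(resources);
  }

  std::unique_ptr<ToyResourceProviderManager> manager_;
  std::vector<std::pair<std::string, std::string>> pending_;
};
```

In this sketch an early subscription returns false and stays queued until registered() is called. Whether the real agent should queue such subscriptions or reject them so that the executor retries is a design choice the sketch does not settle.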


