Posted to issues@mesos.apache.org by "Yan Xu (JIRA)" <ji...@apache.org> on 2015/09/09 02:19:46 UTC

[jira] [Updated] (MESOS-3397) sorter.cpp: Check failed: total.resources.contains(slaveId)

     [ https://issues.apache.org/jira/browse/MESOS-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Xu updated MESOS-3397:
--------------------------
    Description: 
Observed in production.

{noformat:title=}
F0908 23:21:10.635751  6884 sorter.cpp:213] Check failed: total.resources.contains(slaveId)
*** Check failure stack trace: ***
    @     0x7f772cdb10bd  google::LogMessage::Fail()
    @     0x7f772cdb2f04  google::LogMessage::SendToLog()
    @     0x7f772cdb0cac  google::LogMessage::Flush()
    @     0x7f772cdb37f9  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f772c8162d0  mesos::internal::master::allocator::DRFSorter::remove()
    @     0x7f772c6f61bc  mesos::internal::master::allocator::HierarchicalAllocatorProcess<>::removeFramework()
    @     0x7f772cd61f09  process::ProcessManager::resume()
    @     0x7f772cd6220f  process::internal::schedule()
    @     0x7f772ce73610  execute_native_thread_routine
    @     0x7f772bcb883d  start_thread
    @     0x7f772b4aafdd  clone
{noformat}

The crash followed a framework removal triggered by the failover timeout:
{noformat:title=}
I0908 23:21:10.619640  6884 master.cpp:4261] Framework failover timeout, removing framework 20150813-182946-1685138442-5050-58479-0425 (Some Scheduler) at scheduler-3c50e28c-a0f4-4619-8ea0-b786744e6e54@x.y.z.a:33952
{noformat}

  was:
Observed in production.

{noformat:title=}
F0908 23:21:10.635751  6884 sorter.cpp:213] Check failed: total.resources.contains(slaveId)
*** Check failure stack trace: ***
    @     0x7f772cdb10bd  google::LogMessage::Fail()
    @     0x7f772cdb2f04  google::LogMessage::SendToLog()
    @     0x7f772cdb0cac  google::LogMessage::Flush()
    @     0x7f772cdb37f9  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f772c8162d0  mesos::internal::master::allocator::DRFSorter::remove()
    @     0x7f772c6f61bc  mesos::internal::master::allocator::HierarchicalAllocatorProcess<>::removeFramework()
    @     0x7f772cd61f09  process::ProcessManager::resume()
    @     0x7f772cd6220f  process::internal::schedule()
    @     0x7f772ce73610  execute_native_thread_routine
    @     0x7f772bcb883d  start_thread
    @     0x7f772b4aafdd  clone
{noformat}


> sorter.cpp: Check failed: total.resources.contains(slaveId)
> -----------------------------------------------------------
>
>                 Key: MESOS-3397
>                 URL: https://issues.apache.org/jira/browse/MESOS-3397
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.24.0
>            Reporter: Yan Xu



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)