Posted to dev@mesos.apache.org by "Vinod Kone (JIRA)" <ji...@apache.org> on 2012/07/23 19:41:33 UTC
[jira] [Created] (MESOS-239) Allocator doesn't handle framework failover correctly
Vinod Kone created MESOS-239:
--------------------------------
Summary: Allocator doesn't handle framework failover correctly
Key: MESOS-239
URL: https://issues.apache.org/jira/browse/MESOS-239
Project: Mesos
Issue Type: Bug
Reporter: Vinod Kone
This cropped up during one of the AB tests.
The scenario: A framework fails over. The allocator aborts on a failed CHECK when it tries to add the framework. This is because the framework was deactivated, but its allocated[frameworkId] entry was never erased.
I0721 00:41:13.154080 43396 dominant_share_allocator.cpp:167] Deactivated framework 201207210040-2081170186-58055-43387-0000
W0721 00:41:14.272461 43392 master.cpp:77] No whitelist given. Advertising offers for all slaves
2012-07-21 00:41:14,538:43387(0x4ba6d940):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-07-21 00:41:17,875:43387(0x4ba6d940):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
.......
.......
.......
I0721 00:42:09.721727 43396 master.cpp:614] Re-registering framework 201207210040-2081170186-58055-43387-0000 at scheduler(1)@10.35.12.124:57793
I0721 00:42:09.721822 43396 master.cpp:633] Framework 201207210040-2081170186-58055-43387-0000 failed over
F0721 00:42:09.722185 43397 dominant_share_allocator.cpp:143] Check failed: !allocated.contains(frameworkId)
*** Check failure stack trace: ***
@ 0x7f5874ef7fdd google::LogMessage::Fail()
@ 0x7f5874efdc47 google::LogMessage::SendToLog()
@ 0x7f5874ef988c google::LogMessage::Flush()
@ 0x7f5874ef9af6 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f5874c75c1d mesos::internal::master::DominantShareAllocator::frameworkAdded()
@ 0x7f5874bd16be std::tr1::_Mem_fn<>::operator()()
@ 0x7f5874bd55b2 std::tr1::_Bind<>::operator()<>()
@ 0x7f5874bd55e3 std::tr1::_Function_handler<>::_M_invoke()
@ 0x7f5874be139f std::tr1::function<>::operator()()
@ 0x7f5874bf7560 process::internal::vdispatcher<>()
@ 0x7f5874bf8310 std::tr1::_Bind<>::operator()<>()
@ 0x7f5874bf8365 std::tr1::_Function_handler<>::_M_invoke()
@ 0x7f5874e3bf4f std::tr1::function<>::operator()()
@ 0x7f5874e0c5db process::ProcessBase::visit()
@ 0x7f5874e1dc50 process::DispatchEvent::visit()
@ 0x7f5874b71ffc process::ProcessBase::serve()
@ 0x7f5874e1656f process::ProcessManager::resume()
@ 0x7f5874e16dba process::schedule()
@ 0x316120673d (unknown)
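The failure above can be illustrated with a minimal sketch. This is not the Mesos source and not the patch that fixed the issue; ToyAllocator, its int placeholder for resources, the bool return values, and frameworkActivated() are all hypothetical names chosen for illustration. Only allocated, frameworkId, and frameworkAdded() come from the log.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical, heavily simplified stand-in for the allocator's bookkeeping.
class ToyAllocator {
public:
  // Returns false where the code in the report would hit
  // "Check failed: !allocated.contains(frameworkId)" and abort.
  bool frameworkAdded(const std::string& frameworkId) {
    if (allocated.count(frameworkId) > 0) {
      return false;
    }
    allocated[frameworkId] = 0;  // nothing allocated yet
    active.insert(frameworkId);
    return true;
  }

  // Behaviour described in the report: the framework is marked
  // inactive, but its entry in `allocated` is left in place.
  void frameworkDeactivated(const std::string& frameworkId) {
    active.erase(frameworkId);
  }

  // One possible remedy (an assumption, not the committed fix):
  // reactivate a framework that is still tracked instead of adding it again.
  bool frameworkActivated(const std::string& frameworkId) {
    if (allocated.count(frameworkId) == 0) {
      return false;  // unknown framework, nothing to reactivate
    }
    active.insert(frameworkId);
    return true;
  }

  bool isActive(const std::string& frameworkId) const {
    return active.count(frameworkId) > 0;
  }

private:
  std::map<std::string, int> allocated;  // frameworkId -> placeholder resources
  std::set<std::string> active;          // frameworks currently offered resources
};
```

In this sketch, calling frameworkAdded() for a framework that was only deactivated returns false, which corresponds to the fatal check in the trace. An alternative remedy would be to erase the allocated entry on deactivation; which approach revision 1364826 took is not stated in this thread.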
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (MESOS-239) Allocator doesn't handle framework failover correctly
Posted by "Benjamin Hindman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MESOS-239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benjamin Hindman closed MESOS-239.
----------------------------------
[jira] [Resolved] (MESOS-239) Allocator doesn't handle framework failover correctly
Posted by "Benjamin Hindman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MESOS-239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benjamin Hindman resolved MESOS-239.
------------------------------------
Resolution: Fixed
Assignee: Benjamin Hindman
Committed in revision 1364826.