Posted to issues@mesos.apache.org by "Till Toenshoff (JIRA)" <ji...@apache.org> on 2017/02/13 01:45:42 UTC

[jira] [Commented] (MESOS-2842) Update FrameworkInfo.principal on framework re-registration

    [ https://issues.apache.org/jira/browse/MESOS-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15863104#comment-15863104 ] 

Till Toenshoff commented on MESOS-2842:
---------------------------------------

This is what the master log looks like when you hit this issue:

{noformat}
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: I0213 01:38:28.419044  2809 master.cpp:2783] Subscribing framework integration_test with checkpointing enabled and capabilities [ PARTITION_AWARE ]
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: I0213 01:38:28.419072  2809 master.cpp:2861] Updating info for framework 6aec32bf-cd60-4fa1-9992-f35af104f423-0009
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: W0213 01:38:28.419083  2809 master.hpp:2486] Cannot update FrameworkInfo.role to '*' for framework 6aec32bf-cd60-4fa1-9992-f35af104f423-0009. Check MESOS-703
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: W0213 01:38:28.419091  2809 master.hpp:2497] Cannot update FrameworkInfo.principal to 'alice' for framework 6aec32bf-cd60-4fa1-9992-f35af104f423-0009. Check MESOS-703
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: I0213 01:38:28.419111  2809 master.cpp:2874] Framework 6aec32bf-cd60-4fa1-9992-f35af104f423-0009 (integration_test) at scheduler-188c0a58-9b44-4e2b-b133-a7c15b37fc55@127.0.0.1:41805 failed over
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: I0213 01:38:28.419245  2809 hierarchical.cpp:358] Activated framework 6aec32bf-cd60-4fa1-9992-f35af104f423-0009
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: I0213 01:38:28.419543  2809 master.cpp:6664] Sending 1 offers to framework 6aec32bf-cd60-4fa1-9992-f35af104f423-0009 (integration_test) at scheduler-7fff5d25-a121-48bf-8849-1948b161d729@127.0.0.1:46530
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: F0213 01:38:28.426944  2809 master.cpp:1446] Check failed: metrics->frameworks.contains(principal.get())
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: *** Check failure stack trace: ***
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     0x7fb678b831ad  google::LogMessage::Fail()
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     0x7fb678b84fdd  google::LogMessage::SendToLog()
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     0x7fb678b82d9c  google::LogMessage::Flush()
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     0x7fb678b858d9  google::LogMessageFatal::~LogMessageFatal()
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     0x7fb6780453dd  mesos::internal::master::Master::visit()
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     0x7fb678af7ca1  process::ProcessManager::resume()
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     0x7fb678b00ba7  _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     0x7fb676f90230  (unknown)
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     0x7fb6767aedc5  start_thread
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     0x7fb6764dd73d  __clone
{noformat}

> Update FrameworkInfo.principal on framework re-registration
> -----------------------------------------------------------
>
>                 Key: MESOS-2842
>                 URL: https://issues.apache.org/jira/browse/MESOS-2842
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Vinod Kone
>              Labels: security
>
> From the design doc:
> This is a bit involved because ‘principal’ is used for authentication and rate limiting.
> The authentication part is straightforward because a framework with updated ‘principal’ should authenticate with the new ‘principal’ before being allowed to re-register. The ‘authenticated’ map already gets updated when the framework disconnects and reconnects, so it is fine.
> For rate limiting, Master::failoverFramework() needs to be changed to update the principal in the ‘frameworks.principals’ map and also remove the metrics for the old principal if no other frameworks use that principal (similar to what we do in Master::removeFramework()).
> Master::visit() and Master::_visit() should work with the current semantics.
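
The rate-limiting bookkeeping described above can be sketched as follows. This is a minimal, standalone illustration and not the Mesos implementation: the PrincipalBook type and its members are hypothetical names, and plain std::map stands in for the master's ‘frameworks.principals’ map and its per-principal metrics. It only shows the intended invariant: after a principal changes, every principal still in use has a metrics entry, and an unused one is dropped.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the master's per-principal bookkeeping.
// None of these names are taken from the Mesos source.
struct PrincipalBook {
  // Framework ID -> principal (plays the role of 'frameworks.principals').
  std::map<std::string, std::string> principals;

  // Principal -> message counter (plays the role of per-principal metrics).
  std::map<std::string, int> metrics;

  // Registers a framework ID that is not yet known.
  void addFramework(const std::string& frameworkId,
                    const std::string& principal) {
    principals[frameworkId] = principal;
    metrics.emplace(principal, 0);  // No-op if the principal already has metrics.
  }

  // Called on re-registration: switch the framework to its new principal
  // and drop the old principal's metrics if nobody else uses it.
  void updatePrincipal(const std::string& frameworkId,
                       const std::string& newPrincipal) {
    auto it = principals.find(frameworkId);
    if (it == principals.end()) {
      addFramework(frameworkId, newPrincipal);
      return;
    }

    const std::string oldPrincipal = it->second;
    if (oldPrincipal == newPrincipal) {
      return;
    }

    it->second = newPrincipal;
    metrics.emplace(newPrincipal, 0);

    bool oldStillUsed = false;
    for (const auto& entry : principals) {
      if (entry.second == oldPrincipal) {
        oldStillUsed = true;
        break;
      }
    }
    if (!oldStillUsed) {
      metrics.erase(oldPrincipal);
    }
  }

  // The invariant a check like the one in the log above relies on:
  // every principal in use has a metrics entry.
  bool consistent() const {
    for (const auto& entry : principals) {
      if (metrics.count(entry.second) == 0) {
        return false;
      }
    }
    return true;
  }
};
```

For example, with two frameworks registered under 'bob', moving the first to 'alice' keeps the metrics for 'bob', and moving the second as well removes them; consistent() holds after each step.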


