Posted to issues@mesos.apache.org by "Benjamin Mahler (JIRA)" <ji...@apache.org> on 2017/05/09 01:05:04 UTC

[jira] [Created] (MESOS-7478) Pre-1.2.x master does not work with 1.2.x agent.

Benjamin Mahler created MESOS-7478:
--------------------------------------

             Summary: Pre-1.2.x master does not work with 1.2.x agent.
                 Key: MESOS-7478
                 URL: https://issues.apache.org/jira/browse/MESOS-7478
             Project: Mesos
          Issue Type: Bug
          Components: agent
            Reporter: Benjamin Mahler
            Priority: Blocker


[~evilezh] reported the following crash in the agent upon running a 1.1.0 master against a 1.2.0 agent:

{noformat}
F0509 00:19:07.045413  3469 slave.cpp:4609] Check failed: resource.has_allocation_info() 
*** Check failure stack trace: ***
    @     0x7f4c4a4fa3cd  google::LogMessage::Fail()
    @     0x7f4c4a4fc180  google::LogMessage::SendToLog()
    @     0x7f4c4a4f9fb3  google::LogMessage::Flush()
    @     0x7f4c4a4fcba9  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f4c49b3bcf5  mesos::internal::slave::Slave::getExecutorInfo()
    @     0x7f4c49b3cf76  mesos::internal::slave::Slave::runTask()
    @     0x7f4c49b8832c  ProtobufProcess<>::handler4<>()
    @     0x7f4c49b4dc06  std::_Function_handler<>::_M_invoke()
    @     0x7f4c49b6975a  ProtobufProcess<>::visit()
    @     0x7f4c4a46c933  process::ProcessManager::resume()
    @     0x7f4c4a477537  _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
    @     0x7f4c486b8c80  (unknown)
    @     0x7f4c481d46ba  start_thread
    @     0x7f4c47f0a82d  (unknown)
Aborted (core dumped)
{noformat}

This appears to have been due to a lack of manual upgrade testing (we don't have any automated upgrade testing in place).

The check in {{getExecutorInfo(...)}} [here|https://github.com/apache/mesos/blob/1.2.0/src/slave/slave.cpp#L4609] fails when the master is older than 1.2.x, because it runs before the agent injects the missing allocation info in {{run(...)}}. {{runTask(...)}} calls {{getExecutorInfo(...)}} [here|https://github.com/apache/mesos/blob/1.2.0/src/slave/slave.cpp#L1556], ahead of that injection.
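
The ordering problem can be illustrated with a minimal, self-contained sketch. This is not the Mesos code: the {{Resource}} struct, {{injectAllocationInfo}}, {{allAllocated}}, {{checkThenInject}} and {{injectThenCheck}} are hypothetical stand-ins, and a boolean return value takes the place of the fatal check so that both orderings can be compared without aborting.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the Resource message; only the parts relevant
// to this issue are modelled.
struct Resource {
  std::string name;
  bool allocated;    // false when the sender (an old master) did not set it
  std::string role;  // only meaningful when `allocated` is true

  bool has_allocation_info() const { return allocated; }
};

// Hypothetical injection step: fill in allocation info that the master
// did not provide.
void injectAllocationInfo(std::vector<Resource>& resources,
                          const std::string& role) {
  for (Resource& r : resources) {
    if (!r.has_allocation_info()) {
      r.allocated = true;
      r.role = role;
    }
  }
}

// Stand-in for the invariant behind the failing check. Returns false in
// the situation where a fatal check would abort the process.
bool allAllocated(const std::vector<Resource>& resources) {
  for (const Resource& r : resources) {
    if (!r.has_allocation_info()) {
      return false;
    }
  }
  return true;
}

// Ordering described in this report: the invariant is evaluated first and
// the injection only happens afterwards.
bool checkThenInject(std::vector<Resource> resources,
                     const std::string& role) {
  bool ok = allAllocated(resources);
  injectAllocationInfo(resources, role);
  return ok;
}

// Alternative ordering: inject first, then evaluate the invariant.
bool injectThenCheck(std::vector<Resource> resources,
                     const std::string& role) {
  injectAllocationInfo(resources, role);
  return allAllocated(resources);
}
```

With resources that lack allocation info, as a pre-1.2.x master would send them, {{checkThenInject}} reports a violated invariant while {{injectThenCheck}} does not. Whether the actual fix reorders the calls or relaxes the check is not stated here.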


