You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mesos.apache.org by ch...@apache.org on 2018/12/05 22:06:48 UTC

[mesos] branch 1.7.x updated (414805d -> ce10fa8)

This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a change to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.


    from 414805d  Added MESOS-9411 to 1.7.1 CHANGELOG.
     new 58548af  Renamed a function argument to not reuse member name.
     new d3db86a  Fixed handling for offer operation updates.
     new 3c7c456  Made agent state consistent with forwarded updates.
     new 58c7ff5  Added agent and resource provider IDs to operation status messages.
     new 9fd65a6  Set agent and/or resource provider ID in operation status updates.
     new ce10fa8  Added MESOS-9293 to the 1.7.1 CHANGELOG.

The 6 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG                                  |   1 +
 include/mesos/mesos.proto                  |  12 +++
 include/mesos/v1/mesos.proto               |  12 +++
 src/common/protobuf_utils.cpp              |   4 +-
 src/common/protobuf_utils.hpp              |   4 +-
 src/common/type_utils.cpp                  |  17 ++++
 src/internal/devolve.cpp                   |   8 +-
 src/internal/evolve.cpp                    |   8 +-
 src/master/master.cpp                      |  74 +++++++++++++----
 src/resource_provider/manager.cpp          |   1 +
 src/resource_provider/storage/provider.cpp |  54 +++++++-----
 src/slave/slave.cpp                        |  63 +++++++++-----
 src/tests/master_tests.cpp                 | 129 +++++++++++++++++++++++++++++
 src/tests/mesos.hpp                        |   7 +-
 14 files changed, 328 insertions(+), 66 deletions(-)


[mesos] 03/06: Made agent state consistent with forwarded updates.

Posted by ch...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 3c7c4568b1b1d75d43cdf71871599557c54c7656
Author: Benjamin Bannier <be...@mesosphere.io>
AuthorDate: Wed Dec 5 13:02:59 2018 -0800

    Made agent state consistent with forwarded updates.
    
    When the agent handles an `UpdateOperationStatusMessage` from a resource
    provider, it injects its own ID which is (at least conceptually) unknown
    to the resource provider before forwarding the message to the master,
    and also updates its own tracking for the operation.
    
    This patch makes sure that we first mutate the message before handing it
    on for updating the internal operation tracking, while previously we
    used the unmodified message. Always using the same message reduces error
    potential if in future changes we e.g., introduce agent operation status
    update managers.
    
    Review: https://reviews.apache.org/r/69458/
---
 src/slave/slave.cpp | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index 218ca83..50a4729 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -7787,9 +7787,14 @@ void Slave::handleResourceProviderMessage(
     case ResourceProviderMessage::Type::UPDATE_OPERATION_STATUS: {
       CHECK_SOME(message->updateOperationStatus);
 
-      const UpdateOperationStatusMessage& update =
+      // The status update from the resource provider didn't provide
+      // the agent ID (because the resource provider doesn't know it),
+      // hence we inject it here.
+      UpdateOperationStatusMessage update =
         message->updateOperationStatus->update;
 
+      update.mutable_slave_id()->CopyFrom(info.id());
+
       const UUID& operationUUID = update.operation_uuid();
 
       Operation* operation = getOperation(operationUUID);
@@ -7850,14 +7855,7 @@ void Slave::handleResourceProviderMessage(
                  ? " for framework " + stringify(update.framework_id())
                  : " for an operator API call");
 
-          // The status update from the resource provider didn't
-          // provide the agent ID (because the resource provider
-          // doesn't know it), hence we inject it here.
-          UpdateOperationStatusMessage _update;
-          _update.CopyFrom(update);
-          _update.mutable_slave_id()->CopyFrom(info.id());
-
-          send(master.get(), _update);
+          send(master.get(), update);
           break;
         }
       }


[mesos] 02/06: Fixed handling for offer operation updates.

Posted by ch...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit d3db86a085575189356c2099cc0a0030d724dc35
Author: Benjamin Bannier <be...@mesosphere.io>
AuthorDate: Wed Dec 5 13:01:30 2018 -0800

    Fixed handling for offer operation updates.
    
    The handling of offer operation updates introduced in `c946615ec6d`
    made use of an update's `latest_status` without making sure that any
    value was set. This could lead to situation where an uninitialized
    enum value was switched on which would have caused a fatal error at
    runtime.
    
    This patch replaces uses of `latest_status` with `state` which does
    contain the information we care about. We also adjust the error
    logging so we log the value that lead to the error, not some other
    value.
    
    Review: https://reviews.apache.org/r/69157/
---
 src/slave/slave.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index 7425cfb..218ca83 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -8011,7 +8011,7 @@ void Slave::updateOperation(
     return;
   }
 
-  switch (update.latest_status().state()) {
+  switch (operation->latest_status().state()) {
     // Terminal state, and the conversion is successful.
     case OPERATION_FINISHED: {
       apply(operation);
@@ -8032,8 +8032,8 @@ void Slave::updateOperation(
     case OPERATION_GONE_BY_OPERATOR:
     case OPERATION_RECOVERING:
     case OPERATION_UNKNOWN: {
-      LOG(FATAL) << "Unexpected operation state "
-                 << operation->latest_status().state();
+      LOG(FATAL)
+        << "Unexpected operation state " << operation->latest_status().state();
     }
   }
 }


[mesos] 04/06: Added agent and resource provider IDs to operation status messages.

Posted by ch...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 58c7ff57b70e43da479ef2b4ea7f635c2c4a8b73
Author: Benjamin Bannier <be...@mesosphere.io>
AuthorDate: Wed Dec 5 13:03:06 2018 -0800

    Added agent and resource provider IDs to operation status messages.
    
    This patch adds agent and resource provider IDs to
    `UpdateOperationStatus` and `UpdateOperationStatusMessage`. With that
    frameworks are able to reconcile enough information after failover to
    construct operation acknowledgements.
    
    We will add code to populate these fields in a follow-up patch.
    
    Review: https://reviews.apache.org/r/69162/
---
 include/mesos/mesos.proto    | 12 ++++++++++++
 include/mesos/v1/mesos.proto | 12 ++++++++++++
 src/common/type_utils.cpp    | 17 +++++++++++++++++
 src/internal/devolve.cpp     |  8 +++++++-
 src/internal/evolve.cpp      |  8 +++++++-
 5 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/include/mesos/mesos.proto b/include/mesos/mesos.proto
index 2df4dad..043a737 100644
--- a/include/mesos/mesos.proto
+++ b/include/mesos/mesos.proto
@@ -2426,6 +2426,18 @@ message OperationStatus {
   // include a `uuid`. The status is considered delivered once
   // it is acknowledged by the scheduler.
   optional UUID uuid = 5;
+
+  // If the operation affects resources from a local resource provider,
+  // both `slave_id` and `resource_provider_id` will be set.
+  //
+  // If the operation affects resources that belong to an external
+  // resource provider, only `resource_provider_id` will be set.
+  //
+  // In certain cases, e.g., invalid operations, neither `uuid`,
+  // `slave_id` nor `resource_provider_id` will be set, and the
+  // scheduler does not need to acknowledge this status update.
+  optional SlaveID slave_id = 6;
+  optional ResourceProviderID resource_provider_id = 7;
 }
 
 
diff --git a/include/mesos/v1/mesos.proto b/include/mesos/v1/mesos.proto
index 6a04193..9830c0b 100644
--- a/include/mesos/v1/mesos.proto
+++ b/include/mesos/v1/mesos.proto
@@ -2419,6 +2419,18 @@ message OperationStatus {
   // include a `uuid`. The status is considered delivered once
   // it is acknowledged by the scheduler.
   optional UUID uuid = 5;
+
+  // If the operation affects resources from a local resource provider,
+  // both `agent_id` and `resource_provider_id` will be set.
+  //
+  // If the operation affects resources that belong to an external
+  // resource provider, only `resource_provider_id` will be set.
+  //
+  // In certain cases, e.g., invalid operations, neither `uuid`,
+  // `slave_id` nor `resource_provider_id` will be set, and the
+  // scheduler does not need to acknowledge this status update.
+  optional AgentID agent_id = 6;
+  optional ResourceProviderID resource_provider_id = 7;
 }
 
 
diff --git a/src/common/type_utils.cpp b/src/common/type_utils.cpp
index 33d6380..1f0f4a4 100644
--- a/src/common/type_utils.cpp
+++ b/src/common/type_utils.cpp
@@ -488,6 +488,23 @@ bool operator==(const OperationStatus& left, const OperationStatus& right)
     return false;
   }
 
+  if (left.has_slave_id() != right.has_slave_id()) {
+    return false;
+  }
+
+  if (left.has_slave_id() && left.slave_id() != right.slave_id()) {
+    return false;
+  }
+
+  if (left.has_resource_provider_id() != right.has_resource_provider_id()) {
+    return false;
+  }
+
+  if (left.has_resource_provider_id() &&
+      left.resource_provider_id() != right.resource_provider_id()) {
+    return false;
+  }
+
   return true;
 }
 
diff --git a/src/internal/devolve.cpp b/src/internal/devolve.cpp
index a5d9ab9..3df04dc 100644
--- a/src/internal/devolve.cpp
+++ b/src/internal/devolve.cpp
@@ -106,7 +106,13 @@ Offer devolve(const v1::Offer& offer)
 
 OperationStatus devolve(const v1::OperationStatus& status)
 {
-  return devolve<OperationStatus>(status);
+  OperationStatus _status = devolve<OperationStatus>(status);
+
+  if (status.has_agent_id()) {
+    *_status.mutable_slave_id() = devolve<SlaveID>(status.agent_id());
+  }
+
+  return _status;
 }
 
 
diff --git a/src/internal/evolve.cpp b/src/internal/evolve.cpp
index 5ef5838..defda32 100644
--- a/src/internal/evolve.cpp
+++ b/src/internal/evolve.cpp
@@ -160,7 +160,13 @@ v1::OfferID evolve(const OfferID& offerId)
 
 v1::OperationStatus evolve(const OperationStatus& status)
 {
-  return evolve<v1::OperationStatus>(status);
+  v1::OperationStatus _status = evolve<v1::OperationStatus>(status);
+
+  if (status.has_slave_id()) {
+    *_status.mutable_agent_id() = evolve<v1::AgentID>(status.slave_id());
+  }
+
+  return _status;
 }
 
 


[mesos] 05/06: Set agent and/or resource provider ID in operation status updates.

Posted by ch...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 9fd65a6f8862e52a6d8c9ed970498f4a8db113f7
Author: Benjamin Bannier <be...@mesosphere.io>
AuthorDate: Wed Dec 5 13:03:09 2018 -0800

    Set agent and/or resource provider ID in operation status updates.
    
    This patch sets agent and/or resource provider ID operation status
    update messages. This is not always possible, e.g., some operations
    might fail validation so that no corresponding IDs can be extracted.
    
    Since operations failing validation are currently directly rejected by
    the master without going through a status update manager, they are not
    retried either. If a master status update manager for operations is
    introduced at a later point it should be possible to forward
    acknowledgements for updates to the master's update manager.
    
    Review: https://reviews.apache.org/r/69163/
---
 src/common/protobuf_utils.cpp              |   4 +-
 src/common/protobuf_utils.hpp              |   4 +-
 src/master/master.cpp                      |  74 +++++++++++++----
 src/resource_provider/manager.cpp          |   1 +
 src/resource_provider/storage/provider.cpp |  48 ++++++-----
 src/slave/slave.cpp                        |  41 +++++++--
 src/tests/master_tests.cpp                 | 129 +++++++++++++++++++++++++++++
 src/tests/mesos.hpp                        |   7 +-
 8 files changed, 259 insertions(+), 49 deletions(-)

diff --git a/src/common/protobuf_utils.cpp b/src/common/protobuf_utils.cpp
index 77139d8..14e582a 100644
--- a/src/common/protobuf_utils.cpp
+++ b/src/common/protobuf_utils.cpp
@@ -426,7 +426,9 @@ OperationStatus createOperationStatus(
     const Option<OperationID>& operationId,
     const Option<string>& message,
     const Option<Resources>& convertedResources,
-    const Option<id::UUID>& uuid)
+    const Option<id::UUID>& uuid,
+    const Option<SlaveID>& slaveId,
+    const Option<ResourceProviderID>& resourceProviderId)
 {
   OperationStatus status;
   status.set_state(state);
diff --git a/src/common/protobuf_utils.hpp b/src/common/protobuf_utils.hpp
index 1662125..ca27e1c 100644
--- a/src/common/protobuf_utils.hpp
+++ b/src/common/protobuf_utils.hpp
@@ -165,7 +165,9 @@ OperationStatus createOperationStatus(
     const Option<OperationID>& operationId = None(),
     const Option<std::string>& message = None(),
     const Option<Resources>& convertedResources = None(),
-    const Option<id::UUID>& statusUUID = None());
+    const Option<id::UUID>& statusUUID = None(),
+    const Option<SlaveID>& slaveId = None(),
+    const Option<ResourceProviderID>& resourceProviderId = None());
 
 
 Operation createOperation(
diff --git a/src/master/master.cpp b/src/master/master.cpp
index 14f0f12..fc29781 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -2275,6 +2275,11 @@ void Master::drop(
     scheduler::Event update;
     update.set_type(scheduler::Event::UPDATE_OPERATION_STATUS);
 
+    // NOTE: We do not attempt to set the agent or resource provider IDs for
+    // dropped operations as we cannot guarantee to always know their values.
+    //
+    // TODO(bbannier): Set agent or resource provider ID if we know
+    // for certain that the operation was valid.
     *update.mutable_update_operation_status()->mutable_status() =
       protobuf::createOperationStatus(
           OperationState::OPERATION_ERROR,
@@ -9146,6 +9151,11 @@ scheduler::Response::ReconcileOperations Master::reconcileOperations(
       slaveId = operation.slave_id();
     }
 
+    Option<ResourceProviderID> resourceProviderId = None();
+    if (operation.has_resource_provider_id()) {
+      resourceProviderId = operation.resource_provider_id();
+    }
+
     Option<Operation*> frameworkOperation =
       framework->getOperation(operation.operation_id());
 
@@ -9163,38 +9173,62 @@ scheduler::Response::ReconcileOperations Master::reconcileOperations(
       *status = protobuf::createOperationStatus(
           OperationState::OPERATION_RECOVERING,
           operation.operation_id(),
-          "Reconciliation: Agent is recovered but has not re-registered");
+          "Reconciliation: Agent is recovered but has not re-registered",
+          None(),
+          None(),
+          slaveId,
+          resourceProviderId);
     } else if (slaveId.isSome() && slaves.registered.contains(slaveId.get())) {
       // (3) Operation is unknown, slave is registered: OPERATION_UNKNOWN.
       *status = protobuf::createOperationStatus(
           OperationState::OPERATION_UNKNOWN,
           operation.operation_id(),
-          "Reconciliation: Operation is unknown");
+          "Reconciliation: Operation is unknown",
+          None(),
+          None(),
+          slaveId,
+          resourceProviderId);
     } else if (slaveId.isSome() && slaves.unreachable.contains(slaveId.get())) {
       // (4) Operation is unknown, slave is unreachable: OPERATION_UNREACHABLE.
       *status = protobuf::createOperationStatus(
           OperationState::OPERATION_UNREACHABLE,
           operation.operation_id(),
-          "Reconciliation: Agent is unreachable");
+          "Reconciliation: Agent is unreachable",
+          None(),
+          None(),
+          slaveId,
+          resourceProviderId);
     } else if (slaveId.isSome() && slaves.gone.contains(slaveId.get())) {
       // (5) Operation is unknown, slave is gone: OPERATION_GONE_BY_OPERATOR.
       *status = protobuf::createOperationStatus(
           OperationState::OPERATION_GONE_BY_OPERATOR,
           operation.operation_id(),
-          "Reconciliation: Agent marked gone by operator");
+          "Reconciliation: Agent marked gone by operator",
+          None(),
+          None(),
+          slaveId,
+          resourceProviderId);
     } else if (slaveId.isSome()) {
       // (6) Operation is unknown, slave is unknown: OPERATION_UNKNOWN.
       *status = protobuf::createOperationStatus(
           OperationState::OPERATION_UNKNOWN,
           operation.operation_id(),
-          "Reconciliation: Both operation and agent are unknown");
+          "Reconciliation: Both operation and agent are unknown",
+          None(),
+          None(),
+          slaveId,
+          resourceProviderId);
     } else {
       // (7) Operation is unknown, slave is unknown: OPERATION_UNKNOWN.
       *status = protobuf::createOperationStatus(
           OperationState::OPERATION_UNKNOWN,
           operation.operation_id(),
           "Reconciliation: Operation is unknown and no 'agent_id' was"
-          " provided");
+          " provided",
+          None(),
+          None(),
+          slaveId,
+          resourceProviderId);
     }
   }
 
@@ -11251,18 +11285,22 @@ void Master::_apply(
       ? slave->resourceVersion.get()
       : slave->resourceProviders.get(resourceProviderId.get())->resourceVersion;
 
-    Operation* operation = new Operation(
-        protobuf::createOperation(
-            operationInfo,
-            protobuf::createOperationStatus(
-              OPERATION_PENDING,
-              operationInfo.has_id()
-                ? operationInfo.id()
-                : Option<OperationID>::none()),
-            framework != nullptr
-              ? framework->id()
-              : Option<FrameworkID>::none(),
-            slave->id));
+    Operation* operation = new Operation(protobuf::createOperation(
+        operationInfo,
+        protobuf::createOperationStatus(
+            OPERATION_PENDING,
+            operationInfo.has_id()
+              ? operationInfo.id()
+              : Option<OperationID>::none(),
+            None(),
+            None(),
+            None(),
+            slave->id,
+            resourceProviderId.isSome()
+              ? Some(resourceProviderId.get())
+              : Option<ResourceProviderID>::none()),
+        framework != nullptr ? framework->id() : Option<FrameworkID>::none(),
+        slave->id));
 
     addOperation(framework, slave, operation);
 
diff --git a/src/resource_provider/manager.cpp b/src/resource_provider/manager.cpp
index abd7e38..bb72a5c 100644
--- a/src/resource_provider/manager.cpp
+++ b/src/resource_provider/manager.cpp
@@ -572,6 +572,7 @@ void ResourceProviderManagerProcess::reconcileOperations(
     if (operation.has_resource_provider_id()) {
       if (!resourceProviders.subscribed.contains(
               operation.resource_provider_id())) {
+        // TODO(bbannier): We should send `OPERATION_UNREACHABLE` here.
         LOG(WARNING) << "Dropping operation reconciliation message with"
                      << " operation_uuid " << operation.operation_uuid()
                      << " because resource provider "
diff --git a/src/resource_provider/storage/provider.cpp b/src/resource_provider/storage/provider.cpp
index e931300..93df4a2 100644
--- a/src/resource_provider/storage/provider.cpp
+++ b/src/resource_provider/storage/provider.cpp
@@ -1572,7 +1572,12 @@ void StorageLocalResourceProviderProcess::applyOperation(
       protobuf::createOperationStatus(
           OPERATION_PENDING,
           operation.info().has_id()
-            ? operation.info().id() : Option<OperationID>::none()),
+            ? operation.info().id() : Option<OperationID>::none(),
+          None(),
+          None(),
+          None(),
+          slaveId,
+          info.id()),
       frameworkId,
       slaveId,
       protobuf::createUUID(uuid.get()));
@@ -3072,17 +3077,19 @@ void StorageLocalResourceProviderProcess::dropOperation(
 
   UpdateOperationStatusMessage update =
     protobuf::createUpdateOperationStatusMessage(
-       protobuf::createUUID(operationUuid),
-       protobuf::createOperationStatus(
-           OPERATION_DROPPED,
-           operation.isSome() && operation->has_id()
-             ? operation->id() : Option<OperationID>::none(),
-           message,
-           None(),
-           id::UUID::random()),
-       None(),
-       frameworkId,
-       slaveId);
+        protobuf::createUUID(operationUuid),
+        protobuf::createOperationStatus(
+            OPERATION_DROPPED,
+            operation.isSome() && operation->has_id()
+              ? operation->id() : Option<OperationID>::none(),
+            message,
+            None(),
+            id::UUID::random(),
+            slaveId,
+            info.id()),
+        None(),
+        frameworkId,
+        slaveId);
 
   auto die = [=](const string& message) {
     LOG(ERROR)
@@ -3318,14 +3325,15 @@ Try<Nothing> StorageLocalResourceProviderProcess::updateOperationStatus(
     error = conversions.error();
   }
 
-  operation.mutable_latest_status()->CopyFrom(
-      protobuf::createOperationStatus(
-          error.isNone() ? OPERATION_FINISHED : OPERATION_FAILED,
-          operation.info().has_id()
-            ? operation.info().id() : Option<OperationID>::none(),
-          error.isNone() ? Option<string>::none() : error->message,
-          error.isNone() ? convertedResources : Option<Resources>::none(),
-          id::UUID::random()));
+  operation.mutable_latest_status()->CopyFrom(protobuf::createOperationStatus(
+      error.isNone() ? OPERATION_FINISHED : OPERATION_FAILED,
+      operation.info().has_id()
+        ? operation.info().id() : Option<OperationID>::none(),
+      error.isNone() ? Option<string>::none() : error->message,
+      error.isNone() ? convertedResources : Option<Resources>::none(),
+      id::UUID::random(),
+      slaveId,
+      info.id()));
 
   operation.add_statuses()->CopyFrom(operation.latest_status());
 
diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index 50a4729..1d5dad5 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -4364,13 +4364,20 @@ void Slave::applyOperation(const ApplyOperationMessage& message)
     return;
   }
 
-  Operation* operation = new Operation(
-      protobuf::createOperation(
-          message.operation_info(),
-          protobuf::createOperationStatus(OPERATION_PENDING, operationId),
-          frameworkId,
+  Operation* operation = new Operation(protobuf::createOperation(
+      message.operation_info(),
+      protobuf::createOperationStatus(
+          OPERATION_PENDING,
+          operationId,
+          None(),
+          None(),
+          None(),
           info.id(),
-          uuid));
+          resourceProviderId.isSome()
+            ? resourceProviderId.get() : Option<ResourceProviderID>::none()),
+      frameworkId,
+      info.id(),
+      uuid));
 
   addOperation(operation);
 
@@ -4401,7 +4408,15 @@ void Slave::applyOperation(const ApplyOperationMessage& message)
   UpdateOperationStatusMessage update =
     protobuf::createUpdateOperationStatusMessage(
         uuid,
-        protobuf::createOperationStatus(OPERATION_FINISHED, operationId),
+        protobuf::createOperationStatus(
+            OPERATION_FINISHED,
+            operationId,
+            None(),
+            None(),
+            None(),
+            info.id(),
+            resourceProviderId.isSome()
+              ? resourceProviderId.get() : Option<ResourceProviderID>::none()),
         None(),
         frameworkId,
         info.id());
@@ -4442,7 +4457,13 @@ void Slave::reconcileOperations(const ReconcileOperationsMessage& message)
       UpdateOperationStatusMessage update =
         protobuf::createUpdateOperationStatusMessage(
             operation.operation_uuid(),
-            protobuf::createOperationStatus(OPERATION_DROPPED),
+            protobuf::createOperationStatus(
+                OPERATION_DROPPED,
+                None(),
+                None(),
+                None(),
+                None(),
+                info.id()),
             None(),
             None(),
             info.id());
@@ -7794,6 +7815,10 @@ void Slave::handleResourceProviderMessage(
         message->updateOperationStatus->update;
 
       update.mutable_slave_id()->CopyFrom(info.id());
+      update.mutable_status()->mutable_slave_id()->CopyFrom(info.id());
+      if (update.has_latest_status()) {
+        update.mutable_latest_status()->mutable_slave_id()->CopyFrom(info.id());
+      }
 
       const UUID& operationUUID = update.operation_uuid();
 
diff --git a/src/tests/master_tests.cpp b/src/tests/master_tests.cpp
index 9d5d5a3..4467799 100644
--- a/src/tests/master_tests.cpp
+++ b/src/tests/master_tests.cpp
@@ -9030,6 +9030,135 @@ TEST_F(MasterTest, OperationUpdateDuringFailover)
 }
 
 
+// This test verifies that operation status updates contain the
+// agent ID and resource provider ID of originating providers.
+TEST_F(MasterTest, OperationUpdateResourceProvider)
+{
+  Clock::pause();
+
+  master::Flags masterFlags = CreateMasterFlags();
+
+  Try<Owned<cluster::Master>> master = StartMaster(masterFlags);
+  ASSERT_SOME(master);
+
+  slave::Flags slaveFlags = CreateSlaveFlags();
+
+  Future<UpdateSlaveMessage> updateSlaveMessage =
+    FUTURE_PROTOBUF(UpdateSlaveMessage(), _, _);
+
+  StandaloneMasterDetector detector(master.get()->pid);
+  Try<Owned<cluster::Slave>> slave = StartSlave(&detector, slaveFlags);
+  ASSERT_SOME(slave);
+
+  Clock::advance(slaveFlags.registration_backoff_factor);
+
+  AWAIT_READY(updateSlaveMessage);
+
+  // Register a resource provider with the agent.
+  mesos::v1::ResourceProviderInfo resourceProviderInfo;
+  resourceProviderInfo.set_type("org.apache.mesos.resource_provider.test");
+  resourceProviderInfo.set_name("test");
+
+  v1::Resources resourceProviderResources = v1::createDiskResource(
+      "200", "*", None(), None(), v1::createDiskSourceRaw(None(), "profile"));
+
+  v1::MockResourceProvider resourceProvider(
+      resourceProviderInfo, resourceProviderResources);
+
+  Owned<EndpointDetector> endpointDetector(
+      resource_provider::createEndpointDetector(slave.get()->pid));
+
+  updateSlaveMessage = FUTURE_PROTOBUF(UpdateSlaveMessage(), _, _);
+
+  resourceProvider.start(endpointDetector, ContentType::PROTOBUF);
+
+  AWAIT_READY(updateSlaveMessage);
+
+  const v1::AgentID agentId = evolve(updateSlaveMessage->slave_id());
+
+  ASSERT_TRUE(resourceProvider.info.has_id());
+  const v1::ResourceProviderID resourceProviderId = resourceProvider.info.id();
+
+  // Start a framework to operate on offers.
+  auto scheduler = std::make_shared<v1::MockHTTPScheduler>();
+
+  EXPECT_CALL(*scheduler, connected(_))
+    .WillOnce(v1::scheduler::SendSubscribe(v1::DEFAULT_FRAMEWORK_INFO));
+
+  v1::scheduler::TestMesos driver(
+      master.get()->pid, ContentType::PROTOBUF, scheduler);
+
+  Future<Event::Subscribed> subscribed;
+  EXPECT_CALL(*scheduler, subscribed(_, _))
+    .WillOnce(FutureArg<1>(&subscribed));
+
+  EXPECT_CALL(*scheduler, heartbeat(_))
+    .WillRepeatedly(Return()); // Ignore heartbeats.
+
+  Future<Event::Offers> offers;
+  EXPECT_CALL(*scheduler, offers(_, _))
+    .WillOnce(FutureArg<1>(&offers));
+
+  AWAIT_READY(subscribed);
+
+  const v1::FrameworkID& frameworkId = subscribed->framework_id();
+
+  Clock::settle();
+  Clock::advance(masterFlags.allocation_interval);
+
+  AWAIT_READY(offers);
+
+  Future<Event::UpdateOperationStatus> updateOperationStatus;
+  EXPECT_CALL(*scheduler, updateOperationStatus(_, _))
+    .WillOnce(FutureArg<1>(&updateOperationStatus));
+
+  ASSERT_FALSE(offers->offers().empty());
+  const v1::Offer& offer = offers->offers(0);
+
+  // Perform an operation against the resource provider resources.
+  Option<v1::Resource> resource;
+  foreach (const v1::Resource& resource_, offer.resources()) {
+    if (resource_.has_provider_id()) {
+      resource = resource_;
+      break;
+    }
+  }
+
+  ASSERT_SOME(resource);
+
+  {
+    Call call;
+    call.mutable_framework_id()->CopyFrom(frameworkId);
+    call.set_type(Call::ACCEPT);
+
+    Call::Accept* accept = call.mutable_accept();
+    accept->add_offer_ids()->CopyFrom(offer.id());
+
+    v1::Offer::Operation* operation = accept->add_operations();
+    operation->set_type(v1::Offer::Operation::CREATE_DISK);
+    operation->mutable_id()->set_value("create_disk");
+
+    v1::Offer::Operation::CreateDisk* createDisk =
+      operation->mutable_create_disk();
+    createDisk->mutable_source()->CopyFrom(resource.get());
+    createDisk->set_target_type(v1::Resource::DiskInfo::Source::MOUNT);
+
+    driver.send(call);
+  }
+
+  AWAIT_READY(updateOperationStatus);
+
+  const v1::OperationStatus& status = updateOperationStatus->status();
+  ASSERT_EQ("create_disk", status.operation_id().value());
+
+  ASSERT_TRUE(status.has_agent_id());
+  EXPECT_EQ(agentId, status.agent_id());
+
+  ASSERT_TRUE(status.has_resource_provider_id());
+  EXPECT_EQ(resourceProviderId, status.resource_provider_id());
+}
+
+
 // Tests that the master correctly drops an operation if the operation's 'id'
 // field is set and the operation affects resources not managed by a resource
 // provider.
diff --git a/src/tests/mesos.hpp b/src/tests/mesos.hpp
index e4a1ab4..60bd6e0 100644
--- a/src/tests/mesos.hpp
+++ b/src/tests/mesos.hpp
@@ -3203,7 +3203,12 @@ public:
         break;
     }
 
-    update->mutable_latest_status()->CopyFrom(update->status());
+    if (update->has_status()) {
+      update->mutable_status()->mutable_resource_provider_id()->CopyFrom(
+          info.id());
+
+      update->mutable_latest_status()->CopyFrom(update->status());
+    }
 
     driver->send(call);
   }


[mesos] 01/06: Renamed a function argument to not reuse member name.

Posted by ch...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 58548afa35ae8a4e8cf1efc3652712788d04fdf9
Author: Benjamin Bannier <bb...@apache.org>
AuthorDate: Wed Oct 24 09:59:03 2018 +0200

    Renamed a function argument to not reuse member name.
    
    While a function argument shadowing a member variable is perfectly
    legal, it is still confusing. In this patch we rename the function
    argument to remove one case of such shadowing.
    
    Review: https://reviews.apache.org/r/69161
---
 src/resource_provider/storage/provider.cpp | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/resource_provider/storage/provider.cpp b/src/resource_provider/storage/provider.cpp
index 2d8d08d..e931300 100644
--- a/src/resource_provider/storage/provider.cpp
+++ b/src/resource_provider/storage/provider.cpp
@@ -414,7 +414,7 @@ private:
   void dropOperation(
       const id::UUID& operationUuid,
       const Option<FrameworkID>& frameworkId,
-      const Option<Offer::Operation>& info,
+      const Option<Offer::Operation>& operation,
       const string& message);
 
   Future<vector<ResourceConversion>> applyCreateDisk(
@@ -3064,7 +3064,7 @@ Future<Nothing> StorageLocalResourceProviderProcess::_applyOperation(
 void StorageLocalResourceProviderProcess::dropOperation(
     const id::UUID& operationUuid,
     const Option<FrameworkID>& frameworkId,
-    const Option<Offer::Operation>& info,
+    const Option<Offer::Operation>& operation,
     const string& message)
 {
   LOG(WARNING)
@@ -3075,8 +3075,8 @@ void StorageLocalResourceProviderProcess::dropOperation(
        protobuf::createUUID(operationUuid),
        protobuf::createOperationStatus(
            OPERATION_DROPPED,
-           info.isSome() && info->has_id()
-             ? info->id() : Option<OperationID>::none(),
+           operation.isSome() && operation->has_id()
+             ? operation->id() : Option<OperationID>::none(),
            message,
            None(),
            id::UUID::random()),
@@ -3096,7 +3096,7 @@ void StorageLocalResourceProviderProcess::dropOperation(
     .onDiscarded(defer(self(), std::bind(die, "future discarded")));
 
   ++metrics.operations_dropped.at(
-      info.isSome() ? info->type() : Offer::Operation::UNKNOWN);
+      operation.isSome() ? operation->type() : Offer::Operation::UNKNOWN);
 }
 
 


[mesos] 06/06: Added MESOS-9293 to the 1.7.1 CHANGELOG.

Posted by ch...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit ce10fa8ab525054d4152d0273972bc7b2b5cb782
Author: Chun-Hung Hsiao <ch...@mesosphere.io>
AuthorDate: Wed Dec 5 13:10:13 2018 -0800

    Added MESOS-9293 to the 1.7.1 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 7996541..d530cb7 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -22,6 +22,7 @@ Release Notes - Mesos - Version 1.7.1 (WIP)
   * [MESOS-9279] - Docker Containerizer 'usage' call might be expensive if mount table is big.
   * [MESOS-9281] - SLRP gets a stale checkpoint after system crash.
   * [MESOS-9283] - Docker containerizer actor can get backlogged with large number of containers.
+  * [MESOS-9293] - If a framework looses operation information it cannot reconcile to acknowledge updates.
   * [MESOS-9295] - Nested container launch could fail if the agent upgrade with new cgroup subsystems.
   * [MESOS-9308] - URI disk profile adaptor could deadlock.
   * [MESOS-9317] - Some master endpoints do not handle failed authorization properly.