You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mesos.apache.org by ch...@apache.org on 2018/12/05 22:06:48 UTC
[mesos] branch 1.7.x updated (414805d -> ce10fa8)
This is an automated email from the ASF dual-hosted git repository.
chhsiao pushed a change to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.
from 414805d Added MESOS-9411 to 1.7.1 CHANGELOG.
new 58548af Renamed a function argument to not reuse member name.
new d3db86a Fixed handling for offer operation updates.
new 3c7c456 Made agent state consistent with forwarded updates.
new 58c7ff5 Added agent and resource provider IDs to operation status messages.
new 9fd65a6 Set agent and/or resource provider ID in operation status updates.
new ce10fa8 Added MESOS-9293 to the 1.7.1 CHANGELOG.
The 6 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
Summary of changes:
CHANGELOG | 1 +
include/mesos/mesos.proto | 12 +++
include/mesos/v1/mesos.proto | 12 +++
src/common/protobuf_utils.cpp | 4 +-
src/common/protobuf_utils.hpp | 4 +-
src/common/type_utils.cpp | 17 ++++
src/internal/devolve.cpp | 8 +-
src/internal/evolve.cpp | 8 +-
src/master/master.cpp | 74 +++++++++++++----
src/resource_provider/manager.cpp | 1 +
src/resource_provider/storage/provider.cpp | 54 +++++++-----
src/slave/slave.cpp | 63 +++++++++-----
src/tests/master_tests.cpp | 129 +++++++++++++++++++++++++++++
src/tests/mesos.hpp | 7 +-
14 files changed, 328 insertions(+), 66 deletions(-)
[mesos] 03/06: Made agent state consistent with forwarded updates.
Posted by ch...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git
commit 3c7c4568b1b1d75d43cdf71871599557c54c7656
Author: Benjamin Bannier <be...@mesosphere.io>
AuthorDate: Wed Dec 5 13:02:59 2018 -0800
Made agent state consistent with forwarded updates.
When the agent handles an `UpdateOperationStatusMessage` from a resource
provider, it injects its own ID which is (at least conceptually) unknown
to the resource provider before forwarding the message to the master,
and also updates its own tracking for the operation.
This patch makes sure that we first mutate the message before handing it
on for updating the internal operation tracking, while previously we
used the unmodified message. Always using the same message reduces error
potential if in future changes we e.g., introduce agent operation status
update managers.
Review: https://reviews.apache.org/r/69458/
---
src/slave/slave.cpp | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index 218ca83..50a4729 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -7787,9 +7787,14 @@ void Slave::handleResourceProviderMessage(
case ResourceProviderMessage::Type::UPDATE_OPERATION_STATUS: {
CHECK_SOME(message->updateOperationStatus);
- const UpdateOperationStatusMessage& update =
+ // The status update from the resource provider didn't provide
+ // the agent ID (because the resource provider doesn't know it),
+ // hence we inject it here.
+ UpdateOperationStatusMessage update =
message->updateOperationStatus->update;
+ update.mutable_slave_id()->CopyFrom(info.id());
+
const UUID& operationUUID = update.operation_uuid();
Operation* operation = getOperation(operationUUID);
@@ -7850,14 +7855,7 @@ void Slave::handleResourceProviderMessage(
? " for framework " + stringify(update.framework_id())
: " for an operator API call");
- // The status update from the resource provider didn't
- // provide the agent ID (because the resource provider
- // doesn't know it), hence we inject it here.
- UpdateOperationStatusMessage _update;
- _update.CopyFrom(update);
- _update.mutable_slave_id()->CopyFrom(info.id());
-
- send(master.get(), _update);
+ send(master.get(), update);
break;
}
}
[mesos] 02/06: Fixed handling for offer operation updates.
Posted by ch...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git
commit d3db86a085575189356c2099cc0a0030d724dc35
Author: Benjamin Bannier <be...@mesosphere.io>
AuthorDate: Wed Dec 5 13:01:30 2018 -0800
Fixed handling for offer operation updates.
The handling of offer operation updates introduced in `c946615ec6d`
made use of an update's `latest_status` without making sure that any
value was set. This could lead to situation where an uninitialized
enum value was switched on which would have caused a fatal error at
runtime.
This patch replaces uses of `latest_status` with `state` which does
contain the information we care about. We also adjust the error
logging so we log the value that lead to the error, not some other
value.
Review: https://reviews.apache.org/r/69157/
---
src/slave/slave.cpp | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index 7425cfb..218ca83 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -8011,7 +8011,7 @@ void Slave::updateOperation(
return;
}
- switch (update.latest_status().state()) {
+ switch (operation->latest_status().state()) {
// Terminal state, and the conversion is successful.
case OPERATION_FINISHED: {
apply(operation);
@@ -8032,8 +8032,8 @@ void Slave::updateOperation(
case OPERATION_GONE_BY_OPERATOR:
case OPERATION_RECOVERING:
case OPERATION_UNKNOWN: {
- LOG(FATAL) << "Unexpected operation state "
- << operation->latest_status().state();
+ LOG(FATAL)
+ << "Unexpected operation state " << operation->latest_status().state();
}
}
}
[mesos] 04/06: Added agent and resource provider IDs to operation
status messages.
Posted by ch...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git
commit 58c7ff57b70e43da479ef2b4ea7f635c2c4a8b73
Author: Benjamin Bannier <be...@mesosphere.io>
AuthorDate: Wed Dec 5 13:03:06 2018 -0800
Added agent and resource provider IDs to operation status messages.
This patch adds agent and resource provider IDs to
`UpdateOperationStatus` and `UpdateOperationStatusMessage`. With that
frameworks are able to reconcile enough information after failover to
construct operation acknowledgements.
We will add code to populate these fields in a follow-up patch.
Review: https://reviews.apache.org/r/69162/
---
include/mesos/mesos.proto | 12 ++++++++++++
include/mesos/v1/mesos.proto | 12 ++++++++++++
src/common/type_utils.cpp | 17 +++++++++++++++++
src/internal/devolve.cpp | 8 +++++++-
src/internal/evolve.cpp | 8 +++++++-
5 files changed, 55 insertions(+), 2 deletions(-)
diff --git a/include/mesos/mesos.proto b/include/mesos/mesos.proto
index 2df4dad..043a737 100644
--- a/include/mesos/mesos.proto
+++ b/include/mesos/mesos.proto
@@ -2426,6 +2426,18 @@ message OperationStatus {
// include a `uuid`. The status is considered delivered once
// it is acknowledged by the scheduler.
optional UUID uuid = 5;
+
+ // If the operation affects resources from a local resource provider,
+ // both `slave_id` and `resource_provider_id` will be set.
+ //
+ // If the operation affects resources that belong to an external
+ // resource provider, only `resource_provider_id` will be set.
+ //
+ // In certain cases, e.g., invalid operations, neither `uuid`,
+ // `slave_id` nor `resource_provider_id` will be set, and the
+ // scheduler does not need to acknowledge this status update.
+ optional SlaveID slave_id = 6;
+ optional ResourceProviderID resource_provider_id = 7;
}
diff --git a/include/mesos/v1/mesos.proto b/include/mesos/v1/mesos.proto
index 6a04193..9830c0b 100644
--- a/include/mesos/v1/mesos.proto
+++ b/include/mesos/v1/mesos.proto
@@ -2419,6 +2419,18 @@ message OperationStatus {
// include a `uuid`. The status is considered delivered once
// it is acknowledged by the scheduler.
optional UUID uuid = 5;
+
+ // If the operation affects resources from a local resource provider,
+ // both `agent_id` and `resource_provider_id` will be set.
+ //
+ // If the operation affects resources that belong to an external
+ // resource provider, only `resource_provider_id` will be set.
+ //
+ // In certain cases, e.g., invalid operations, neither `uuid`,
+ // `slave_id` nor `resource_provider_id` will be set, and the
+ // scheduler does not need to acknowledge this status update.
+ optional AgentID agent_id = 6;
+ optional ResourceProviderID resource_provider_id = 7;
}
diff --git a/src/common/type_utils.cpp b/src/common/type_utils.cpp
index 33d6380..1f0f4a4 100644
--- a/src/common/type_utils.cpp
+++ b/src/common/type_utils.cpp
@@ -488,6 +488,23 @@ bool operator==(const OperationStatus& left, const OperationStatus& right)
return false;
}
+ if (left.has_slave_id() != right.has_slave_id()) {
+ return false;
+ }
+
+ if (left.has_slave_id() && left.slave_id() != right.slave_id()) {
+ return false;
+ }
+
+ if (left.has_resource_provider_id() != right.has_resource_provider_id()) {
+ return false;
+ }
+
+ if (left.has_resource_provider_id() &&
+ left.resource_provider_id() != right.resource_provider_id()) {
+ return false;
+ }
+
return true;
}
diff --git a/src/internal/devolve.cpp b/src/internal/devolve.cpp
index a5d9ab9..3df04dc 100644
--- a/src/internal/devolve.cpp
+++ b/src/internal/devolve.cpp
@@ -106,7 +106,13 @@ Offer devolve(const v1::Offer& offer)
OperationStatus devolve(const v1::OperationStatus& status)
{
- return devolve<OperationStatus>(status);
+ OperationStatus _status = devolve<OperationStatus>(status);
+
+ if (status.has_agent_id()) {
+ *_status.mutable_slave_id() = devolve<SlaveID>(status.agent_id());
+ }
+
+ return _status;
}
diff --git a/src/internal/evolve.cpp b/src/internal/evolve.cpp
index 5ef5838..defda32 100644
--- a/src/internal/evolve.cpp
+++ b/src/internal/evolve.cpp
@@ -160,7 +160,13 @@ v1::OfferID evolve(const OfferID& offerId)
v1::OperationStatus evolve(const OperationStatus& status)
{
- return evolve<v1::OperationStatus>(status);
+ v1::OperationStatus _status = evolve<v1::OperationStatus>(status);
+
+ if (status.has_slave_id()) {
+ *_status.mutable_agent_id() = evolve<v1::AgentID>(status.slave_id());
+ }
+
+ return _status;
}
[mesos] 05/06: Set agent and/or resource provider ID in operation
status updates.
Posted by ch...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git
commit 9fd65a6f8862e52a6d8c9ed970498f4a8db113f7
Author: Benjamin Bannier <be...@mesosphere.io>
AuthorDate: Wed Dec 5 13:03:09 2018 -0800
Set agent and/or resource provider ID in operation status updates.
This patch sets agent and/or resource provider ID operation status
update messages. This is not always possible, e.g., some operations
might fail validation so that no corresponding IDs can be extracted.
Since operations failing validation are currently directly rejected by
the master without going through a status update manager, they are not
retried either. If a master status update manager for operations is
introduced at a later point it should be possible to forward
acknowledgements for updates to the master's update manager.
Review: https://reviews.apache.org/r/69163/
---
src/common/protobuf_utils.cpp | 4 +-
src/common/protobuf_utils.hpp | 4 +-
src/master/master.cpp | 74 +++++++++++++----
src/resource_provider/manager.cpp | 1 +
src/resource_provider/storage/provider.cpp | 48 ++++++-----
src/slave/slave.cpp | 41 +++++++--
src/tests/master_tests.cpp | 129 +++++++++++++++++++++++++++++
src/tests/mesos.hpp | 7 +-
8 files changed, 259 insertions(+), 49 deletions(-)
diff --git a/src/common/protobuf_utils.cpp b/src/common/protobuf_utils.cpp
index 77139d8..14e582a 100644
--- a/src/common/protobuf_utils.cpp
+++ b/src/common/protobuf_utils.cpp
@@ -426,7 +426,9 @@ OperationStatus createOperationStatus(
const Option<OperationID>& operationId,
const Option<string>& message,
const Option<Resources>& convertedResources,
- const Option<id::UUID>& uuid)
+ const Option<id::UUID>& uuid,
+ const Option<SlaveID>& slaveId,
+ const Option<ResourceProviderID>& resourceProviderId)
{
OperationStatus status;
status.set_state(state);
diff --git a/src/common/protobuf_utils.hpp b/src/common/protobuf_utils.hpp
index 1662125..ca27e1c 100644
--- a/src/common/protobuf_utils.hpp
+++ b/src/common/protobuf_utils.hpp
@@ -165,7 +165,9 @@ OperationStatus createOperationStatus(
const Option<OperationID>& operationId = None(),
const Option<std::string>& message = None(),
const Option<Resources>& convertedResources = None(),
- const Option<id::UUID>& statusUUID = None());
+ const Option<id::UUID>& statusUUID = None(),
+ const Option<SlaveID>& slaveId = None(),
+ const Option<ResourceProviderID>& resourceProviderId = None());
Operation createOperation(
diff --git a/src/master/master.cpp b/src/master/master.cpp
index 14f0f12..fc29781 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -2275,6 +2275,11 @@ void Master::drop(
scheduler::Event update;
update.set_type(scheduler::Event::UPDATE_OPERATION_STATUS);
+ // NOTE: We do not attempt to set the agent or resource provider IDs for
+ // dropped operations as we cannot guarantee to always know their values.
+ //
+ // TODO(bbannier): Set agent or resource provider ID if we know
+ // for certain that the operation was valid.
*update.mutable_update_operation_status()->mutable_status() =
protobuf::createOperationStatus(
OperationState::OPERATION_ERROR,
@@ -9146,6 +9151,11 @@ scheduler::Response::ReconcileOperations Master::reconcileOperations(
slaveId = operation.slave_id();
}
+ Option<ResourceProviderID> resourceProviderId = None();
+ if (operation.has_resource_provider_id()) {
+ resourceProviderId = operation.resource_provider_id();
+ }
+
Option<Operation*> frameworkOperation =
framework->getOperation(operation.operation_id());
@@ -9163,38 +9173,62 @@ scheduler::Response::ReconcileOperations Master::reconcileOperations(
*status = protobuf::createOperationStatus(
OperationState::OPERATION_RECOVERING,
operation.operation_id(),
- "Reconciliation: Agent is recovered but has not re-registered");
+ "Reconciliation: Agent is recovered but has not re-registered",
+ None(),
+ None(),
+ slaveId,
+ resourceProviderId);
} else if (slaveId.isSome() && slaves.registered.contains(slaveId.get())) {
// (3) Operation is unknown, slave is registered: OPERATION_UNKNOWN.
*status = protobuf::createOperationStatus(
OperationState::OPERATION_UNKNOWN,
operation.operation_id(),
- "Reconciliation: Operation is unknown");
+ "Reconciliation: Operation is unknown",
+ None(),
+ None(),
+ slaveId,
+ resourceProviderId);
} else if (slaveId.isSome() && slaves.unreachable.contains(slaveId.get())) {
// (4) Operation is unknown, slave is unreachable: OPERATION_UNREACHABLE.
*status = protobuf::createOperationStatus(
OperationState::OPERATION_UNREACHABLE,
operation.operation_id(),
- "Reconciliation: Agent is unreachable");
+ "Reconciliation: Agent is unreachable",
+ None(),
+ None(),
+ slaveId,
+ resourceProviderId);
} else if (slaveId.isSome() && slaves.gone.contains(slaveId.get())) {
// (5) Operation is unknown, slave is gone: OPERATION_GONE_BY_OPERATOR.
*status = protobuf::createOperationStatus(
OperationState::OPERATION_GONE_BY_OPERATOR,
operation.operation_id(),
- "Reconciliation: Agent marked gone by operator");
+ "Reconciliation: Agent marked gone by operator",
+ None(),
+ None(),
+ slaveId,
+ resourceProviderId);
} else if (slaveId.isSome()) {
// (6) Operation is unknown, slave is unknown: OPERATION_UNKNOWN.
*status = protobuf::createOperationStatus(
OperationState::OPERATION_UNKNOWN,
operation.operation_id(),
- "Reconciliation: Both operation and agent are unknown");
+ "Reconciliation: Both operation and agent are unknown",
+ None(),
+ None(),
+ slaveId,
+ resourceProviderId);
} else {
// (7) Operation is unknown, slave is unknown: OPERATION_UNKNOWN.
*status = protobuf::createOperationStatus(
OperationState::OPERATION_UNKNOWN,
operation.operation_id(),
"Reconciliation: Operation is unknown and no 'agent_id' was"
- " provided");
+ " provided",
+ None(),
+ None(),
+ slaveId,
+ resourceProviderId);
}
}
@@ -11251,18 +11285,22 @@ void Master::_apply(
? slave->resourceVersion.get()
: slave->resourceProviders.get(resourceProviderId.get())->resourceVersion;
- Operation* operation = new Operation(
- protobuf::createOperation(
- operationInfo,
- protobuf::createOperationStatus(
- OPERATION_PENDING,
- operationInfo.has_id()
- ? operationInfo.id()
- : Option<OperationID>::none()),
- framework != nullptr
- ? framework->id()
- : Option<FrameworkID>::none(),
- slave->id));
+ Operation* operation = new Operation(protobuf::createOperation(
+ operationInfo,
+ protobuf::createOperationStatus(
+ OPERATION_PENDING,
+ operationInfo.has_id()
+ ? operationInfo.id()
+ : Option<OperationID>::none(),
+ None(),
+ None(),
+ None(),
+ slave->id,
+ resourceProviderId.isSome()
+ ? Some(resourceProviderId.get())
+ : Option<ResourceProviderID>::none()),
+ framework != nullptr ? framework->id() : Option<FrameworkID>::none(),
+ slave->id));
addOperation(framework, slave, operation);
diff --git a/src/resource_provider/manager.cpp b/src/resource_provider/manager.cpp
index abd7e38..bb72a5c 100644
--- a/src/resource_provider/manager.cpp
+++ b/src/resource_provider/manager.cpp
@@ -572,6 +572,7 @@ void ResourceProviderManagerProcess::reconcileOperations(
if (operation.has_resource_provider_id()) {
if (!resourceProviders.subscribed.contains(
operation.resource_provider_id())) {
+ // TODO(bbannier): We should send `OPERATION_UNREACHABLE` here.
LOG(WARNING) << "Dropping operation reconciliation message with"
<< " operation_uuid " << operation.operation_uuid()
<< " because resource provider "
diff --git a/src/resource_provider/storage/provider.cpp b/src/resource_provider/storage/provider.cpp
index e931300..93df4a2 100644
--- a/src/resource_provider/storage/provider.cpp
+++ b/src/resource_provider/storage/provider.cpp
@@ -1572,7 +1572,12 @@ void StorageLocalResourceProviderProcess::applyOperation(
protobuf::createOperationStatus(
OPERATION_PENDING,
operation.info().has_id()
- ? operation.info().id() : Option<OperationID>::none()),
+ ? operation.info().id() : Option<OperationID>::none(),
+ None(),
+ None(),
+ None(),
+ slaveId,
+ info.id()),
frameworkId,
slaveId,
protobuf::createUUID(uuid.get()));
@@ -3072,17 +3077,19 @@ void StorageLocalResourceProviderProcess::dropOperation(
UpdateOperationStatusMessage update =
protobuf::createUpdateOperationStatusMessage(
- protobuf::createUUID(operationUuid),
- protobuf::createOperationStatus(
- OPERATION_DROPPED,
- operation.isSome() && operation->has_id()
- ? operation->id() : Option<OperationID>::none(),
- message,
- None(),
- id::UUID::random()),
- None(),
- frameworkId,
- slaveId);
+ protobuf::createUUID(operationUuid),
+ protobuf::createOperationStatus(
+ OPERATION_DROPPED,
+ operation.isSome() && operation->has_id()
+ ? operation->id() : Option<OperationID>::none(),
+ message,
+ None(),
+ id::UUID::random(),
+ slaveId,
+ info.id()),
+ None(),
+ frameworkId,
+ slaveId);
auto die = [=](const string& message) {
LOG(ERROR)
@@ -3318,14 +3325,15 @@ Try<Nothing> StorageLocalResourceProviderProcess::updateOperationStatus(
error = conversions.error();
}
- operation.mutable_latest_status()->CopyFrom(
- protobuf::createOperationStatus(
- error.isNone() ? OPERATION_FINISHED : OPERATION_FAILED,
- operation.info().has_id()
- ? operation.info().id() : Option<OperationID>::none(),
- error.isNone() ? Option<string>::none() : error->message,
- error.isNone() ? convertedResources : Option<Resources>::none(),
- id::UUID::random()));
+ operation.mutable_latest_status()->CopyFrom(protobuf::createOperationStatus(
+ error.isNone() ? OPERATION_FINISHED : OPERATION_FAILED,
+ operation.info().has_id()
+ ? operation.info().id() : Option<OperationID>::none(),
+ error.isNone() ? Option<string>::none() : error->message,
+ error.isNone() ? convertedResources : Option<Resources>::none(),
+ id::UUID::random(),
+ slaveId,
+ info.id()));
operation.add_statuses()->CopyFrom(operation.latest_status());
diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index 50a4729..1d5dad5 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -4364,13 +4364,20 @@ void Slave::applyOperation(const ApplyOperationMessage& message)
return;
}
- Operation* operation = new Operation(
- protobuf::createOperation(
- message.operation_info(),
- protobuf::createOperationStatus(OPERATION_PENDING, operationId),
- frameworkId,
+ Operation* operation = new Operation(protobuf::createOperation(
+ message.operation_info(),
+ protobuf::createOperationStatus(
+ OPERATION_PENDING,
+ operationId,
+ None(),
+ None(),
+ None(),
info.id(),
- uuid));
+ resourceProviderId.isSome()
+ ? resourceProviderId.get() : Option<ResourceProviderID>::none()),
+ frameworkId,
+ info.id(),
+ uuid));
addOperation(operation);
@@ -4401,7 +4408,15 @@ void Slave::applyOperation(const ApplyOperationMessage& message)
UpdateOperationStatusMessage update =
protobuf::createUpdateOperationStatusMessage(
uuid,
- protobuf::createOperationStatus(OPERATION_FINISHED, operationId),
+ protobuf::createOperationStatus(
+ OPERATION_FINISHED,
+ operationId,
+ None(),
+ None(),
+ None(),
+ info.id(),
+ resourceProviderId.isSome()
+ ? resourceProviderId.get() : Option<ResourceProviderID>::none()),
None(),
frameworkId,
info.id());
@@ -4442,7 +4457,13 @@ void Slave::reconcileOperations(const ReconcileOperationsMessage& message)
UpdateOperationStatusMessage update =
protobuf::createUpdateOperationStatusMessage(
operation.operation_uuid(),
- protobuf::createOperationStatus(OPERATION_DROPPED),
+ protobuf::createOperationStatus(
+ OPERATION_DROPPED,
+ None(),
+ None(),
+ None(),
+ None(),
+ info.id()),
None(),
None(),
info.id());
@@ -7794,6 +7815,10 @@ void Slave::handleResourceProviderMessage(
message->updateOperationStatus->update;
update.mutable_slave_id()->CopyFrom(info.id());
+ update.mutable_status()->mutable_slave_id()->CopyFrom(info.id());
+ if (update.has_latest_status()) {
+ update.mutable_latest_status()->mutable_slave_id()->CopyFrom(info.id());
+ }
const UUID& operationUUID = update.operation_uuid();
diff --git a/src/tests/master_tests.cpp b/src/tests/master_tests.cpp
index 9d5d5a3..4467799 100644
--- a/src/tests/master_tests.cpp
+++ b/src/tests/master_tests.cpp
@@ -9030,6 +9030,135 @@ TEST_F(MasterTest, OperationUpdateDuringFailover)
}
+// This test verifies that operation status updates contain the
+// agent ID and resource provider ID of originating providers.
+TEST_F(MasterTest, OperationUpdateResourceProvider)
+{
+ Clock::pause();
+
+ master::Flags masterFlags = CreateMasterFlags();
+
+ Try<Owned<cluster::Master>> master = StartMaster(masterFlags);
+ ASSERT_SOME(master);
+
+ slave::Flags slaveFlags = CreateSlaveFlags();
+
+ Future<UpdateSlaveMessage> updateSlaveMessage =
+ FUTURE_PROTOBUF(UpdateSlaveMessage(), _, _);
+
+ StandaloneMasterDetector detector(master.get()->pid);
+ Try<Owned<cluster::Slave>> slave = StartSlave(&detector, slaveFlags);
+ ASSERT_SOME(slave);
+
+ Clock::advance(slaveFlags.registration_backoff_factor);
+
+ AWAIT_READY(updateSlaveMessage);
+
+ // Register a resource provider with the agent.
+ mesos::v1::ResourceProviderInfo resourceProviderInfo;
+ resourceProviderInfo.set_type("org.apache.mesos.resource_provider.test");
+ resourceProviderInfo.set_name("test");
+
+ v1::Resources resourceProviderResources = v1::createDiskResource(
+ "200", "*", None(), None(), v1::createDiskSourceRaw(None(), "profile"));
+
+ v1::MockResourceProvider resourceProvider(
+ resourceProviderInfo, resourceProviderResources);
+
+ Owned<EndpointDetector> endpointDetector(
+ resource_provider::createEndpointDetector(slave.get()->pid));
+
+ updateSlaveMessage = FUTURE_PROTOBUF(UpdateSlaveMessage(), _, _);
+
+ resourceProvider.start(endpointDetector, ContentType::PROTOBUF);
+
+ AWAIT_READY(updateSlaveMessage);
+
+ const v1::AgentID agentId = evolve(updateSlaveMessage->slave_id());
+
+ ASSERT_TRUE(resourceProvider.info.has_id());
+ const v1::ResourceProviderID resourceProviderId = resourceProvider.info.id();
+
+ // Start a framework to operate on offers.
+ auto scheduler = std::make_shared<v1::MockHTTPScheduler>();
+
+ EXPECT_CALL(*scheduler, connected(_))
+ .WillOnce(v1::scheduler::SendSubscribe(v1::DEFAULT_FRAMEWORK_INFO));
+
+ v1::scheduler::TestMesos driver(
+ master.get()->pid, ContentType::PROTOBUF, scheduler);
+
+ Future<Event::Subscribed> subscribed;
+ EXPECT_CALL(*scheduler, subscribed(_, _))
+ .WillOnce(FutureArg<1>(&subscribed));
+
+ EXPECT_CALL(*scheduler, heartbeat(_))
+ .WillRepeatedly(Return()); // Ignore heartbeats.
+
+ Future<Event::Offers> offers;
+ EXPECT_CALL(*scheduler, offers(_, _))
+ .WillOnce(FutureArg<1>(&offers));
+
+ AWAIT_READY(subscribed);
+
+ const v1::FrameworkID& frameworkId = subscribed->framework_id();
+
+ Clock::settle();
+ Clock::advance(masterFlags.allocation_interval);
+
+ AWAIT_READY(offers);
+
+ Future<Event::UpdateOperationStatus> updateOperationStatus;
+ EXPECT_CALL(*scheduler, updateOperationStatus(_, _))
+ .WillOnce(FutureArg<1>(&updateOperationStatus));
+
+ ASSERT_FALSE(offers->offers().empty());
+ const v1::Offer& offer = offers->offers(0);
+
+ // Perform an operation against the resource provider resources.
+ Option<v1::Resource> resource;
+ foreach (const v1::Resource& resource_, offer.resources()) {
+ if (resource_.has_provider_id()) {
+ resource = resource_;
+ break;
+ }
+ }
+
+ ASSERT_SOME(resource);
+
+ {
+ Call call;
+ call.mutable_framework_id()->CopyFrom(frameworkId);
+ call.set_type(Call::ACCEPT);
+
+ Call::Accept* accept = call.mutable_accept();
+ accept->add_offer_ids()->CopyFrom(offer.id());
+
+ v1::Offer::Operation* operation = accept->add_operations();
+ operation->set_type(v1::Offer::Operation::CREATE_DISK);
+ operation->mutable_id()->set_value("create_disk");
+
+ v1::Offer::Operation::CreateDisk* createDisk =
+ operation->mutable_create_disk();
+ createDisk->mutable_source()->CopyFrom(resource.get());
+ createDisk->set_target_type(v1::Resource::DiskInfo::Source::MOUNT);
+
+ driver.send(call);
+ }
+
+ AWAIT_READY(updateOperationStatus);
+
+ const v1::OperationStatus& status = updateOperationStatus->status();
+ ASSERT_EQ("create_disk", status.operation_id().value());
+
+ ASSERT_TRUE(status.has_agent_id());
+ EXPECT_EQ(agentId, status.agent_id());
+
+ ASSERT_TRUE(status.has_resource_provider_id());
+ EXPECT_EQ(resourceProviderId, status.resource_provider_id());
+}
+
+
// Tests that the master correctly drops an operation if the operation's 'id'
// field is set and the operation affects resources not managed by a resource
// provider.
diff --git a/src/tests/mesos.hpp b/src/tests/mesos.hpp
index e4a1ab4..60bd6e0 100644
--- a/src/tests/mesos.hpp
+++ b/src/tests/mesos.hpp
@@ -3203,7 +3203,12 @@ public:
break;
}
- update->mutable_latest_status()->CopyFrom(update->status());
+ if (update->has_status()) {
+ update->mutable_status()->mutable_resource_provider_id()->CopyFrom(
+ info.id());
+
+ update->mutable_latest_status()->CopyFrom(update->status());
+ }
driver->send(call);
}
[mesos] 01/06: Renamed a function argument to not reuse member name.
Posted by ch...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git
commit 58548afa35ae8a4e8cf1efc3652712788d04fdf9
Author: Benjamin Bannier <bb...@apache.org>
AuthorDate: Wed Oct 24 09:59:03 2018 +0200
Renamed a function argument to not reuse member name.
While a function argument shadowing a member variable is perfectly
legal, it is still confusing. In this patch we rename the function
argument to remove one case of such shadowing.
Review: https://reviews.apache.org/r/69161
---
src/resource_provider/storage/provider.cpp | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/src/resource_provider/storage/provider.cpp b/src/resource_provider/storage/provider.cpp
index 2d8d08d..e931300 100644
--- a/src/resource_provider/storage/provider.cpp
+++ b/src/resource_provider/storage/provider.cpp
@@ -414,7 +414,7 @@ private:
void dropOperation(
const id::UUID& operationUuid,
const Option<FrameworkID>& frameworkId,
- const Option<Offer::Operation>& info,
+ const Option<Offer::Operation>& operation,
const string& message);
Future<vector<ResourceConversion>> applyCreateDisk(
@@ -3064,7 +3064,7 @@ Future<Nothing> StorageLocalResourceProviderProcess::_applyOperation(
void StorageLocalResourceProviderProcess::dropOperation(
const id::UUID& operationUuid,
const Option<FrameworkID>& frameworkId,
- const Option<Offer::Operation>& info,
+ const Option<Offer::Operation>& operation,
const string& message)
{
LOG(WARNING)
@@ -3075,8 +3075,8 @@ void StorageLocalResourceProviderProcess::dropOperation(
protobuf::createUUID(operationUuid),
protobuf::createOperationStatus(
OPERATION_DROPPED,
- info.isSome() && info->has_id()
- ? info->id() : Option<OperationID>::none(),
+ operation.isSome() && operation->has_id()
+ ? operation->id() : Option<OperationID>::none(),
message,
None(),
id::UUID::random()),
@@ -3096,7 +3096,7 @@ void StorageLocalResourceProviderProcess::dropOperation(
.onDiscarded(defer(self(), std::bind(die, "future discarded")));
++metrics.operations_dropped.at(
- info.isSome() ? info->type() : Offer::Operation::UNKNOWN);
+ operation.isSome() ? operation->type() : Offer::Operation::UNKNOWN);
}
[mesos] 06/06: Added MESOS-9293 to the 1.7.1 CHANGELOG.
Posted by ch...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git
commit ce10fa8ab525054d4152d0273972bc7b2b5cb782
Author: Chun-Hung Hsiao <ch...@mesosphere.io>
AuthorDate: Wed Dec 5 13:10:13 2018 -0800
Added MESOS-9293 to the 1.7.1 CHANGELOG.
---
CHANGELOG | 1 +
1 file changed, 1 insertion(+)
diff --git a/CHANGELOG b/CHANGELOG
index 7996541..d530cb7 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -22,6 +22,7 @@ Release Notes - Mesos - Version 1.7.1 (WIP)
* [MESOS-9279] - Docker Containerizer 'usage' call might be expensive if mount table is big.
* [MESOS-9281] - SLRP gets a stale checkpoint after system crash.
* [MESOS-9283] - Docker containerizer actor can get backlogged with large number of containers.
+ * [MESOS-9293] - If a framework looses operation information it cannot reconcile to acknowledge updates.
* [MESOS-9295] - Nested container launch could fail if the agent upgrade with new cgroup subsystems.
* [MESOS-9308] - URI disk profile adaptor could deadlock.
* [MESOS-9317] - Some master endpoints do not handle failed authorization properly.