Posted to issues@mesos.apache.org by "Michael Park (JIRA)" <ji...@apache.org> on 2015/12/04 10:06:11 UTC
[jira] [Commented] (MESOS-4067)
ReservationTest.ACLMultipleOperations is flaky
[ https://issues.apache.org/jira/browse/MESOS-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041306#comment-15041306 ]
Michael Park commented on MESOS-4067:
-------------------------------------
I was able to identify one issue (I'm not sure whether there are others, or whether the subsequent failures all stem from this one):
{code}
// Attempt to unreserve an invalid set of resources (not dynamically
// reserved), reserve the second set, and launch a task.
driver.acceptOffers({offer.id()},
    {UNRESERVE(unreserved1),
     RESERVE(dynamicallyReserved2),
     LAUNCH({taskInfo})},
    filters);

// Wait for TASK_FINISHED update ack.
AWAIT_READY(statusUpdateAcknowledgement);
EXPECT_EQ(TASK_FINISHED, statusUpdateAcknowledgement.get().state());

// In the next offer, expect to find both sets of reserved
// resources, since the Unreserve operation should fail.
AWAIT_READY(offers);
ASSERT_EQ(1u, offers.get().size());
offer = offers.get()[0];

EXPECT_TRUE(
    Resources(offer.resources()).contains(
        dynamicallyReserved1 +
        dynamicallyReserved2 +
        unreserved2));
{code}
The intention here seems to be: perform an {{acceptOffers}} call with a sequence of operations that includes launching a task, wait until the task has finished and its resources have been recovered, then expect all of the available resources to arrive in a single offer.
The issue is that with our {{allocation_interval}} set to 50ms, the allocator can make an offer with the currently available resources while the task is still being launched or running. This premature offer is picked up by our {{EXPECT_CALL}} for {{resourceOffers}}, so the offer we inspect does not contain {{dynamicallyReserved1 + dynamicallyReserved2 + unreserved2}} and the expectation fails.
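To make the race concrete, here is a minimal, self-contained sketch. It is a toy model and not Mesos code: {{ResourceSet}}, {{FirstOfferCapture}} and {{firstOfferSeen}} are hypothetical names, the allocator is reduced to a fixed-interval tick, and it assumes (for illustration only) that the task occupies {{dynamicallyReserved2}} while it runs.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a set of resources; not the Mesos Resources class.
using ResourceSet = std::set<std::string>;

// Mimics a one-shot capture in the spirit of FutureArg<1>(offers):
// only the first offer delivered is stored, later ones are ignored.
struct FirstOfferCapture {
  bool hasOffer = false;
  ResourceSet offer;

  void onOffer(const ResourceSet& incoming) {
    if (!hasOffer) {
      offer = incoming;
      hasOffer = true;
    }
  }
};

// True if every resource in `expected` is present in `offer`.
bool containsAll(const ResourceSet& offer, const ResourceSet& expected) {
  return std::includes(offer.begin(), offer.end(),
                       expected.begin(), expected.end());
}

// Toy allocator: every `allocationIntervalMs` it offers whatever is free.
// Assumption: the task holds dynamicallyReserved2 until `taskDurationMs`.
ResourceSet firstOfferSeen(int allocationIntervalMs, int taskDurationMs) {
  assert(allocationIntervalMs > 0 && taskDurationMs >= 0);

  const ResourceSet idle = {"dynamicallyReserved1", "unreserved2"};
  const ResourceSet heldByTask = {"dynamicallyReserved2"};

  FirstOfferCapture capture;
  const int horizonMs = taskDurationMs + allocationIntervalMs;

  for (int now = allocationIntervalMs; now <= horizonMs;
       now += allocationIntervalMs) {
    ResourceSet available = idle;
    if (now >= taskDurationMs) {
      // The task has finished, so its resources are free again.
      available.insert(heldByTask.begin(), heldByTask.end());
    }
    capture.onOffer(available);
  }

  return capture.offer;
}
```

In this model, {{firstOfferSeen(50, 200)}} returns an offer without {{dynamicallyReserved2}}, because the first tick fires while the task is still running and the later, complete offer is ignored by the one-shot capture. When the interval is longer than the task, the first offer is the complete one.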
A few possible approaches, in my order of preference:
# We may not need all of these moving parts; we could possibly use just one set of resources instead of three. Refer to {{ReservationTest.ReserveAndLaunchThenUnreserve}} for an example.
# Effectively turn periodic allocation off (e.g. {{allocation_interval=1000s}}) and use {{reviveOffers}} to control the offers manually. Refer to {{ReservationEndpointsTest.ReserveAvailableAndOfferedResources}} for an example.
# Instead of a simple {{FutureArg<1>(offers)}} as the action for the {{EXPECT_CALL}} on {{resourceOffers}}, perhaps we can aggregate the offers. This one feels like it could get pretty tricky.
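As a rough illustration of the third approach, the sketch below accumulates resources across several offers until an expected set has been seen. It is a toy model, not a Mesos or gmock API: {{ResourceSet}} and {{OfferAggregator}} are hypothetical names, and resources are reduced to string labels.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for a set of resources; not the Mesos Resources class.
using ResourceSet = std::set<std::string>;

// Accumulates the resources seen across successive offers and reports
// whether the expected set has been fully covered so far.
class OfferAggregator {
public:
  explicit OfferAggregator(ResourceSet expected)
    : expected_(std::move(expected)) {
    assert(!expected_.empty());
  }

  // Records one offer; returns true once all expected resources were seen.
  bool onOffer(const ResourceSet& offer) {
    seen_.insert(offer.begin(), offer.end());
    return satisfied();
  }

  bool satisfied() const {
    return std::includes(seen_.begin(), seen_.end(),
                         expected_.begin(), expected_.end());
  }

private:
  ResourceSet expected_;
  ResourceSet seen_;
};
```

Part of what makes this tricky in a real test is that the sketch ignores offer lifetimes: resources seen in separate offers are not the same as resources available in a single offer, and a real aggregating action would presumably also need to decide what to do with the intermediate offers.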
[~greggomann], [~jieyu] What are your thoughts?
> ReservationTest.ACLMultipleOperations is flaky
> ----------------------------------------------
>
> Key: MESOS-4067
> URL: https://issues.apache.org/jira/browse/MESOS-4067
> Project: Mesos
> Issue Type: Bug
> Reporter: Michael Park
> Labels: flaky, mesosphere
>
> Observed from the CI: https://builds.apache.org/job/Mesos/COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,OS=ubuntu%3A14.04,label_exp=docker%7C%7CHadoop/1319/changes