Posted to issues@mesos.apache.org by "Anindya Sinha (JIRA)" <ji...@apache.org> on 2016/10/21 18:48:58 UTC

[jira] [Created] (MESOS-6444) Ensure single copy of shared count of total resources in role sorter.

Anindya Sinha created MESOS-6444:
------------------------------------

             Summary: Ensure single copy of shared count of total resources in role sorter.
                 Key: MESOS-6444
                 URL: https://issues.apache.org/jira/browse/MESOS-6444
             Project: Mesos
          Issue Type: Bug
          Components: general
            Reporter: Anindya Sinha
            Assignee: Anindya Sinha


We maintain a single copy of each shared resource in the total resources of the role and quota sorters. So, when we update these totals, we need to ensure that we count only a single copy, even though the framework sorter may be handed multiple copies of a shared resource.

Otherwise, the following CHECK in void DRFSorter::remove(const SlaveID& slaveId, const Resources& resources) may fail:
    CHECK(total_.resources[slaveId].contains(resources));
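
The mismatch behind this CHECK can be illustrated with a minimal, self-contained sketch. Everything in it (SimpleResource, TotalPool, collapseShared) is a hypothetical stand-in invented for illustration; it is not the Mesos Resources or DRFSorter API, and collapsing copies before the check is only one possible way to keep the two sides consistent, not necessarily the fix adopted for this issue.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a resource (NOT the Mesos type).
struct SimpleResource {
  std::string name;
  bool shared;
};

// Assumption for illustration: collapse repeated copies of the same shared
// resource into one, leaving non-shared resources untouched.
std::vector<SimpleResource> collapseShared(
    const std::vector<SimpleResource>& resources) {
  std::vector<SimpleResource> result;
  std::set<std::string> seenShared;
  for (const SimpleResource& r : resources) {
    // insert().second is false when this shared name was already seen.
    if (r.shared && !seenShared.insert(r.name).second) {
      continue;
    }
    result.push_back(r);
  }
  return result;
}

// Hypothetical stand-in for a sorter's per-agent total resources.
class TotalPool {
public:
  // Adds resources, keeping at most one copy of each shared resource.
  void add(const std::vector<SimpleResource>& resources) {
    for (const SimpleResource& r : resources) {
      int& count = counts_[r.name];
      if (r.shared && count > 0) {
        continue;  // A single copy of this shared resource is already held.
      }
      ++count;
    }
  }

  // True only if every requested copy is present in the total.
  bool contains(const std::vector<SimpleResource>& resources) const {
    std::map<std::string, int> needed;
    for (const SimpleResource& r : resources) {
      ++needed[r.name];
    }
    for (const auto& entry : needed) {
      auto it = counts_.find(entry.first);
      if (it == counts_.end() || it->second < entry.second) {
        return false;
      }
    }
    return true;
  }

  // Number of copies of the named resource currently held.
  int count(const std::string& name) const {
    auto it = counts_.find(name);
    return it == counts_.end() ? 0 : it->second;
  }

private:
  std::map<std::string, int> counts_;
};
```

In this sketch, adding two copies of a shared volume leaves one copy in the total, so a containment check against both copies fails, while the same check against the collapsed set succeeds.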

Two scenarios where this can happen:
(1) A framework does a RESERVE, a CREATE of a shared volume, and a LAUNCH of a task using that shared volume in a single ACCEPT. On a subsequent offer, it does another set of RESERVE, CREATE and LAUNCH, which hits this condition.

(2) Say a framework of a certain role has been offered 2 persistent volumes: PV1 (a regular persistent volume) and PV2 (a shared persistent volume).
Launch a long-lived task using PV2.
Launch a short-lived task using PV1.
The task using PV1 terminates, and the framework then issues a DESTROY on PV1 => Fails.


