Posted to issues@mesos.apache.org by "Vaibhav Khanduja (JIRA)" <ji...@apache.org> on 2015/04/30 00:40:06 UTC

[jira] [Commented] (MESOS-1554) Persistent resources support for storage-like services

    [ https://issues.apache.org/jira/browse/MESOS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520430#comment-14520430 ] 

Vaibhav Khanduja commented on MESOS-1554:
-----------------------------------------

Managing SAN, iSCSI, or even Amazon EBS volumes is something that should be worked on. A number of scenarios require interacting with backend storage, from initial provisioning to expanding capacity. A framework that connects to such backend services could be built with callbacks, hooks, or extensions in the executors. Garbage collection, i.e. releasing resources, could be asynchronous or a timed, scheduled activity, similar to garbage collection in the Java JVM. Such scheduled GC would also enable the use of extended data services on the backend data.
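
To make the hook-plus-scheduled-GC idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: DeferredVolumeGC, release_hook, and grace_seconds are hypothetical names, not part of any Mesos API, and the hook stands in for whatever backend call (SAN, iSCSI, EBS) would actually release a volume.

```python
import time
from typing import Callable, Dict, List


class DeferredVolumeGC:
    """Hypothetical timed garbage collector for backend volumes (not a Mesos API)."""

    def __init__(self, release_hook: Callable[[str], None], grace_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._release_hook = release_hook    # assumed backend-specific callback
        self._grace_seconds = grace_seconds  # how long a volume outlives its task
        self._clock = clock                  # injectable so the timing can be tested
        self._pending: Dict[str, float] = {}  # volume id -> time it becomes collectable

    def mark_released(self, volume_id: str) -> None:
        """Record that the owning task/executor terminated; keep the volume for now."""
        self._pending[volume_id] = self._clock() + self._grace_seconds

    def reclaim(self, volume_id: str) -> bool:
        """Let a restarted instance take its volume back before it is collected."""
        return self._pending.pop(volume_id, None) is not None

    def collect(self) -> List[str]:
        """Run one GC pass: release every volume whose grace period has expired."""
        now = self._clock()
        expired = sorted(v for v, due in self._pending.items() if due <= now)
        for volume_id in expired:
            self._release_hook(volume_id)
            del self._pending[volume_id]
        return expired
```

A scheduler or timer thread would call collect() periodically; an instance that restarts within the grace period calls reclaim() and keeps its data.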

> Persistent resources support for storage-like services
> ------------------------------------------------------
>
>                 Key: MESOS-1554
>                 URL: https://issues.apache.org/jira/browse/MESOS-1554
>             Project: Mesos
>          Issue Type: Epic
>          Components: general, hadoop
>            Reporter: Nikita Vetoshkin
>            Priority: Minor
>              Labels: twitter
>
> This question came up in [dev mailing list|http://mail-archives.apache.org/mod_mbox/mesos-dev/201406.mbox/%3CCAK8jAgNDs9Fe011Sq1jeNr0h%3DE-tDD9rak6hAsap3PqHx1y%3DKQ%40mail.gmail.com%3E].
> It seems reasonable for storage-like services (e.g. HDFS or Cassandra) to use Mesos to manage their instances. But right now, if we'd like to restart an instance (e.g. to spin up a new version), all sandbox filesystem resources of the previous instance version will be recycled by the slave's garbage collector.
> At the moment, filesystem resources can be managed out of band, i.e. instances can save their data in some database-specific place that various instances can share (e.g. {{/var/lib/cassandra}}).
> [~benjaminhindman] suggested an idea in the mailing list (though it still needs some fleshing out):
> {quote}
> The idea originally came about because, even today, if we allocate some
> file system space to a task/executor, and then that task/executor
> terminates, we haven't officially "freed" those file system resources until
> after we garbage collect the task/executor sandbox! (We keep the sandbox
> around so a user/operator can get the stdout/stderr or anything else left
> around from their task/executor.)
> To solve this problem we wanted to be able to let a task/executor terminate
> but not *give up* all of its resources, hence: persistent resources.
> Pushing this concept even further you could imagine always reallocating
> resources to a framework that had already been allocated those resources
> for a previous task/executor. Looked at from another perspective, these are
> "late-binding", or "lazy", resource reservations.
> At one point in time we had considered just doing 'right-of-first-refusal'
> for allocations after a task/executor terminates. But this is really
> insufficient for supporting storage-like frameworks well (and likely even
> harder to reliably implement than 'persistent resources' IMHO).
> There are a ton of things that need to get worked out in this model,
> including (but not limited to), how should a file system (or disk) be
> exposed in order to be made persistent? How should persistent resources be
> returned to a master? How many persistent resources can a framework get
> allocated?
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)