Posted to user@mesos.apache.org by Charles Allen <ch...@metamarkets.com> on 2015/04/01 01:00:13 UTC

SharedFilesystem and rolling restarts

I am working on potentially porting druid.io to a Mesos framework. One
limitation for production use is that a lot of data is cached locally on
disk, and it should not have to be re-fetched during a rolling restart.

If I were to take the simplest Mesos route, each instance of the
disk-cache-heavy task would have its own executor and would have to refresh
the disk cache from deep storage each time it starts.

A more complex route would be to have a standalone executor which handles
the forking and restarts of tasks in order to maintain the working
directory of the task.

A slightly more hacky way of doing it would be to let each
disk-cache-heavy task have its own executor but use a common
SharedFilesystem. But I'm not clear whether a SharedFilesystem would persist
beyond an executor's lifespan.

In such a case (where on-disk data would need to be "immediately" available
after a rolling restart) is there a recommended approach to making sure the
data persists properly?

Thanks,
Charles Allen

Re: SharedFilesystem and rolling restarts

Posted by Charles Allen <ch...@metamarkets.com>.
That looks like it would solve it, thanks!

On Tue, Mar 31, 2015 at 4:24 PM, Jie Yu <yu...@gmail.com> wrote:

> Hi, in Mesos 0.23.0 (the next release), you'll be able to use the
> persistence primitives provided by Mesos.
> https://issues.apache.org/jira/browse/MESOS-1554
>
> Basically, your framework can create a persistent volume (which has a
> unique handle) while launching a task and re-use that handle when
> re-launching the task. The same persistent volume will be mounted into the
> sandbox. Any data stored in the volume will be persisted.
>
> - Jie



Re: SharedFilesystem and rolling restarts

Posted by Jie Yu <yu...@gmail.com>.
Hi, in Mesos 0.23.0 (the next release), you'll be able to use the
persistence primitives provided by Mesos.
https://issues.apache.org/jira/browse/MESOS-1554

Basically, your framework can create a persistent volume (which has a
unique handle) while launching a task and re-use that handle when
re-launching the task. The same persistent volume will be mounted into the
sandbox. Any data stored in the volume will be persisted.

- Jie
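
To illustrate the handle re-use described above, here is a minimal Python sketch. It is not the Mesos API: it only builds a plain dictionary shaped like a disk resource carrying a persistence handle. The field names ("persistence", "volume", "container_path", "mode") are assumptions modeled on how the Mesos Resource message is commonly written as JSON, and the feature was unreleased when this thread was written, so the final shape may differ. The helper names, the "druid" role, the "historical-0" task name, and the idea of deriving the handle from the task name are all hypothetical.

```python
import json
import uuid


def persistence_id(framework: str, task_name: str) -> str:
    """Derive a stable handle from the task name (hypothetical scheme).

    Because the handle is computed rather than random, a relaunched task
    with the same name asks for the same persistent volume.
    """
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, f"{task_name}.{framework}"))


def persistent_disk_resource(handle: str, size_mb: int, role: str,
                             container_path: str = "cache") -> dict:
    """Build a dict shaped like a disk resource with a persistent volume.

    Field names are assumptions, not a confirmed schema.
    """
    return {
        "name": "disk",
        "type": "SCALAR",
        "scalar": {"value": size_mb},
        "role": role,
        "disk": {
            "persistence": {"id": handle},
            # Path relative to the task sandbox where the volume appears.
            "volume": {"container_path": container_path, "mode": "RW"},
        },
    }


# First launch and relaunch of the same (hypothetical) task.
first = persistent_disk_resource(
    persistence_id("druid", "historical-0"), 102400, "druid")
relaunch = persistent_disk_resource(
    persistence_id("druid", "historical-0"), 102400, "druid")

# A different task gets a different handle, and so a different volume.
other = persistent_disk_resource(
    persistence_id("druid", "historical-1"), 102400, "druid")

print(json.dumps(first, indent=2))
```

The design point is only that the handle must be reproducible across restarts; a framework could equally store a random handle in its own state and look it up when relaunching.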
