Posted to user@mesos.apache.org by 上西康太 <ue...@nautilus-technologies.com> on 2016/07/05 07:24:11 UTC

Fetcher cache: caching even more while an executor is alive

Hi,
I'm developing my own framework, which distributes more than 100 independent
tasks across the cluster and runs them arbitrarily. My problem is that
each task's execution environment is a fairly large tarball (2~6GB, mostly
application jar files), and a task itself finishes within 1~200 seconds,
while tarball extraction takes tens of seconds every time.
Extracting the same tarball again and again for every task is wasteful
overhead that cannot be ignored.

The fetcher cache is great, but in my case it isn't enough: I want to
preserve all the files extracted from the tarball while my executor is
alive. If Mesos could cache the extracted files, skipping not only the
download but also the extraction, I could save more time.
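
One way to approximate this without any change to Mesos is to have a
long-lived executor do the extraction itself and reuse the result for later
tasks. The following is a minimal sketch under stated assumptions, not a Mesos
feature: `ensure_extracted`, `cache_root` and the version `key` are
hypothetical names, and the tarball is assumed to be trusted and built
in-house.

```python
import os
import shutil
import tarfile
import tempfile


def ensure_extracted(tarball_path, cache_root, key):
    """Extract tarball_path under cache_root/key once; later calls reuse it."""
    target = os.path.join(cache_root, key)
    if os.path.isdir(target):
        return target  # an earlier task already extracted this version

    os.makedirs(cache_root, exist_ok=True)
    # Extract into a staging directory and rename it into place, so a crash
    # in the middle of extraction never leaves a half-populated cache entry.
    staging = tempfile.mkdtemp(dir=cache_root)
    with tarfile.open(tarball_path) as tar:
        tar.extractall(staging)  # assumption: the tarball is trusted
    try:
        os.rename(staging, target)
    except OSError:
        # Another task finished the same extraction first; keep its copy.
        shutil.rmtree(staging, ignore_errors=True)
    return target
```

If `cache_root` lives inside the executor's sandbox, the extracted files go
away together with the executor, which matches the "while my executor is
alive" lifetime described above.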

Neither "Fetcher Cache Internals" [1] nor the "Fetcher Cache" [2] section of
the official documentation mentions this issue or any future work on it.
How do you solve this kind of extraction overhead problem when you
have rather large resources?

One option would be to set up an internal Docker registry and let the
slaves cache a Docker image that includes our jar files, which would save
the tarball extraction. But I want to avoid adding moving parts to our
system as much as I can.

Another option might be to let the fetcher fetch all the jar files
individually on the slaves. I think that is feasible, but I don't think it
would be easy to manage in production.
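
To make the per-jar option concrete, here is a minimal sketch that builds one
fetcher URI entry per jar. The `value`, `extract` and `cache` keys follow the
`CommandInfo.URI` message as I understand it, so please check them against the
mesos.proto of your version; `jar_uris`, the repository URL and the jar names
are made up for illustration.

```python
import json


def jar_uris(base_url, jar_names):
    """Build one fetcher URI entry per jar, each marked for the fetcher cache."""
    return [
        {
            "value": f"{base_url.rstrip('/')}/{name}",
            "extract": False,  # keep each jar as a single file
            "cache": True,     # ask the fetcher cache to keep the download
        }
        for name in jar_names
    ]


if __name__ == "__main__":
    # Hypothetical repository URL and jar names.
    uris = jar_uris("http://repo.example.internal/jars", ["app.jar", "dep-a.jar"])
    print(json.dumps(uris, indent=2))
```

The list has to be kept in sync with every jar the application needs, which
is the management burden mentioned above.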

P.S. Mesos is great and is helping us a lot. I appreciate all
the efforts of the community. Thank you so much!

[1] http://mesos.apache.org/documentation/latest/fetcher-cache-internals/
[2] http://mesos.apache.org/documentation/latest/fetcher/

Kota UENISHI

Re: Fetcher cache: caching even more while an executor is alive

Posted by Pradeep Chhetri <pr...@gmail.com>.

Just a random thought: have you tried something like BitTorrent-based
deployments? They are really efficient when you have to distribute big
artifacts across a cluster of machines. The following two projects might be
helpful in achieving that:

1. http://erdgeist.org/arts/software/opentracker/

2. http://sourceforge.net/projects/ctorrent/

On Jul 7, 2016 10:58 PM, "Dick Davies" <di...@hellooperator.net> wrote:

> I'd try the Docker image approach.
> We've done this in the past and used our CM tool to 'seed' all slaves
> by running 'docker pull foo:v1' across them all in advance, which saved a
> lot of startup time (although we were only dealing with a GB or so of
> dependencies).

Re: Fetcher cache: caching even more while an executor is alive

Posted by Dick Davies <di...@hellooperator.net>.

I'd try the Docker image approach.
We've done this in the past and used our CM tool to 'seed' all slaves
by running 'docker pull foo:v1' across them all in advance, which saved a
lot of startup time (although we were only dealing with a GB or so of
dependencies).
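
As a rough sketch of the seeding step, the snippet below only builds the
command lines. Plain ssh is used here as a stand-in for a CM tool, and the
host names are hypothetical; `seed_commands` is an illustrative helper, not
part of Docker or Mesos.

```python
import shlex


def seed_commands(hosts, image):
    """Return one ssh command line per host that pre-pulls the image there."""
    return [["ssh", host, "docker pull " + shlex.quote(image)] for host in hosts]


if __name__ == "__main__":
    # Hypothetical agent host names; print the commands instead of running them.
    hosts = ["agent1.example.internal", "agent2.example.internal"]
    for cmd in seed_commands(hosts, "foo:v1"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Each returned list can be passed to `subprocess.run` once the host list is
real.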

On 5 July 2016 at 11:23, Kota UENISHI <ue...@nautilus-technologies.com> wrote:
> Thanks, it looks promising to me. I was aware of persistent volumes but
> had not considered them, because I thought their use case was different,
> like databases. I'll try it out.
>
> As the document says
>
>> persistent volumes are associated with roles,
>
> this makes failure handling a little difficult. My framework does not
> handle failure well enough yet: the volume IDs must be remembered
> across a framework restart or failover, or recovered afterwards.
> A restarted framework must reuse or reclaim the already reserved
> volumes, or those volumes simply leak.
>
> Kota UENISHI

Re: Fetcher cache: caching even more while an executor is alive

Posted by Kota UENISHI <ue...@nautilus-technologies.com>.

Thanks, it looks promising to me. I was aware of persistent volumes but
had not considered them, because I thought their use case was different,
like databases. I'll try it out.

As the document says

> persistent volumes are associated with roles,

this makes failure handling a little difficult. My framework does not
handle failure well enough yet: the volume IDs must be remembered
across a framework restart or failover, or recovered afterwards.
A restarted framework must reuse or reclaim the already reserved
volumes, or those volumes simply leak.
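
A minimal sketch of that bookkeeping, assuming the framework keeps its volume
IDs in a local JSON file and sees offered resources as dicts shaped like the
JSON form of a Mesos `Resource` (`disk.persistence.id`). The helper names and
the state file are hypothetical, and a real framework would more likely keep
this state in something replicated.

```python
import json
import os


def load_known_ids(state_path):
    """Read the persistence IDs saved before the restart (empty if none)."""
    if not os.path.exists(state_path):
        return set()
    with open(state_path) as f:
        return set(json.load(f))


def save_known_ids(state_path, ids):
    """Write the IDs via a temp file so a crash cannot truncate the state."""
    tmp = state_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(sorted(ids), f)
    os.replace(tmp, state_path)


def classify_volumes(offered_resources, known_ids):
    """Split offered persistent volumes into ones to reuse and leaked ones."""
    reuse, leaked = [], []
    for res in offered_resources:
        vol_id = res.get("disk", {}).get("persistence", {}).get("id")
        if vol_id is None:
            continue  # not a persistent volume
        (reuse if vol_id in known_ids else leaked).append(vol_id)
    return reuse, leaked
```

After a restart, the volumes in `reuse` can be handed to new tasks, and the
ones in `leaked` are candidates for destroying and unreserving.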

Kota UENISHI


On Tue, Jul 5, 2016 at 6:03 PM, Aaron Carey <ac...@ilm.com> wrote:
> As you're writing the framework, have you looked at reserving persistent volumes? I think it might help in your use case:
>
> http://mesos.apache.org/documentation/latest/persistent-volume/
>
> Aaron

RE: Fetcher cache: caching even more while an executor is alive

Posted by Aaron Carey <ac...@ilm.com>.

As you're writing the framework, have you looked at reserving persistent volumes? I think it might help in your use case:

http://mesos.apache.org/documentation/latest/persistent-volume/

Aaron

--

Aaron Carey
Production Engineer - Cloud Pipeline
Industrial Light & Magic
London
020 3751 9150
