
Posted to dev@mesos.apache.org by Du Li <du...@ericsson.com> on 2013/09/19 00:01:16 UTC

tgz distribution of frameworks in mesos

I was trying the latest code at https://github.com/apache/mesos. Take Spark for example: I created a tgz of the Spark binaries and put it on HDFS. After a job is submitted, it is decomposed into many tasks. For each task, the assigned Mesos slave downloads the tgz, unzips it, and executes a script to launch the task. This seems very wasteful and unnecessary.

Does the following suggestion make sense? When a Spark job is submitted, the Spark/Mesos master calculates a checksum, or something similar, for the tgz distribution. The checksum is then sent to the slaves when tasks are assigned. If a slave has already downloaded and unzipped the same file, it launches the task directly. This way the tgz is processed at most once per slave for each job (which may have thousands of tasks). The aggregate savings would be tremendous.
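
To make the idea concrete, here is a rough Python sketch of what the slave side could look like. It is only an illustration of the proposal, not how Mesos works today: the names ensure_unpacked, fetch, and cache_root are hypothetical, SHA-256 is an arbitrary choice of checksum, and fetch stands in for whatever actually downloads the tgz (for example, from HDFS).

```python
import hashlib
import os
import shutil
import tarfile
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ensure_unpacked(checksum, fetch, cache_root):
    """Return a directory holding the unpacked distribution.

    checksum   -- hex digest sent along with the task (assumed SHA-256)
    fetch      -- hypothetical callable: fetch(dest_path) downloads the tgz
    cache_root -- hypothetical per-slave cache directory
    """
    target = os.path.join(cache_root, checksum)
    if os.path.isdir(target):
        return target  # cache hit: no download, no unzip

    os.makedirs(cache_root, exist_ok=True)
    # Work in a private staging directory so a half-finished download
    # is never visible under the final, checksum-named path.
    staging = tempfile.mkdtemp(dir=cache_root, prefix="staging-")
    try:
        archive = os.path.join(staging, "dist.tgz")
        fetch(archive)
        if sha256_of_file(archive) != checksum:
            raise ValueError("checksum mismatch for downloaded tgz")

        unpack_dir = os.path.join(staging, "unpacked")
        os.makedirs(unpack_dir)
        # Assumes the archive comes from a trusted source.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(unpack_dir)

        try:
            os.rename(unpack_dir, target)  # publish the unpacked tree
        except OSError:
            # Another task may have published the same tgz first.
            if not os.path.isdir(target):
                raise
    finally:
        shutil.rmtree(staging, ignore_errors=True)
    return target
```

With something like this, the first task of a job on a given slave pays for the download and unzip, and later tasks carrying the same checksum reuse the directory. Cache eviction and cleanup after the job finishes are left out of the sketch.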

Let me know if you have already considered/evaluated this scheme.

Du

Re: tgz distribution of frameworks in mesos

Posted by Vinod Kone <vi...@gmail.com>.

We definitely thought about this. I'm not sure if we have a ticket to track
this. Feel free to create a ticket.

