Posted to user@spark.apache.org by Tamas Szuromi <ta...@odigeo.com.INVALID> on 2016/09/07 09:16:31 UTC

Mesos coarse-grained problem with spark.shuffle.service.enabled

Hello,

For a while now, we've been using Spark on Mesos in fine-grained mode in
production.
Since Spark 2.0 the fine-grained mode is deprecated, so we'd like to shift
to dynamic allocation.

When I tried to set up dynamic allocation, I ran into the following
problem:
I set spark.shuffle.service.enabled = true
and spark.dynamicAllocation.enabled = true, as the documentation says.
We're using Spark on Mesos with spark.executor.uri, where we download
the pipeline's corresponding Spark version from HDFS. The documentation
also says: "In Mesos coarse-grained mode, run
$SPARK_HOME/sbin/start-mesos-shuffle-service.sh on all slave nodes." But
how is it possible to launch the shuffle service before starting the
application, if the given Spark distribution is only downloaded to the
Mesos executor after the executor launches, while the executor looks for
a running external shuffle service in advance?
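
For reference, here is roughly how we submit such a job (a minimal
sketch; the master URL, HDFS path, class name, and jar below are
placeholders, not our real values):

    spark-submit \
      --master mesos://zk://zk1:2181/mesos \
      --conf spark.shuffle.service.enabled=true \
      --conf spark.dynamicAllocation.enabled=true \
      --conf spark.executor.uri=hdfs://namenode:8020/dist/spark-2.0.0-bin-hadoop2.7.tgz \
      --class com.example.Pipeline \
      pipeline.jar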

Is it possible that spark.executor.uri and
spark.dynamicAllocation.enabled can't be used together?

Thanks in advance!

Tamas

Re: Mesos coarse-grained problem with spark.shuffle.service.enabled

Posted by Michael Gummelt <mg...@mesosphere.io>.
The shuffle service runs out of band from any specific Spark job, and you
only run one on any given node.  You need to get the Spark distribution onto
each node somehow, then run the shuffle service out of that distribution.
The most common way I see people doing this is via Marathon (using the
"uris" field in the Marathon app to download the Spark distribution).


-- 
Michael Gummelt
Software Engineer
Mesosphere