Posted to dev@spark.apache.org by Xingbo Jiang <ji...@gmail.com> on 2020/03/02 19:15:00 UTC

Re: [DISCUSS] Remove multiple workers on the same host support from Standalone backend

Thanks Sean for your input. I really do think allowing only a single worker
per host could simplify the Spark Standalone backend a lot, and AFAIK this
deployment model can satisfy all the workloads currently deployed on the
Standalone backend.

Regarding the case of multiple distinct Spark clusters each running a worker
on the same machine, I'm not sure that's something we have ever claimed to
support. Could someone with more context on this scenario share their use
case?

Cheers,

Xingbo

On Fri, Feb 28, 2020 at 11:29 AM Sean Owen <sr...@gmail.com> wrote:

> I'll admit, I didn't know you could deploy multiple workers per
> machine. I agree, I don't see the use case for it; multiple executors,
> yes, of course. And I guess you could imagine multiple distinct Spark
> clusters each running a worker on one machine. I don't have an informed
> opinion, therefore, but I agree that it seems enough of a best practice
> to enforce one worker per machine, if it makes things simpler rather
> than harder.
>
> On Fri, Feb 28, 2020 at 1:21 PM Xingbo Jiang <ji...@gmail.com>
> wrote:
> >
> > Hi all,
> >
> > Based on my experience, there is no scenario that necessarily requires
> deploying multiple workers on the same node with the Standalone backend. A
> worker should book all the resources reserved for Spark on the host it is
> launched on, and can then allocate those resources to one or more executors
> it launches. Since each executor runs in a separate JVM, we can limit the
> memory of each executor to avoid long GC pauses.
> >
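A minimal sketch of that sizing on a standalone cluster (the master host
name and the numbers below are invented, not from the proposal):

    # Modest per-executor heaps keep GC pauses short; a single worker can
    # still host several such executor JVMs side by side.
    spark-submit \
      --master spark://master-host:7077 \
      --executor-memory 8g \
      --total-executor-cores 16 \
      my-app.jar
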
> > The remaining concern is that local-cluster mode is implemented by
> launching multiple workers on the local host, so we might need to
> re-implement LocalSparkCluster to launch only one worker and multiple
> executors. That should be fine, because local-cluster mode is only used to
> run Spark unit tests, so end users should not be affected by this change.
> >
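For reference, local-cluster mode is requested through a test-only master
URL of the form local-cluster[numWorkers,coresPerWorker,memoryPerWorkerMB],
for example:

    # Two workers with 1 core and 1024 MB each; used by Spark's own tests
    # and not documented for end users.
    ./bin/spark-shell --master "local-cluster[2,1,1024]"
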
> > Removing support for multiple workers on the same host would simplify the
> deployment model of the Standalone backend and reduce the burden of
> supporting a legacy deployment pattern in future feature development.
> (There is an example in https://issues.apache.org/jira/browse/SPARK-27371 ,
> where we had to design a complex approach to coordinate resource
> requirements across different workers launched on the same host.)
> >
> > The proposal is to update the documentation to deprecate support for the
> environment variable `SPARK_WORKER_INSTANCES` in Spark 3.0, and to remove
> the support in the next major version (Spark 3.1).
> >
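To make the two shapes concrete, a sketch of the spark-env.sh settings
involved (the values are invented):

    # Deprecated pattern: several workers per host.
    SPARK_WORKER_INSTANCES=4
    SPARK_WORKER_CORES=4       # per worker
    SPARK_WORKER_MEMORY=16g    # per worker

    # Single-worker equivalent: one worker books the whole host, and
    # executor sizes are set via spark.executor.cores / spark.executor.memory.
    SPARK_WORKER_CORES=16
    SPARK_WORKER_MEMORY=64g
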
> > Please let me know if you have use cases that rely on this feature.
> >
> > Thanks!
> >
> > Xingbo
>

Re: [DISCUSS] Remove multiple workers on the same host support from Standalone backend

Posted by Xingbo Jiang <ji...@gmail.com>.
Hi Prashant,

I guess you are referring to local-cluster mode? AFAIK, local-cluster mode
is not mentioned at all in the user guide, so it should only be used in
Spark tests. Also, there are a few differences between having multiple
workers on the same node and having one worker on each node: as I mentioned
in https://issues.apache.org/jira/browse/SPARK-27371 , a complex approach is
needed to resolve resource requirement contention between different workers
running on the same node.

Cheers,

Xingbo

On Thu, Mar 5, 2020 at 8:49 PM Prashant Sharma <sc...@gmail.com> wrote:

> It was by design: one could run multiple workers on one's laptop to try
> out or test Spark in distributed mode, launching multiple workers to see
> how resource offers and requirements work. Certainly, I have not commonly
> seen starting multiple workers on the same node as a practice so far.
>
> Why do we consider it a special case for scheduling when two workers are
> on the same node rather than on two different nodes? Possibly to optimize
> network I/O and disk I/O?

Re: [DISCUSS] Remove multiple workers on the same host support from Standalone backend

Posted by Prashant Sharma <sc...@gmail.com>.
It was by design: one could run multiple workers on one's laptop to try out
or test Spark in distributed mode, launching multiple workers to see how
resource offers and requirements work. Certainly, I have not commonly seen
starting multiple workers on the same node as a practice so far.

Why do we consider it a special case for scheduling when two workers are on
the same node rather than on two different nodes? Possibly to optimize
network I/O and disk I/O?
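
For anyone who wants to try that laptop setup, a minimal sketch (script
names as of the Spark 2.x/3.0 sbin layout; the master URL assumes the
master is reachable at the default port on localhost):

    # Start a master; its actual URL is printed in the master log.
    sbin/start-master.sh

    # start-slave.sh launches SPARK_WORKER_INSTANCES worker instances if
    # that variable is set (and is not overridden in spark-env.sh).
    SPARK_WORKER_INSTANCES=2 sbin/start-slave.sh spark://localhost:7077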
