Posted to user@spark.apache.org by Ashic Mahtab <as...@live.com> on 2014/12/22 11:39:04 UTC

Using more cores on machines

Hi,
Say we have 4 nodes with 2 cores each in standalone mode. I'd like to dedicate 4 cores to a streaming application. I can do this via spark-submit:

spark-submit .... --total-executor-cores 4

However, this assigns one core per machine. I would like to use 2 cores on each of 2 machines instead, leaving the other two machines untouched. Is this possible? Is there a downside to doing this? My thinking is that I should be able to reduce quite a bit of network traffic if not all machines are involved.


Thanks,
Ashic.
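
To make the placement concrete, here is a toy Python model of the two placement behaviors discussed in this thread (spread one core at a time across workers vs. pack each worker full before moving on). This is only an illustration, not Spark's actual scheduler code:

```python
def assign_cores(workers_free, cores_wanted, spread_out=True):
    """Toy model of standalone-master core assignment.

    workers_free: list of free core counts, one entry per worker.
    Returns the cores assigned to the app on each worker.
    """
    assigned = [0] * len(workers_free)
    if spread_out:
        # Round-robin: hand out one core at a time across workers
        # that still have free capacity.
        i = 0
        while cores_wanted > 0 and any(f > a for f, a in zip(workers_free, assigned)):
            if assigned[i] < workers_free[i]:
                assigned[i] += 1
                cores_wanted -= 1
            i = (i + 1) % len(workers_free)
    else:
        # Pack: fill each worker completely before moving to the next.
        for i, free in enumerate(workers_free):
            take = min(free, cores_wanted)
            assigned[i] = take
            cores_wanted -= take
            if cores_wanted == 0:
                break
    return assigned

# 4 workers with 2 free cores each, app asks for 4 cores total:
print(assign_cores([2, 2, 2, 2], 4, spread_out=True))   # [1, 1, 1, 1]
print(assign_cores([2, 2, 2, 2], 4, spread_out=False))  # [2, 2, 0, 0]
```

The spread-out result matches the behavior described above (one core per machine); the packed result is the 2-cores-on-2-machines placement being asked for.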

Re: Using more cores on machines

Posted by Boromir Widas <vc...@gmail.com>.
If you are looking to reduce network traffic, setting spark.deploy.spreadOut to false may help.
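
For reference, a sketch of what that could look like in the standalone master's conf/spark-defaults.conf (assuming a Spark 1.x standalone cluster; this setting is read by the master, so it affects placement for all applications and requires a master restart):

```properties
# Pack executors onto as few workers as possible instead of
# spreading cores one at a time across the cluster.
# The default is true (spread out).
spark.deploy.spreadOut    false
```

With this set, submitting with --total-executor-cores 4 should end up with 2 cores on each of 2 workers rather than 1 core on each of 4.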


RE: Using more cores on machines

Posted by Ashic Mahtab <as...@live.com>.
Hi Josh,
I'm not looking to change the 1:1 ratio.

What I'm trying to do is get both cores on two machines working, rather than one core on all four machines. With --total-executor-cores 4, I have 1 core per machine working for an app. I'm looking for something that'll let me use 2 cores per machine on 2 machines (so 4 cores in total) while not using the other two machines.

Regards,
Ashic.


Re: Using more cores on machines

Posted by Josh Devins <jo...@soundcloud.com>.
AFAIK, `--num-executors` is not available for standalone clusters. In
standalone mode, you must start additional workers on a node, as there is a
1:1 ratio of workers to executors per application.
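
For completeness, a sketch of what starting extra workers could look like in conf/spark-env.sh on each node (SPARK_WORKER_INSTANCES and SPARK_WORKER_CORES are the standalone settings for this in Spark 1.x; the memory value is only a placeholder):

```shell
# conf/spark-env.sh -- run two worker daemons per machine, one core each,
# so an application can claim whole workers independently.
export SPARK_WORKER_INSTANCES=2
export SPARK_WORKER_CORES=1
export SPARK_WORKER_MEMORY=2g
```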




RE: Using more cores on machines

Posted by Ashic Mahtab <as...@live.com>.
Hi Sean,
Thanks for the response. 

It seems --num-executors is ignored. Specifying --num-executors 2 --executor-cores 2 is giving the app all 8 cores across 4 machines.

-Ashic.


Re: Using more cores on machines

Posted by Sean Owen <so...@cloudera.com>.
I think you want:

--num-executors 2 --executor-cores 2
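
(These flags are honored on YARN rather than in standalone mode, as noted elsewhere in the thread; for completeness, a sketch of the YARN invocation, with app.jar as a placeholder:)

```shell
spark-submit --master yarn-cluster \
  --num-executors 2 \
  --executor-cores 2 \
  app.jar
```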

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org