Posted to user@spark.apache.org by jelmer <jk...@gmail.com> on 2019/12/08 09:15:29 UTC

Request more yarn vcores than executors

I have a job, running on YARN, that uses multithreading inside of a
mapPartitions transformation.

Ideally I would like to have a small number of partitions but a high
number of YARN vcores allocated to each task (which I can take advantage
of because of the multithreading).

Is this possible?

I tried running with: --executor-cores 1 --conf spark.yarn.executor.cores=20
But it seems spark.yarn.executor.cores gets ignored.
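
For context, the job looks roughly like this (a simplified sketch; the RDD,
the thread-pool size, and expensiveCall are made up for illustration):

import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

// Placeholder for the real per-record work (e.g. a slow remote call).
def expensiveCall(record: String): String = record.toUpperCase

// Each partition fans its records out over a local thread pool, so a single
// task can keep many threads busy even though Spark schedules it on one core.
val result = rdd                                   // rdd: RDD[String], assumed to exist
  .repartition(4)                                  // deliberately few partitions
  .mapPartitions { records =>
    val pool = Executors.newFixedThreadPool(20)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    val futures = records.map(r => Future(expensiveCall(r))).toList
    val out = futures.map(f => Await.result(f, Duration.Inf))
    pool.shutdown()
    out.iterator
  }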

Re: Request more yarn vcores than executors

Posted by Chris Teoh <ch...@gmail.com>.
If that is the case, perhaps set the vcore-to-CPU-core ratio to 1:1 and just
use --executor-cores 1; that would at least try to get you more threads per
executor. Note that a vcore is a logical construct and isn't directly tied to
a physical CPU core; it just represents a time slice over the entire set of
CPUs on each server.

I've seen multithreading at the driver, where multiple jobs are run
concurrently when they work on unevenly distributed workloads; that leverages
the executors more efficiently. Perhaps that is something to consider.
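
As an illustration, a minimal sketch of that driver-side pattern (not from
this thread; it assumes a SparkSession named spark and made-up input paths):

import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

implicit val ec: ExecutionContext = ExecutionContext.global

// Each action submitted from its own driver thread becomes its own Spark job,
// so a short job can fill executors that a longer, skewed job leaves idle.
val counts = Seq("/data/small", "/data/medium", "/data/large").map { path =>
  Future {
    spark.read.parquet(path).count()
  }
}

val results: Seq[Long] = Await.result(Future.sequence(counts), Duration.Inf)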

On Sun, 8 Dec 2019, 8:29 pm jelmer, <jk...@gmail.com> wrote:

> you can take on more simultaneous tasks per executor
>
>
> That is exactly what I want to avoid. The nature of the task makes it
> difficult to parallelise over many partitions. Ideally I'd have 1 executor
> per task with 10+ cores assigned to each executor.

Re: Request more yarn vcores than executors

Posted by jelmer <jk...@gmail.com>.
>
> you can take on more simultaneous tasks per executor


That is exactly what I want to avoid. The nature of the task makes it
difficult to parallelise over many partitions. Ideally I'd have 1 executor
per task with 10+ cores assigned to each executor.
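
For what it's worth, in configuration terms what I'm after would look
something like the sketch below. spark.executor.cores is what --executor-cores
sets, and spark.task.cpus is a setting that reserves several of those cores
for a single task; it isn't mentioned in this thread, so treat it as an
assumption to verify rather than a confirmed recipe:

import org.apache.spark.sql.SparkSession

// Illustrative numbers: each executor gets 10 cores and each task reserves
// all 10, so an executor runs exactly one task at a time while the code
// inside mapPartitions uses its own threads to keep those 10 cores busy.
// spark-submit equivalent: --num-executors 4 --executor-cores 10 --conf spark.task.cpus=10
val spark = SparkSession.builder()
  .appName("multithreaded-mapPartitions")
  .config("spark.executor.instances", "4")
  .config("spark.executor.cores", "10")
  .config("spark.task.cpus", "10")
  .getOrCreate()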

On Sun, 8 Dec 2019 at 10:23, Chris Teoh <ch...@gmail.com> wrote:

> I thought --executor-cores is the same as the other argument. If anything,
> just set --executor-cores to something greater than 1 and don't set the
> other one you mentioned. You'll then get a greater number of cores per
> executor, so you can take on more simultaneous tasks per executor.

Re: Request more yarn vcores than executors

Posted by Chris Teoh <ch...@gmail.com>.
I thought --executor-cores is the same as the other argument. If anything,
just set --executor-cores to something greater than 1 and don't set the
other one you mentioned. You'll then get a greater number of cores per
executor, so you can take on more simultaneous tasks per executor.
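
Concretely, with illustrative numbers (spark.task.cpus left at its default
of 1, so the executor core count is what bounds concurrent tasks per
executor), that suggestion looks something like:

import org.apache.spark.sql.SparkSession

// 5 executors with 4 cores each; each task takes one core by default, so up
// to 4 tasks can run at the same time inside every executor.
// spark-submit equivalent: --num-executors 5 --executor-cores 4
val spark = SparkSession.builder()
  .config("spark.executor.instances", "5")
  .config("spark.executor.cores", "4")
  .getOrCreate()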

On Sun, 8 Dec 2019, 8:16 pm jelmer, <jk...@gmail.com> wrote:

> I have a job, running on YARN, that uses multithreading inside of a
> mapPartitions transformation.
>
> Ideally I would like to have a small number of partitions but a high
> number of YARN vcores allocated to each task (which I can take advantage
> of because of the multithreading).
>
> Is this possible?
>
> I tried running with: --executor-cores 1 --conf spark.yarn.executor.cores=20
> But it seems spark.yarn.executor.cores gets ignored.
>