Posted to user@spark.apache.org by myasuka <my...@live.com> on 2014/09/23 07:58:29 UTC

Why recommend 2-3 tasks per CPU core ?

We are implementing a matrix multiplication algorithm on Spark that was
originally designed in the traditional MPI style. It assumes every core in
the grid computes in parallel.

In our development environment, each executor node has 16 cores, and I
assign 16 tasks to each executor node, hoping that every core performs
exactly one submatrix multiplication. But from the logs and the monitoring
web UI, I find that some tasks perform one submatrix multiplication, some
perform two, and some perform none. This is not what I expected, which is
for every core to perform exactly one multiplication.

Is there any way to increase the concurrency?

Moreover, when I decrease the value of *--total-executor-cores* so that
every executor has fewer working cores, the 16 tasks on each node no longer
launch simultaneously. The official Tuning Spark doc says: / In general, we
recommend 2-3 tasks per CPU core in your cluster. / Thus I want to know: why
recommend 2-3 tasks per CPU core?



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Why-recommend-2-3-tasks-per-CPU-core-tp14869.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Why recommend 2-3 tasks per CPU core ?

Posted by myasuka <my...@live.com>.
Thanks for your reply! Specifically, in our development environment, I want
to know whether there is any way to make every task perform exactly one
sub-matrix multiplication.

After 'groupBy', every node with 16 cores should get 16 pairs of matrices,
so I gave every node 16 tasks, hoping every core would perform one matrix
multiplication. However, some tasks perform one, some perform none, and some
perform two. The performance bottleneck is that the cores do not all do the
same amount of work: a core that does more needs more time.
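
One possible explanation for the uneven counts, offered only as a sketch:
if the grouped keys are assigned to partitions by a hash of the key, several
keys can land in the same partition while other partitions stay empty. The
pure-Python simulation below does not call Spark. The 4 x 4 block grid, the
stand-in hash function, and all names are assumptions made for illustration;
Spark's real hash partitioning uses the key's own hash code, so its exact
distribution will differ.

```python
from collections import Counter

GRID = 4             # hypothetical 4 x 4 grid of sub-matrix blocks
NUM_PARTITIONS = 16  # one partition (task) per core on a 16-core node


def stand_in_hash_partition(key):
    """Illustrative hash-style placement; NOT Spark's actual hash function."""
    i, j = key
    return (31 * i + j) % NUM_PARTITIONS


def grid_partition(key):
    """Explicit placement: block (i, j) gets its own partition index."""
    i, j = key
    return i * GRID + j


def blocks_per_partition(partition_func):
    """Count how many block keys each partition receives."""
    keys = [(i, j) for i in range(GRID) for j in range(GRID)]
    counts = Counter(partition_func(k) for k in keys)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]


hashed = blocks_per_partition(stand_in_hash_partition)
explicit = blocks_per_partition(grid_partition)
print("hash-style:", hashed)    # some partitions get several blocks, some none
print("explicit:  ", explicit)  # exactly one block per partition
```

If this is what is happening, a custom Partitioner passed to partitionBy or
groupByKey, mapping each block key to a distinct partition index, may even
out the work. Which core runs which task is still decided by the scheduler.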

Our algorithm evolved from an MPI version, which assumes every core in the
grid does the same amount of work.


Andrew Ash wrote
> Also you'd rather have 2-3 tasks per core than 1 task per core because if
> the 1 task per core is actually 1.01 tasks per core, then you have one
> wave
> of tasks complete and another wave of tasks with very few tasks in them.
> You get better utilization when you're higher than 1.
> 
> Aaron Davidson goes into this more somewhere in this talk --
> https://www.youtube.com/watch?v=dmL0N3qfSc8
> 
> On Mon, Sep 22, 2014 at 11:52 PM, Nicholas Chammas <nicholas.chammas@> wrote:
>
>> On Tue, Sep 23, 2014 at 1:58 AM, myasuka <myasuka@> wrote:
>>
>>> Thus I want to know why  recommend
>>> 2-3 tasks per CPU core?
>>>
>>
>> You want at least 1 task per core so that you fully utilize the cluster's
>> parallelism.
>>
>> You want 2-3 tasks per core so that tasks are a bit smaller than they
>> would otherwise be, making them shorter and more likely to complete
>> successfully.
>>
>> Nick
>>





--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Why-recommend-2-3-tasks-per-CPU-core-tp14869p15006.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.



Re: Why recommend 2-3 tasks per CPU core ?

Posted by Andrew Ash <an...@andrewash.com>.
Also you'd rather have 2-3 tasks per core than 1 task per core because if
the 1 task per core is actually 1.01 tasks per core, then you have one wave
of tasks complete and another wave of tasks with very few tasks in them.
You get better utilization when you're higher than 1.
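
The wave arithmetic above can be sketched in a few lines of Python. This is
a simplified model, not Spark's scheduler: it assumes all tasks take the same
time and that each core runs one task at a time. The function names and the
100-core cluster size are made up for the example.

```python
import math


def waves(num_tasks, num_cores):
    """Number of scheduling waves needed when each core runs one task at a time."""
    return math.ceil(num_tasks / num_cores)


def utilization(num_tasks, num_cores):
    """Fraction of core slots doing work, assuming equal-length tasks."""
    return num_tasks / (waves(num_tasks, num_cores) * num_cores)


cores = 100  # hypothetical cluster size
for tasks in (100, 101, 300, 301):
    print(tasks, "tasks:", waves(tasks, cores), "wave(s), utilization",
          round(utilization(tasks, cores), 4))
```

Under these assumptions, 101 tasks on 100 cores need two waves and keep the
cores about 50% busy, while 301 tasks need four waves and keep them about 75%
busy. The extra wave also costs less at 3 tasks per core, because each task
is roughly a third of the size.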

Aaron Davidson goes into this more somewhere in this talk --
https://www.youtube.com/watch?v=dmL0N3qfSc8

On Mon, Sep 22, 2014 at 11:52 PM, Nicholas Chammas <
nicholas.chammas@gmail.com> wrote:

> On Tue, Sep 23, 2014 at 1:58 AM, myasuka <my...@live.com> wrote:
>
>> Thus I want to know why  recommend
>> 2-3 tasks per CPU core?
>>
>
> You want at least 1 task per core so that you fully utilize the cluster's
> parallelism.
>
> You want 2-3 tasks per core so that tasks are a bit smaller than they
> would otherwise be, making them shorter and more likely to complete
> successfully.
>
> Nick
>

Re: Why recommend 2-3 tasks per CPU core ?

Posted by Nicholas Chammas <ni...@gmail.com>.
On Tue, Sep 23, 2014 at 1:58 AM, myasuka <my...@live.com> wrote:

> Thus I want to know why  recommend
> 2-3 tasks per CPU core?
>

You want at least 1 task per core so that you fully utilize the cluster's
parallelism.

You want 2-3 tasks per core so that tasks are a bit smaller than they would
otherwise be, making them shorter and more likely to complete successfully.

Nick