Posted to dev@spark.apache.org by Jacek Laskowski <ja...@japila.pl> on 2017/01/26 11:48:57 UTC

Why two makeOffers in CoarseGrainedSchedulerBackend? Duplication?

Hi,

Why are there two (almost) identical makeOffers methods in
CoarseGrainedSchedulerBackend [1] and [2]? I can't seem to figure out
why both are there and am leaning towards considering one a duplicate.

WDYT?

[1] https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L211

[2] https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L229

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org


Re: Why two makeOffers in CoarseGrainedSchedulerBackend? Duplication?

Posted by Jacek Laskowski <ja...@japila.pl>.
Hi Imran,

Ok, that makes sense for performance reasons. Thanks for bearing with
me and explaining that code with so much patience. Appreciated!

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski





Re: Why two makeOffers in CoarseGrainedSchedulerBackend? Duplication?

Posted by Imran Rashid <ir...@cloudera.com>.
It is a small difference, but think about what this means on a cluster
where you have 10k tasks (perhaps 1k executors with 10 cores each).

When a single task completes, you would otherwise have to go through all
1k executors.

On top of that, with a large cluster, task completions happen far more
frequently, since each core in your cluster is finishing tasks
independently and sending those updates back to the driver -- e.g., you
expect to get 10k updates from one "wave" of tasks on your cluster.  So you
avoid going through a list of 1k executors 10k times in just one wave of
tasks.
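A quick back-of-the-envelope check of those numbers (the figures are the hypothetical ones from the paragraph above, not measurements):

```scala
// Rough cost of one "wave" of tasks, using the hypothetical cluster above.
val executors = 1000
val coresPerExecutor = 10
val tasksPerWave = executors * coresPerExecutor // 10,000 task completions

// If every single-task completion had to scan the whole cluster:
val scansClusterWide = tasksPerWave.toLong * executors // 10,000,000 executor checks

// With the per-executor makeOffers(executorId), each completion checks one executor:
val scansTargeted = tasksPerWave.toLong * 1 // 10,000 executor checks
```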


Re: Why two makeOffers in CoarseGrainedSchedulerBackend? Duplication?

Posted by Jacek Laskowski <ja...@japila.pl>.
Hi Imran,

Thanks a lot for your detailed explanation, but IMHO the difference is
so small that I'm surprised it merits two versions -- both check
whether an executor is alive, i.e. executorIsAlive(executorId) vs
executorDataMap.filterKeys(executorIsAlive). A bit fishy, isn't it?

But, on the other hand, since no one has considered it duplication,
it could be perfectly fine (it did make the code a bit less obvious
to me).
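To make the contrast concrete, here is a tiny stand-alone sketch; the names executorDataMap and executorIsAlive are borrowed from the discussion, but the shapes and values are made up for illustration:

```scala
// Hypothetical stand-ins for the backend's state (the real fields live in
// CoarseGrainedSchedulerBackend; the values here are made up).
val alive = Set("exec-1", "exec-3")
def executorIsAlive(id: String): Boolean = alive.contains(id)
val executorDataMap = Map("exec-1" -> 4, "exec-2" -> 1, "exec-3" -> 2) // id -> free cores

// Single-executor path: one O(1) liveness check for the executor that just
// freed a core.
val singleOffer =
  if (executorIsAlive("exec-1")) Some("exec-1" -> executorDataMap("exec-1")) else None

// Cluster-wide path: walk the whole map, keeping only live executors.
val allOffers = executorDataMap.filterKeys(executorIsAlive).toMap
```

Both paths use the same liveness predicate; the difference is whether it is applied to one key or to every key in the map.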

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski





Re: Why two makeOffers in CoarseGrainedSchedulerBackend? Duplication?

Posted by Imran Rashid <ir...@cloudera.com>.
One is used when exactly one task has finished -- that means you now have
free resources on just that one executor, so you only need to look for
something to schedule on that one.

The other is used when you want to schedule everything you can across
the entire cluster.  For example, you have just submitted a new taskset, so
you want to try to use any idle resources across the entire cluster.  Or,
for delay scheduling, you periodically retry all idle resources, in case
the locality delay has expired.

You could eliminate the version which takes an executorId and always make
offers across all idle hosts -- it would still be correct.  It's a small
efficiency improvement to avoid having to go through the list of all
resources.
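The split described above can be sketched roughly like this. This is a simplified stand-in, not the actual Spark code: ExecutorData and WorkerOffer are cut-down versions of the real classes, and the real methods hand the offers to the task scheduler rather than returning them.

```scala
// Cut-down stand-ins for the real Spark scheduler classes.
case class ExecutorData(host: String, freeCores: Int)
case class WorkerOffer(executorId: String, host: String, cores: Int)

class Backend(executorDataMap: Map[String, ExecutorData],
              executorIsAlive: String => Boolean) {

  // Called when one task finished on a known executor: only that executor
  // can have gained free resources, so offer just it.
  def makeOffers(executorId: String): Seq[WorkerOffer] =
    if (executorIsAlive(executorId)) {
      val data = executorDataMap(executorId)
      Seq(WorkerOffer(executorId, data.host, data.freeCores))
    } else Seq.empty

  // Called when anything may have become schedulable (new taskset,
  // delay-scheduling retry): offer every live executor.
  def makeOffers(): Seq[WorkerOffer] =
    executorDataMap
      .filter { case (id, _) => executorIsAlive(id) }
      .map { case (id, data) => WorkerOffer(id, data.host, data.freeCores) }
      .toSeq
}
```

Dropping the single-executor overload and always calling the no-argument one would, as noted, still be correct; it would just rescan the whole map on every task completion.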
