Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2018/12/25 13:48:03 UTC
[GitHub] Ngone51 opened a new pull request #23377: [SPARK-26439][CORE][WIP]
Introduce WorkerOffer reservation mechanism for Barrier TaskSet
URL: https://github.com/apache/spark/pull/23377
## What changes were proposed in this pull request?
Currently, a Barrier TaskSet has a hard requirement that all of its tasks be launched
in a single resourceOffers round with enough slots (or sufficient resources). Even with
enough slots, this cannot be guaranteed, because of task locality delay scheduling.
So it is very likely that a Barrier TaskSet finally gets a chunk of sufficient resources
after all the trouble, but lets it go just because one of its pending tasks cannot be
scheduled. Furthermore, this causes severe resource competition between TaskSets and jobs
and introduces unclear semantics for DynamicAllocation.
This PR tries to introduce a WorkerOffer reservation mechanism for Barrier TaskSets, which
allows a Barrier TaskSet to reserve WorkerOffers in each resourceOffers round and launch
all of its tasks at the same time once it has accumulated sufficient resources. In this
way, we relax the resource requirement for the Barrier TaskSet. To avoid the deadlock that
may be introduced by several Barrier TaskSets holding reserved WorkerOffers for a
long time, we'll ask Barrier TaskSets to forcibly release part of their reserved
WorkerOffers on demand. So it is highly likely that each Barrier TaskSet will be launched
in the end.
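
The reservation idea can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code in this PR and not Spark's scheduler API: `BarrierTaskSetModel`, `offer`, and `force_release` are hypothetical names, and an offer round is reduced to a map from executor id to free slots, ignoring locality.

```python
from dataclasses import dataclass, field


@dataclass
class BarrierTaskSetModel:
    """Toy stand-in for a barrier TaskSet (hypothetical, not a Spark class)."""
    name: str
    num_tasks: int
    reserved: dict = field(default_factory=dict)  # executor id -> reserved slots
    launched: bool = False

    def reserved_slots(self) -> int:
        return sum(self.reserved.values())

    def offer(self, offers: dict) -> dict:
        """Take part in one offer round.

        `offers` maps executor id -> free slots. Returns the slots left
        over for other task sets after this one has reserved what it needs.
        """
        remaining = dict(offers)
        if self.launched:
            return remaining
        for executor, free in offers.items():
            needed = self.num_tasks - self.reserved_slots()
            if needed == 0:
                break
            take = min(free, needed)
            if take > 0:
                self.reserved[executor] = self.reserved.get(executor, 0) + take
                remaining[executor] = free - take
        # Launch only when every task has a slot, so all tasks start together.
        if self.reserved_slots() == self.num_tasks:
            self.launched = True
        return remaining

    def force_release(self, slots: int) -> dict:
        """Give back up to `slots` reserved slots; returns executor id -> released."""
        released = {}
        if self.launched:
            return released  # slots are now running tasks, nothing to release
        for executor in list(self.reserved):
            if slots == 0:
                break
            give = min(self.reserved[executor], slots)
            self.reserved[executor] -= give
            if self.reserved[executor] == 0:
                del self.reserved[executor]
            released[executor] = give
            slots -= give
        return released
```

In this model, a task set with four tasks that sees three free slots in one round keeps them reserved and launches in a later round as soon as a fourth slot is offered. `force_release` models the on-demand release that lets another Barrier TaskSet make progress.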
To integrate with DynamicAllocation:
The most workable approach I can think of is to add new events, e.g.
ExecutorReservedEvent and ExecutorReleasedEvent, which make an executor behave like a
busy executor with running tasks or an idle executor without running tasks, respectively.
Thus, ExecutorAllocationManager would not let the executor go while it knows that some
resources are reserved on that executor.
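
A minimal sketch of how such events could affect idle tracking follows. The event names come from the proposal above; everything else is an assumption made for illustration. In particular, `IdleTrackerModel` is a hypothetical stand-in and does not reflect how ExecutorAllocationManager is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutorReservedEvent:
    """Proposed event: a barrier TaskSet reserved resources on this executor."""
    executor_id: str


@dataclass(frozen=True)
class ExecutorReleasedEvent:
    """Proposed event: the reservation on this executor was given up."""
    executor_id: str


class IdleTrackerModel:
    """Hypothetical tracker deciding whether an executor may be removed."""

    def __init__(self):
        self.running_tasks = {}  # executor id -> number of running tasks
        self.reserved = set()    # executor ids holding a reservation

    def on_event(self, event) -> None:
        if isinstance(event, ExecutorReservedEvent):
            self.reserved.add(event.executor_id)
        elif isinstance(event, ExecutorReleasedEvent):
            self.reserved.discard(event.executor_id)

    def is_removable(self, executor_id: str) -> bool:
        # A reserved executor counts as busy even with no running tasks.
        idle = self.running_tasks.get(executor_id, 0) == 0
        return idle and executor_id not in self.reserved
```

With this model, an executor that has no running tasks stops being removable once an ExecutorReservedEvent arrives for it, and becomes removable again after the matching ExecutorReleasedEvent.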
## How was this patch tested?
Existing tests, plus one newly added test so far; more tests need to be added.