Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/11/22 04:03:28 UTC

[GitHub] [spark] maryannxue opened a new pull request #26633: [SPARK-29994] Add WILDCARD task location

URL: https://github.com/apache/spark/pull/26633
 
 
   ### What changes were proposed in this pull request?
   This PR adds a new WILDCARD task location that can match any host. This WILDCARD location can be used together with other regular locations in the list of preferred locations to indicate that the task can be assigned to any host/executor if none of the preferred locations is available.
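The matching behavior described above can be illustrated with a small toy model. This is only a sketch, not code from the PR: the `WILDCARD` sentinel value, the `pick_host` function, and the plain-string host names are hypothetical, and Spark's scheduler is considerably more involved.

```python
from typing import Optional, Sequence

# Hypothetical sentinel; how the PR actually represents WILDCARD is not shown here.
WILDCARD = "*"


def pick_host(preferred: Sequence[str], free_hosts: Sequence[str]) -> Optional[str]:
    """Return a host for the task, or None if the task should keep waiting."""
    free = list(free_hosts)
    # Regular preferred locations are tried first, in the order listed.
    for loc in preferred:
        if loc != WILDCARD and loc in free:
            return loc
    # No preferred host is free: a wildcard entry lets any free host match.
    if WILDCARD in preferred and free:
        return free[0]
    # Without a wildcard the task has no acceptable host yet.
    return None
```

In this model, a task with preferences `["host1", WILDCARD]` still lands on `host1` when it is free, but falls back to any other free host otherwise, whereas a task with only `["host1"]` keeps waiting.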
   
   ### Why are the changes needed?
   This is motivated by a requirement of LocalShuffledRowRDD. When the number of initial mappers of LocalShuffledRowRDD is smaller than the number of worker nodes, serious regressions can occur: short-running tasks all wait on their preferred locations even though they could have finished quickly on non-preferred locations.
   
   We have a "locality wait time" configuration that allows a task set to downgrade its locality requirement after a certain time has passed. Yet this configuration applies to every task set in the scheduler, while the penalty of a locality miss differs from task to task. We therefore need a finer-grained option that lets individual tasks opt out of locality.
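The contrast between the scheduler-wide wait and a per-task opt-out can be sketched as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: `may_run_off_preferred`, its parameters, and the `"*"` marker are hypothetical names chosen for the example.

```python
from typing import Sequence

# Hypothetical marker for the wildcard location.
WILDCARD = "*"


def may_run_off_preferred(
    preferred: Sequence[str], waited_s: float, locality_wait_s: float
) -> bool:
    """Decide whether a task may be placed on a non-preferred host right now."""
    # A task with no preferences has nothing to wait for.
    if not preferred:
        return True
    # Per-task opt-out: a wildcard entry skips the wait entirely.
    if WILDCARD in preferred:
        return True
    # Otherwise the task is bound by the single scheduler-wide wait time.
    return waited_s >= locality_wait_s
```

The point of the design is that the wait time stays a global setting, while the wildcard is a property of an individual task's preferred-location list.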
   
   ### Does this PR introduce any user-facing change?
   No.
   
   ### How was this patch tested?
   Added UT.
