Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/05/24 22:39:08 UTC

[GitHub] [spark] vanzin opened a new pull request #24704: [SPARK-20286][core] Improve logic for timing out executors in dynamic allocation.

vanzin opened a new pull request #24704: [SPARK-20286][core] Improve logic for timing out executors in dynamic allocation.
URL: https://github.com/apache/spark/pull/24704
 
 
   This change refactors the portions of the ExecutorAllocationManager class that
   track executor state into a new class, to achieve a few goals:
   
   - make the code easier to understand
   - better separate concerns (task backlog vs. executor state)
   - reduce synchronization between the event and allocation threads
   - reduce coupling between the allocation code and executor state tracking
   
   The executor tracking code was moved to a new class (ExecutorMonitor) that
   encapsulates all the logic of tracking what happens to executors and when
   they can be timed out. The logic to actually remove the executors remains
   in the EAM, since it still requires information that is not tracked by the
   new executor monitor code.
   
   Of particular interest in the executor monitor itself is a change in how
   cached blocks are tracked: instead of polling the block manager, the
   monitor now uses events to track which executors have cached blocks. It
   can also detect unpersist events and adjust the time when the executor
   should be removed accordingly. (That's the bug mentioned in the PR title.)
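
To illustrate the event-driven tracking described above, here is a minimal,
language-neutral sketch in Python. It is not the code in this PR (which is
Scala); the class name `MonitorSketch`, its method names, and the two timeout
values are hypothetical and chosen only for illustration. It assumes that an
executor holding cached blocks gets a longer idle timeout than one without,
and that an unpersist event removes the RDD's blocks from every executor.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Hypothetical timeouts in seconds; illustrative values, not taken from the PR.
IDLE_TIMEOUT = 60.0
CACHED_IDLE_TIMEOUT = 600.0


@dataclass
class ExecutorState:
    """Per-executor state, updated only from events (no polling)."""
    idle_since: Optional[float] = None  # None while any task is running
    running_tasks: int = 0
    cached_rdds: Set[int] = field(default_factory=set)  # RDD ids cached here

    def timeout_at(self) -> Optional[float]:
        # A busy executor never times out.
        if self.idle_since is None:
            return None
        # The deadline is recomputed from current state, so dropping the
        # last cached RDD immediately shortens it.
        timeout = CACHED_IDLE_TIMEOUT if self.cached_rdds else IDLE_TIMEOUT
        return self.idle_since + timeout


class MonitorSketch:
    """Hypothetical stand-in for an executor monitor."""

    def __init__(self) -> None:
        self.executors: Dict[str, ExecutorState] = {}

    def on_executor_added(self, exec_id: str, now: float) -> None:
        self.executors[exec_id] = ExecutorState(idle_since=now)

    def on_task_start(self, exec_id: str) -> None:
        state = self.executors[exec_id]
        state.running_tasks += 1
        state.idle_since = None

    def on_task_end(self, exec_id: str, now: float) -> None:
        state = self.executors[exec_id]
        state.running_tasks -= 1
        if state.running_tasks == 0:
            state.idle_since = now

    def on_block_cached(self, exec_id: str, rdd_id: int) -> None:
        self.executors[exec_id].cached_rdds.add(rdd_id)

    def on_unpersist(self, rdd_id: int) -> None:
        # Unpersisting an RDD drops its blocks from every executor.
        for state in self.executors.values():
            state.cached_rdds.discard(rdd_id)

    def timed_out(self, now: float) -> List[str]:
        """Executors whose deadline has passed; the caller decides removal."""
        result = []
        for exec_id, state in self.executors.items():
            deadline = state.timeout_at()
            if deadline is not None and now >= deadline:
                result.append(exec_id)
        return sorted(result)
```

In this sketch the allocation side would only call `timed_out(now)` and then
decide which of the returned executors to actually remove, which mirrors the
split described above: the monitor tracks state, and removal stays with the
EAM.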
   
   Because of the refactoring, a few tests in the old EAM test suite were removed,
   since they're now covered by the newly added test suite. The EAM suite was
   also changed slightly so that it no longer instantiates a SparkContext for
   every test. This allowed some cleanup, and the tests also run faster.
   
   Tested with new and updated unit tests, and with multiple TPC-DS workloads
   running with dynamic allocation on; also some manual tests for the caching
   behavior.
   
