Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/03/07 02:25:46 UTC

[GitHub] [spark] shanyu opened a new pull request #27843: [SPARK-31029] Avoid using global execution context in driver main thread for YarnSchedulerBackend

shanyu opened a new pull request #27843: [SPARK-31029] Avoid using global execution context in driver main thread for YarnSchedulerBackend
URL: https://github.com/apache/spark/pull/27843
 
 
   ### What changes were proposed in this pull request?
   In YarnSchedulerBackend, we should avoid using the global execution context for its Futures. Otherwise, if the user's Spark application also uses the global execution context for its own Futures, the context class loader of the shared thread pool threads becomes nondeterministic.
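
A minimal sketch of the idea, not the actual patch: the backend runs its Futures on a dedicated execution context instead of `scala.concurrent.ExecutionContext.global`, so its threads are never shared with user code. The names `DedicatedContextSketch`, `backendEC`, `killExecutorsSketch`, and the `backend-endpoint-` thread prefix are hypothetical and chosen only for illustration.

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object DedicatedContextSketch {
  private val counter = new AtomicInteger(0)

  // Daemon threads with a recognizable name, so they never keep the JVM alive.
  private val daemonFactory = new ThreadFactory {
    override def newThread(r: Runnable): Thread = {
      val t = new Thread(r, s"backend-endpoint-${counter.incrementAndGet()}")
      t.setDaemon(true)
      t
    }
  }

  // Hypothetical dedicated context, used instead of ExecutionContext.global.
  val backendEC: ExecutionContext =
    ExecutionContext.fromExecutorService(Executors.newCachedThreadPool(daemonFactory))

  // Stand-in for a backend call: the context is passed explicitly, so the
  // work never lands on the global pool that user code may also be using.
  def killExecutorsSketch(executorIds: Seq[String]): Future[String] = Future {
    s"${Thread.currentThread().getName} handled ${executorIds.size} executor(s)"
  }(backendEC)

  def main(args: Array[String]): Unit = {
    println(Await.result(killExecutorsSketch(Seq("1", "2")), 10.seconds))
  }
}
```

Passing the context explicitly rather than importing an implicit makes it harder to fall back to the global context by accident.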
   
   ### Why are the changes needed?
   The Spark driver starts ApplicationMaster in the main thread, which starts a user thread and sets MutableURLClassLoader as that thread's ContextClassLoader:
   userClassThread = startUserApplication()
   
   The main thread then sets up the YarnSchedulerBackend RPC endpoints, which handle the following calls using Scala Futures on the default global ExecutionContext:
   
   doRequestTotalExecutors
   doKillExecutors
   
   If the main thread starts a Future to handle doKillExecutors() before the user thread starts one, the thread pool thread's ContextClassLoader will be the default (AppClassLoader).
   If the user thread starts a Future first, the thread pool thread will have MutableURLClassLoader.
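
The ordering effect described above can be reproduced in isolation. The sketch below is an illustration under assumptions, not Spark code: it uses a single-thread pool from `Executors.newFixedThreadPool` in place of the global pool, and a plain `URLClassLoader` in place of MutableURLClassLoader. A pool thread is created lazily by whichever thread submits the first task, and a new `Thread` inherits the context class loader of the thread that creates it. The names `LoaderInheritanceDemo` and `loaderSeenInPool` are hypothetical.

```scala
import java.net.{URL, URLClassLoader}
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object LoaderInheritanceDemo {
  // Creates a fresh pool, submits its first Future from a thread whose context
  // class loader is `submitterLoader`, and returns the loader seen in the pool.
  def loaderSeenInPool(submitterLoader: ClassLoader): ClassLoader = {
    val pool = Executors.newFixedThreadPool(1)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    var seen: ClassLoader = null
    val submitter = new Thread(new Runnable {
      override def run(): Unit = {
        Thread.currentThread().setContextClassLoader(submitterLoader)
        // The pool thread is created here, by this submitting thread.
        val f = Future { Thread.currentThread().getContextClassLoader }
        seen = Await.result(f, 10.seconds)
      }
    })
    submitter.start()
    submitter.join() // join makes the write to `seen` visible to this thread
    pool.shutdown()
    seen
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for the user class loader; it has no URLs of its own.
    val userLoader = new URLClassLoader(Array.empty[URL], getClass.getClassLoader)
    val defaultLoader = ClassLoader.getSystemClassLoader
    println(s"user thread first: pool sees user loader = ${loaderSeenInPool(userLoader) eq userLoader}")
    println(s"main thread first: pool sees default loader = ${loaderSeenInPool(defaultLoader) eq defaultLoader}")
  }
}
```

Whichever thread wins the race to submit first decides which loader the pool thread keeps for its lifetime, which is why the failure appears only intermittently.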
   
   So if the user's code uses a Future that references a user-provided class (which only MutableURLClassLoader can load), and an executor is lost before that Future runs, you will see class-not-found errors.
   
   
   ### Does this PR introduce any user-facing change?
   No
   
   ### How was this patch tested?
   Existing unit tests and manual tests

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27843: [SPARK-31029] Avoid using global execution context in driver main thread for YarnSchedulerBackend

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27843: [SPARK-31029] Avoid using global execution context in driver main thread for YarnSchedulerBackend
URL: https://github.com/apache/spark/pull/27843#issuecomment-596036064
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] AmplabJenkins removed a comment on issue #27843: [SPARK-31029] Avoid using global execution context in driver main thread for YarnSchedulerBackend

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27843: [SPARK-31029] Avoid using global execution context in driver main thread for YarnSchedulerBackend
URL: https://github.com/apache/spark/pull/27843#issuecomment-596035955
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] AmplabJenkins commented on issue #27843: [SPARK-31029] Avoid using global execution context in driver main thread for YarnSchedulerBackend

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27843: [SPARK-31029] Avoid using global execution context in driver main thread for YarnSchedulerBackend
URL: https://github.com/apache/spark/pull/27843#issuecomment-596035955
 
 
   Can one of the admins verify this patch?
