Posted to issues@spark.apache.org by "shanyu zhao (Jira)" <ji...@apache.org> on 2020/03/04 19:00:00 UTC

[jira] [Updated] (SPARK-31029) Occasional class not found error in user's Future code using global ExecutionContext

     [ https://issues.apache.org/jira/browse/SPARK-31029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shanyu zhao updated SPARK-31029:
--------------------------------
    Description: 
*Problem:*
When running the TPC-DS benchmark (https://github.com/databricks/spark-sql-perf), we occasionally see a class-not-found error:

2020-02-04 20:00:26,673 ERROR yarn.ApplicationMaster: User class threw exception: scala.ScalaReflectionException: class com.databricks.spark.sql.perf.ExperimentRun in JavaMirror with 
sun.misc.Launcher$AppClassLoader@28ba21f3 of type class sun.misc.Launcher$AppClassLoader with classpath [...] 
and parent being sun.misc.Launcher$ExtClassLoader@3ff5d147 of type class sun.misc.Launcher$ExtClassLoader with classpath [...] 
and parent being primordial classloader with boot classpath [...] not found.

*Root cause:*
The Spark driver starts ApplicationMaster in the main thread, which starts a user thread and sets a MutableURLClassLoader as that thread's ContextClassLoader:
	userClassThread = startUserApplication()

The main thread then sets up the YarnSchedulerBackend RPC endpoints, which handle the following calls using Scala Futures on the default global ExecutionContext:
    - doRequestTotalExecutors
    - doKillExecutors

If the main thread starts a Future to handle doKillExecutors() before the user thread starts one, the ContextClassLoader of the default thread pool's thread will be the default one (AppClassLoader).
If the user thread starts a Future first, the thread pool's thread will have the MutableURLClassLoader.

So if the user's code uses a Future that references a user-provided class (which only MutableURLClassLoader can load), and an executor is lost before that Future runs, you will see class-not-found errors.
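
The ordering dependence described above can be reproduced outside Spark. The following is a minimal sketch, not Spark code: it assumes a plain java.util.concurrent pool rather than the global ExecutionContext itself, and it uses an empty URLClassLoader as a stand-in for MutableURLClassLoader. The object and method names are hypothetical. It relies on the JVM rule that a new thread inherits the ContextClassLoader of the thread that creates it, so the pool's worker ends up with the loader of whichever thread submits the first task.

```scala
import java.net.{URL, URLClassLoader}
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicReference
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical demo object; none of these names exist in Spark.
object ContextClassLoaderRace {

  // Creates a fresh single-thread pool, lets a "submitter" thread whose
  // ContextClassLoader is `submitterLoader` start the first Future, and
  // returns the ContextClassLoader observed inside the pool's worker thread.
  def loaderSeenByPool(submitterLoader: ClassLoader): ClassLoader = {
    val pool = Executors.newSingleThreadExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    val seen = new AtomicReference[ClassLoader]()

    val submitter = new Thread(new Runnable {
      def run(): Unit = {
        // The pool lazily creates its worker thread here, on first use,
        // so the worker inherits this thread's ContextClassLoader.
        val f = Future { Thread.currentThread().getContextClassLoader }
        seen.set(Await.result(f, 10.seconds))
      }
    })
    submitter.setContextClassLoader(submitterLoader)
    submitter.start()
    submitter.join()
    pool.shutdown()
    seen.get()
  }

  // Stand-in for the user class loader (assumption: an empty URLClassLoader
  // whose parent is the application class loader).
  def newUserLoader(): ClassLoader =
    new URLClassLoader(Array.empty[URL], ClassLoader.getSystemClassLoader)
}
```

Calling loaderSeenByPool with the application class loader models the case where the main thread wins the race; calling it with the stand-in user loader models the case where the user thread wins. In each case the worker thread should report the loader of the thread that submitted first.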

*Proposed Solution:*
We can potentially solve this problem in one of two ways:
1) Set the same class loader (userClassLoader) on both the main thread and the user thread in ApplicationMaster.scala

2) Do not use "ExecutionContext.Implicits.global" in YarnSchedulerBackend
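
As a rough illustration of option 2, the sketch below builds a dedicated ExecutionContext instead of relying on ExecutionContext.Implicits.global. It is only a sketch under stated assumptions: the object name, the helper name, and the thread name are hypothetical, and this is not the actual YarnSchedulerBackend change. The thread factory also pins an explicitly chosen ContextClassLoader on every pool thread, which borrows the idea of option 1 (the loader no longer depends on which thread happens to create the worker).

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import scala.concurrent.{ExecutionContext, ExecutionContextExecutorService}

// Hypothetical helper object; not part of Spark.
object DedicatedContext {

  // Builds a private cached thread pool whose threads are daemon threads
  // named `threadName` and always use `loader` as their ContextClassLoader.
  def withLoader(threadName: String, loader: ClassLoader): ExecutionContextExecutorService = {
    val factory = new ThreadFactory {
      def newThread(r: Runnable): Thread = {
        val t = new Thread(r, threadName)
        t.setDaemon(true)
        // Pin the loader explicitly instead of inheriting it from the creator.
        t.setContextClassLoader(loader)
        t
      }
    }
    ExecutionContext.fromExecutorService(Executors.newCachedThreadPool(factory))
  }
}
```

Futures that are passed this context explicitly never touch the global pool, so the global pool's threads are created only when the user's code first uses it. The returned ExecutionContextExecutorService should be shut down by whoever owns it.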

  was:
*Problem:*
When running the TPC-DS benchmark (https://github.com/databricks/spark-sql-perf), we occasionally see a class-not-found error:

2020-02-04 20:00:26,673 ERROR yarn.ApplicationMaster: User class threw exception: scala.ScalaReflectionException: class com.databricks.spark.sql.perf.ExperimentRun in JavaMirror with 
sun.misc.Launcher$AppClassLoader@28ba21f3 of type class sun.misc.Launcher$AppClassLoader with classpath [...] 
and parent being sun.misc.Launcher$ExtClassLoader@3ff5d147 of type class sun.misc.Launcher$ExtClassLoader with classpath [...] 
and parent being primordial classloader with boot classpath [...] not found.

*Root cause:*
The Spark driver starts ApplicationMaster in the main thread, which starts a user thread and sets a MutableURLClassLoader as that thread's ContextClassLoader:
	userClassThread = startUserApplication()

The main thread then sets up the YarnSchedulerBackend RPC endpoints, which handle the following calls using Scala Futures on the default global ExecutionContext:
    - doRequestTotalExecutors
    - doKillExecutors

If the main thread starts a Future to handle doKillExecutors() before the user thread starts one, the ContextClassLoader of the default thread pool's thread will be the default one (AppClassLoader).
If the user thread starts a Future first, the thread pool's thread will have the MutableURLClassLoader.

So if the user's code uses a Future that references a user-provided class (which only MutableURLClassLoader can load), and an executor is lost before that Future runs, you will see class-not-found errors.

*Proposed Solution:*
Set the same class loader (userClassLoader) on both the main thread and the user thread in ApplicationMaster.scala


> Occasional class not found error in user's Future code using global ExecutionContext
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-31029
>                 URL: https://issues.apache.org/jira/browse/SPARK-31029
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 2.4.5
>            Reporter: shanyu zhao
>            Priority: Major


