Posted to issues@spark.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/10/20 17:52:00 UTC

[jira] [Updated] (SPARK-45616) Usages of ParVector are unsafe because it does not propagate ThreadLocals or SparkSession

     [ https://issues.apache.org/jira/browse/SPARK-45616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-45616:
-----------------------------------
    Labels: pull-request-available  (was: )

> Usages of ParVector are unsafe because it does not propagate ThreadLocals or SparkSession
> -----------------------------------------------------------------------------------------
>
>                 Key: SPARK-45616
>                 URL: https://issues.apache.org/jira/browse/SPARK-45616
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL, Tests
>    Affects Versions: 3.5.0
>            Reporter: Ankur Dave
>            Assignee: Ankur Dave
>            Priority: Minor
>              Labels: pull-request-available
>
> CastSuiteBase and ExpressionInfoSuite use ParVector.foreach() to run Spark SQL queries in parallel. They incorrectly assume that each parallel operation will inherit the main thread's active SparkSession. Because the active SparkSession is stored in an InheritableThreadLocal, this holds only when the parallel operations run on freshly created threads. However, ParVector schedules its operations on the JVM's shared global ForkJoinPool, so if other code has already run parallel operations before Spark was started, the pool may contain pre-existing threads that have no active SparkSession. In that case, these tests fail with NullPointerExceptions when creating SparkPlans or running SQL queries.
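>
> A minimal sketch of the unsafe pattern, with illustrative query strings and an explicit session lookup standing in for what the suites do internally:
>
>     import scala.collection.parallel.immutable.ParVector
>     import org.apache.spark.sql.SparkSession
>
>     // Unsafe: ParVector.foreach schedules its tasks on the shared global
>     // ForkJoinPool. A pool thread created before the SparkSession existed
>     // never inherited it, so the session lookup below can come up empty.
>     val queries = Vector("SELECT 1", "SELECT rand()")  // illustrative
>     new ParVector(queries).foreach { q =>
>       // Fails on threads with no active session; in the real suites the
>       // failure surfaces as NullPointerExceptions during planning.
>       SparkSession.getActiveSession.get.sql(q).collect()
>     }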
> The fix is to use the existing method ThreadUtils.parmap(). This method creates fresh threads that inherit the current active SparkSession, and it propagates the Spark ThreadLocals.
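>
> A sketch of the same loop rewritten with ThreadUtils.parmap (the thread-name prefix and thread count are illustrative; ThreadUtils is Spark-internal, which is fine here since the callers are Spark's own test suites):
>
>     import org.apache.spark.util.ThreadUtils
>
>     // Safe: parmap creates a fresh pool for this call, and freshly created
>     // threads inherit the caller's InheritableThreadLocals, including the
>     // active SparkSession.
>     val results = ThreadUtils.parmap(queries, prefix = "test-sql", maxThreads = 8) { q =>
>       SparkSession.getActiveSession.get.sql(q).collect()
>     }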
> We should also add a scalastyle warning against use of ParVector.
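>
> One way such a warning could be expressed in scalastyle-config.xml, using scalastyle's built-in regex checker (the rule the linked pull request actually adds may differ):
>
>     <check customId="ParVector" level="error"
>            class="org.scalastyle.file.RegexChecker" enabled="true">
>       <parameters>
>         <parameter name="regex">new ParVector</parameter>
>       </parameters>
>       <customMessage>ParVector does not propagate the active SparkSession or
>       Spark ThreadLocals; use ThreadUtils.parmap instead.</customMessage>
>     </check>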



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org