Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2017/01/13 12:56:26 UTC

[jira] [Commented] (SPARK-19213) FileSourceScanExec uses SparkSession from HadoopFsRelation creation time instead of the one active at time of execution

    [ https://issues.apache.org/jira/browse/SPARK-19213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821736#comment-15821736 ] 

Apache Spark commented on SPARK-19213:
--------------------------------------

User 'robert3005' has created a pull request for this issue:
https://github.com/apache/spark/pull/16575

> FileSourceScanExec uses SparkSession from HadoopFsRelation creation time instead of the one active at time of execution
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-19213
>                 URL: https://issues.apache.org/jira/browse/SPARK-19213
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Robert Kruszewski
>
> If you look at https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L260 you'll notice that the SparkSession used for execution is the one that was captured from the logical plan. In other places, by contrast, you have https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L154, and SparkPlan captures the active session upon execution in https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala#L52
> From my understanding of the I/O code, it would be beneficial to use the active session so that the Hadoop configuration can be modified without recreating the Dataset. It would also be interesting not to lock the SparkSession into the physical plan for I/O, which would let you share Datasets across SparkSessions. Is that supposed to work? Otherwise you'd have to get a new query execution to bind to the new SparkSession, which would only let you share logical plans.
> I am sending a PR along with the latter.
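
The difference described above, a session captured at creation time versus one looked up at execution time, can be illustrated with a small toy model. This is only a sketch in plain Python: the classes Session, ActiveSession, CapturingScan and ActiveScan, and the configuration key fs.example.endpoint, are hypothetical stand-ins invented for illustration. They are not Spark's API and do not reproduce the actual code in DataSourceScanExec.scala.

```python
# Toy model only: every class and key here is a hypothetical stand-in,
# not part of Spark. It contrasts "capture at creation" with
# "look up the active session at execution".
from dataclasses import dataclass, field


@dataclass
class Session:
    """Stand-in for a session carrying its own Hadoop-style settings."""
    name: str
    hadoop_conf: dict = field(default_factory=dict)


class ActiveSession:
    """Stand-in for a 'currently active session' registry."""
    _current = None

    @classmethod
    def set(cls, session):
        cls._current = session

    @classmethod
    def get(cls):
        return cls._current


class CapturingScan:
    """Binds to whichever session existed when the relation was created."""

    def __init__(self, relation_session):
        self.session = relation_session

    def execute(self):
        return self.session.hadoop_conf.get("fs.example.endpoint")


class ActiveScan:
    """Resolves the session only when the scan actually runs."""

    def execute(self):
        return ActiveSession.get().hadoop_conf.get("fs.example.endpoint")


creator = Session("creator", {"fs.example.endpoint": "host-a"})
other = Session("other", {"fs.example.endpoint": "host-b"})

# Both scans are built while `creator` is the active session.
ActiveSession.set(creator)
capturing = CapturingScan(ActiveSession.get())
active = ActiveScan()

# A different session is active by the time the scans execute.
ActiveSession.set(other)
captured_result = capturing.execute()
active_result = active.execute()
print(captured_result, active_result)
```

In this model the capturing scan keeps reading the creator's settings, while the scan that resolves the session at execution time picks up the settings of whichever session is active, which is the behavior the issue asks about.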


