Posted to issues@spark.apache.org by "Marco Gaido (JIRA)" <ji...@apache.org> on 2019/04/02 14:02:03 UTC

[jira] [Commented] (SPARK-27287) PCAModel.load() does not honor spark configs

    [ https://issues.apache.org/jira/browse/SPARK-27287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807780#comment-16807780 ] 

Marco Gaido commented on SPARK-27287:
-------------------------------------

I think the problem here is that configurations are copied from the SparkSession to the SparkContext when the SparkSession is used, but not when the SparkContext is used directly (as is done when loading a model in ML). I think there are two possible solutions: fix the way the contexts handle properties (preferred), or use the SparkSession when reading ML models (an easier workaround).
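
Until either fix lands, a user-side workaround may be possible. The sketch below is an assumption, not something verified against this issue: it relies on Spark's documented behaviour of copying any SparkConf property prefixed with "spark.hadoop." into the Hadoop configuration it builds for the SparkContext. The helper to_spark_hadoop_conf is hypothetical (it is not part of Spark), and the account key is the placeholder from the repro.

```python
# Hypothetical helper (not part of Spark): add the "spark.hadoop." prefix so
# that Spark copies each setting into the SparkContext's Hadoop configuration.
HADOOP_PREFIX = "spark.hadoop."


def to_spark_hadoop_conf(hadoop_settings):
    """Return a new dict whose keys all carry the 'spark.hadoop.' prefix."""
    return {
        key if key.startswith(HADOOP_PREFIX) else HADOOP_PREFIX + key: value
        for key, value in hadoop_settings.items()
    }


settings = to_spark_hadoop_conf(
    {"fs.azure.account.key.test.blob.core.windows.net": "Xosad=="}
)

# With PySpark installed, the prefixed settings could be applied like this
# (left as comments so the snippet runs without a cluster):
#
#   from pyspark.ml.feature import PCAModel
#   from pyspark.sql import SparkSession
#
#   builder = SparkSession.builder.appName("dharmesh")
#   for key, value in settings.items():
#       builder = builder.config(key, value)
#   spark = builder.getOrCreate()
#   model = PCAModel.load("wasb://test@test.blob.core.windows.net/model")
```

Whether this avoids the failure in the repro has not been confirmed here; it only moves the credentials to a place the SparkContext is known to read from.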

> PCAModel.load() does not honor spark configs
> --------------------------------------------
>
>                 Key: SPARK-27287
>                 URL: https://issues.apache.org/jira/browse/SPARK-27287
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.4.0
>            Reporter: Dharmesh Kakadia
>            Priority: Major
>
> PCAModel.load() does not seem to be using the configurations set on the current spark session. 
> Repro:
>  
> The following will fail to read the data because the storage account credentials config is not used/propagated.
> conf.set("fs.azure.account.key.test.blob.core.windows.net","Xosad==")
> spark = SparkSession.builder.appName("dharmesh").config(conf=conf).master('spark://spark-master:7077').getOrCreate()
> model = PCAModel.load('wasb://test@test.blob.core.windows.net/model')
>  
> The following however works:
> conf.set("fs.azure.account.key.test.blob.core.windows.net","Xosad==")
> spark = SparkSession.builder.appName("dharmesh").config(conf=conf).master('spark://spark-master:7077').getOrCreate()
> blah = spark.read.json('wasb://test@test.blob.core.windows.net/somethingelse/')
> blah.show()
> model = PCAModel.load('wasb://test@test.blob.core.windows.net/model')
>  
> It looks like calling spark.read...() once forces the config to be used, after which PCAModel.load() works correctly.


