Posted to issues@spark.apache.org by "Shivaram Venkataraman (JIRA)" <ji...@apache.org> on 2016/07/18 16:48:20 UTC

[jira] [Updated] (SPARK-16055) sparkR.init() can not load sparkPackages when executing an R file

     [ https://issues.apache.org/jira/browse/SPARK-16055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shivaram Venkataraman updated SPARK-16055:
------------------------------------------
    Assignee: Krishna Kalyan

> sparkR.init() can not load sparkPackages when executing an R file
> -----------------------------------------------------------------
>
>                 Key: SPARK-16055
>                 URL: https://issues.apache.org/jira/browse/SPARK-16055
>             Project: Spark
>          Issue Type: Brainstorming
>          Components: SparkR
>    Affects Versions: 1.6.1
>            Reporter: Sun Rui
>            Assignee: Krishna Kalyan
>            Priority: Minor
>             Fix For: 2.0.1, 2.1.0
>
>
> This issue was reported on the Spark user mailing list; see http://comments.gmane.org/gmane.comp.lang.scala.spark.user/35742
> This issue does not occur in an interactive SparkR session, but it does occur when executing an R file.
> The following example code can be put into an R file to reproduce this issue:
> {code}
> # Point R at the SparkR package bundled with the Spark distribution.
> .libPaths(c("/home/user/spark-1.6.1-bin-hadoop2.6/R/lib", .libPaths()))
> Sys.setenv(SPARK_HOME = "/home/user/spark-1.6.1-bin-hadoop2.6")
> library("SparkR")
> # sparkPackages should put spark-csv on the backend classpath, but it is
> # ignored when this file is executed non-interactively.
> sc <- sparkR.init(sparkPackages = "com.databricks:spark-csv_2.11:1.4.0")
> sqlContext <- sparkRSQL.init(sc)
> df <- read.df(sqlContext, "file:///home/user/spark-1.6.1-bin-hadoop2.6/data/mllib/sample_tree_data.csv", "csv")
> showDF(df)
> {code}
> The error message is as follows:
> {panel}
> 16/06/19 15:48:56 ERROR RBackendHandler: loadDF on org.apache.spark.sql.api.r.SQLUtils failed
> Error in invokeJava(isStatic = TRUE, className, methodName, ...) : 
>   java.lang.ClassNotFoundException: Failed to find data source: csv. Please find packages at http://spark-packages.org
> 	at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
> 	at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
> 	at org.apache.spark.sql.api.r.SQLUtils$.loadDF(SQLUtils.scala:160)
> 	at org.apache.spark.sql.api.r.SQLUtils.loadDF(SQLUtils.scala)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:141)
> 	at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala
> Calls: read.df -> callJStatic -> invokeJava
> Execution halted
> {panel}
> The reason is that when you execute an R file, the R backend launches before the R interpreter, so there is no opportunity for the packages specified via "sparkPackages" to be processed.
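> A possible workaround (a sketch only; the exact invocation depends on how the file is launched, and the script name below is a placeholder): ensure the backend JVM itself is started with the packages, rather than relying on the sparkPackages argument:
> {code}
> # Sketch of a possible workaround (an assumption, not verified for every
> # launch mode): start the backend JVM with --packages directly.
> #
> # If submitting the file through spark-submit, pass the packages there
> # ("myscript.R" is a placeholder name):
> #   spark-submit --packages com.databricks:spark-csv_2.11:1.4.0 myscript.R
> #
> # If running the file with Rscript, set SPARKR_SUBMIT_ARGS before
> # sparkR.init(); "sparkr-shell" must remain the final argument.
> Sys.setenv(SPARKR_SUBMIT_ARGS =
>   "--packages com.databricks:spark-csv_2.11:1.4.0 sparkr-shell")
> library("SparkR")
> sc <- sparkR.init()
> {code}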
> This JIRA issue tracks the problem; an appropriate solution is still to be discussed. One option is to document the limitation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org