Posted to issues@spark.apache.org by "Krishna Kalyan (JIRA)" <ji...@apache.org> on 2016/07/09 14:54:11 UTC
[jira] [Comment Edited] (SPARK-16055) sparkR.init() can not load sparkPackages when executing an R file
[ https://issues.apache.org/jira/browse/SPARK-16055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369116#comment-15369116 ]
Krishna Kalyan edited comment on SPARK-16055 at 7/9/16 2:54 PM:
----------------------------------------------------------------
[~shivaram] Thanks, I can reproduce this issue in my local environment.
The documentation (https://spark.apache.org/docs/latest/sparkr.html#from-data-sources) already mentions that the --packages flag should be used with spark-submit.
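For example, a script could be launched this way instead of relying on sparkPackages (my_script.R here is just a placeholder for the R file being executed):
{code}
# Pass the package on the spark-submit command line instead of via sparkR.init();
# my_script.R is a placeholder name for the R file to run.
$SPARK_HOME/bin/spark-submit --packages com.databricks:spark-csv_2.11:1.4.0 my_script.R
{code}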
> sparkR.init() can not load sparkPackages when executing an R file
> -----------------------------------------------------------------
>
> Key: SPARK-16055
> URL: https://issues.apache.org/jira/browse/SPARK-16055
> Project: Spark
> Issue Type: Brainstorming
> Components: SparkR
> Affects Versions: 1.6.1
> Reporter: Sun Rui
> Priority: Minor
>
> This is an issue reported in the Spark user mailing list. Refer to http://comments.gmane.org/gmane.comp.lang.scala.spark.user/35742
> This issue does not occur in an interactive SparkR session, but it does occur when executing an R file.
> The following example code can be put into an R file to reproduce this issue:
> {code}
> # Make the SparkR package bundled with the Spark distribution visible to R
> .libPaths(c("/home/user/spark-1.6.1-bin-hadoop2.6/R/lib", .libPaths()))
> Sys.setenv(SPARK_HOME = "/home/user/spark-1.6.1-bin-hadoop2.6")
> library("SparkR")
> # sparkPackages should pull in the spark-csv package, but it has no effect here
> sc <- sparkR.init(sparkPackages = "com.databricks:spark-csv_2.11:1.4.0")
> sqlContext <- sparkRSQL.init(sc)
> # Fails with ClassNotFoundException because the csv data source was never loaded
> df <- read.df(sqlContext, "file:///home/user/spark-1.6.1-bin-hadoop2.6/data/mllib/sample_tree_data.csv", "csv")
> showDF(df)
> {code}
> The error message is as follows:
> {panel}
> 16/06/19 15:48:56 ERROR RBackendHandler: loadDF on org.apache.spark.sql.api.r.SQLUtils failed
> Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
> java.lang.ClassNotFoundException: Failed to find data source: csv. Please find packages at http://spark-packages.org
> at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
> at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
> at org.apache.spark.sql.api.r.SQLUtils$.loadDF(SQLUtils.scala:160)
> at org.apache.spark.sql.api.r.SQLUtils.loadDF(SQLUtils.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:141)
> at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala
> Calls: read.df -> callJStatic -> invokeJava
> Execution halted
> {panel}
> The reason behind this is that when you execute an R file, the R backend launches before the R interpreter, so there is no opportunity for packages specified with 'sparkPackages' to be processed.
> This JIRA issue is to track the problem. An appropriate solution is to be discussed; maybe documenting the limitation is enough.
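> A possible workaround (a sketch, assuming SparkR still honors the SPARKR_SUBMIT_ARGS environment variable when building the backend launch command) is to set that variable before loading SparkR, so the package reaches the backend even from an R file:
> {code}
> # Assumption: SparkR reads SPARKR_SUBMIT_ARGS when launching the backend;
> # the trailing "sparkr-shell" token is expected by the launch script.
> Sys.setenv("SPARKR_SUBMIT_ARGS" = "--packages com.databricks:spark-csv_2.11:1.4.0 sparkr-shell")
> library("SparkR")
> sc <- sparkR.init()
> {code}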