Posted to issues@spark.apache.org by "Shivaram Venkataraman (JIRA)" <ji...@apache.org> on 2015/09/09 03:36:46 UTC

[jira] [Commented] (SPARK-10500) sparkr.zip cannot be created if $SPARK_HOME/R/lib is unwritable

    [ https://issues.apache.org/jira/browse/SPARK-10500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735965#comment-14735965 ] 

Shivaram Venkataraman commented on SPARK-10500:
-----------------------------------------------

Yeah, I think we can try to make it reusable if the Spark packages in it don't change - I think we zip all dependent Spark packages together, which is why we do it every time, but Burak (cc'd) can correct me if I am wrong.

Providing a flag for this or using spark.local.dir might also be good fixes.
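
For illustration, here is a rough sketch of what a reuse-if-present check could look like in the packaging step. This is not Spark's actual code; the "spark.r.libs.zipPath" key and the helper names are invented here to show the "provide a flag" idea:

{code}
import java.io.{File, FileInputStream, FileOutputStream}
import java.util.zip.{ZipEntry, ZipOutputStream}

object SparkRZip {
  // Recursively add the contents of `src` to the zip stream; a minimal
  // stand-in for Spark's real packaging step.
  private def zipDir(src: File, out: ZipOutputStream, prefix: String = ""): Unit = {
    for (f <- Option(src.listFiles()).getOrElse(Array.empty[File])) {
      if (f.isDirectory) zipDir(f, out, prefix + f.getName + "/")
      else {
        out.putNextEntry(new ZipEntry(prefix + f.getName))
        val in = new FileInputStream(f)
        try {
          val buf = new Array[Byte](8192)
          Iterator.continually(in.read(buf)).takeWhile(_ != -1)
            .foreach(n => out.write(buf, 0, n))
        } finally in.close()
        out.closeEntry()
      }
    }
  }

  // Reuse an existing archive (e.g. one pre-created by root) instead of
  // rebuilding it on every spark-submit invocation.
  def ensureSparkRZip(sparkHome: String, conf: Map[String, String]): File = {
    val default = new File(s"$sparkHome/R/lib/sparkr.zip")
    // "spark.r.libs.zipPath" is an invented key, shown only to sketch the
    // "provide a flag" idea; it is not a real Spark configuration.
    val target = conf.get("spark.r.libs.zipPath").map(new File(_)).getOrElse(default)
    if (!target.exists()) {
      val zos = new ZipOutputStream(new FileOutputStream(target))
      try zipDir(new File(s"$sparkHome/R/lib"), zos) finally zos.close()
    }
    target
  }
}
{code}

The skip-if-exists check is what makes the root-owned install case work: an admin builds the archive once, and unprivileged users never need write access to $SPARK_HOME/R/lib.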

cc [~brkyvz] [~sunrui]

> sparkr.zip cannot be created if $SPARK_HOME/R/lib is unwritable
> ---------------------------------------------------------------
>
>                 Key: SPARK-10500
>                 URL: https://issues.apache.org/jira/browse/SPARK-10500
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 1.5.0
>            Reporter: Jonathan Kelly
>
> As of SPARK-6797, sparkr.zip is re-created each time spark-submit is run with an R application, which fails if Spark has been installed into a directory to which the current user doesn't have write permissions. (e.g., on EMR's emr-4.0.0 release, Spark is installed at /usr/lib/spark, which is only writable by root.)
> Would it be possible to skip creating sparkr.zip if it already exists? That would enable sparkr.zip to be pre-created by the root user and then reused each time spark-submit is run, which I believe is similar to how pyspark works.
> Another option would be to make the location configurable, as it's currently hardcoded to $SPARK_HOME/R/lib/sparkr.zip. Allowing it to be configured to something like the user's home directory or a random path in /tmp would get around the permissions issue.
> By the way, why does spark-submit even need to re-create sparkr.zip every time a new R application is launched? This seems unnecessary and inefficient, unless you are actively developing the SparkR libraries and expect the contents of sparkr.zip to change.
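
To illustrate the second option in the description, here is a minimal sketch of deriving the archive location from spark.local.dir rather than $SPARK_HOME. The conf handling is simplified and invented for this example, not Spark's actual code; note that spark.local.dir may hold a comma-separated list of directories:

{code}
import java.io.File
import java.nio.file.Files

// Pick a writable target for sparkr.zip: the first spark.local.dir entry,
// falling back to the JVM temp dir when it is unset.
def sparkRZipTarget(conf: Map[String, String]): File = {
  val localDir = conf.getOrElse("spark.local.dir",
    System.getProperty("java.io.tmpdir")).split(",").head.trim
  // A per-invocation subdirectory avoids clashes between concurrent runs.
  val dir = Files.createTempDirectory(new File(localDir).toPath, "sparkr-").toFile
  new File(dir, "sparkr.zip")
}
{code}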



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org