Posted to issues@spark.apache.org by "Shivaram Venkataraman (JIRA)" <ji...@apache.org> on 2015/09/09 03:36:46 UTC
[jira] [Commented] (SPARK-10500) sparkr.zip cannot be created if $SPARK_HOME/R/lib is unwritable
[ https://issues.apache.org/jira/browse/SPARK-10500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735965#comment-14735965 ]
Shivaram Venkataraman commented on SPARK-10500:
-----------------------------------------------
Yeah, I think we can try to make it reusable if the Spark packages in it don't change. I think we zip all dependent Spark packages together, which is why we do it every time, but Burak (cc'd) can correct me if I am wrong.
Providing a flag for this or using spark.local.dir might also be good fixes.
cc [~brkyvz] [~sunrui]
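As a rough illustration of the fixes floated above (reuse an existing archive, gate that behind a flag, and fall back to a writable scratch directory), here is a minimal Python sketch. It is not Spark's actual implementation: the function names, the `reuse` flag, and the `fallback_dir` parameter are all hypothetical, and `fallback_dir` merely stands in for something like a directory taken from spark.local.dir.

```python
import os
import tempfile
import zipfile


def choose_zip_dir(preferred_dir, fallback_dir=None):
    """Return preferred_dir if it exists and is writable, else a fallback."""
    if os.path.isdir(preferred_dir) and os.access(preferred_dir, os.W_OK):
        return preferred_dir
    if fallback_dir is None:
        # Hypothetical default: a fresh temp directory owned by the current user.
        fallback_dir = tempfile.mkdtemp(prefix="sparkr-")
    os.makedirs(fallback_dir, exist_ok=True)
    return fallback_dir


def build_zip(source_dir, zip_path):
    """Zip every file under source_dir, skipping the archive itself."""
    target = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full = os.path.join(root, name)
                if os.path.abspath(full) == target:
                    continue
                zf.write(full, os.path.relpath(full, source_dir))


def ensure_zip(source_dir, preferred_dir, fallback_dir=None,
               reuse=True, zip_name="sparkr.zip"):
    """Return a path to a usable archive, rebuilding only when needed."""
    existing = os.path.join(preferred_dir, zip_name)
    # Assumption: a pre-created archive (e.g. built once by root) is still valid.
    if reuse and os.path.isfile(existing):
        return existing
    out_dir = choose_zip_dir(preferred_dir, fallback_dir)
    zip_path = os.path.join(out_dir, zip_name)
    build_zip(source_dir, zip_path)
    return zip_path
```

Note that blind reuse is only safe if the set of bundled packages has not changed since the archive was built; a real fix would need some way to detect that, which this sketch does not attempt.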
> sparkr.zip cannot be created if $SPARK_HOME/R/lib is unwritable
> ---------------------------------------------------------------
>
> Key: SPARK-10500
> URL: https://issues.apache.org/jira/browse/SPARK-10500
> Project: Spark
> Issue Type: Bug
> Components: SparkR
> Affects Versions: 1.5.0
> Reporter: Jonathan Kelly
>
> As of SPARK-6797, sparkr.zip is re-created each time spark-submit is run with an R application, which fails if Spark has been installed into a directory to which the current user doesn't have write permissions. (e.g., on EMR's emr-4.0.0 release, Spark is installed at /usr/lib/spark, which is only writable by root.)
> Would it be possible to skip creating sparkr.zip if it already exists? That would enable sparkr.zip to be pre-created by the root user and then reused each time spark-submit is run, which I believe is similar to how pyspark works.
> Another option would be to make the location configurable, as it's currently hardcoded to $SPARK_HOME/R/lib/sparkr.zip. Allowing it to be configured to something like the user's home directory or a random path in /tmp would get around the permissions issue.
> By the way, why does spark-submit even need to re-create sparkr.zip every time a new R application is launched? This seems unnecessary and inefficient, unless you are actively developing the SparkR libraries and expect the contents of sparkr.zip to change.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)