Posted to issues@spark.apache.org by "Michael Sannella (JIRA)" <ji...@apache.org> on 2015/06/01 21:54:27 UTC
[jira] [Created] (SPARK-8019) [SparkR] Create worker R processes
with a command other than Rscript
Michael Sannella created SPARK-8019:
---------------------------------------
Summary: [SparkR] Create worker R processes with a command other than Rscript
Key: SPARK-8019
URL: https://issues.apache.org/jira/browse/SPARK-8019
Project: Spark
Issue Type: New Feature
Components: SparkR
Reporter: Michael Sannella
Currently, SparkR creates worker R processes by calling the command
"Rscript", so it depends on R being installed with that command
globally visible.
This could be a problem if one wants to use an R engine that is not
installed in this way. For example, suppose that one has multiple
versions of R on the worker machines, and wants to try a new version
of R under SparkR before it has been formally installed. Ideally, one
could do this by running SparkR and specifying the full path name to
the Rscript command (such as "/usr/local/R-alt/bin/Rscript").
I faced this problem in a different situation: I am working on an
alternate R engine (TERR), which has an alternate version of the
Rscript command (TERRscript). I could make TERR work with SparkR by
setting up appropriate links from the file Rscript to my TERRscript,
but I'd rather not disable normal access to R.
I finally dealt with this by making a one-line change to
core/src/main/scala/org/apache/spark/api/r/RRDD.scala (which I will
shortly submit as a pull request for this bug) that uses the new
environment variable "spark.sparkr.r.command" to get the path for
spawning R engines. If this variable is not defined, it defaults to
"Rscript", so we get the old behavior. With this change, I can start
SparkR with TERR as the worker R engine using a command such as:
{noformat}
sc <- sparkR.init(
  sparkEnvir=list(spark.sparkr.use.daemon="false",
                  spark.sparkr.r.command="/usr/local/TERR/bin/TERRscript"))
{noformat}
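The fallback behavior described above can be sketched roughly as follows. This is only an illustration, not the actual patch: the object name RCommandSketch, the helper resolveRCommand, and the use of a plain Map to stand in for the Spark configuration are all assumptions made for the example. Only the key "spark.sparkr.r.command" and the default "Rscript" come from the description above.

```scala
// Sketch only: RCommandSketch, resolveRCommand, and the Map-based `conf`
// are hypothetical stand-ins for however RRDD.scala reads the setting.
object RCommandSketch {
  // Command used when the setting is absent (the old behavior).
  val DefaultRCommand: String = "Rscript"

  // Return the configured command for spawning worker R processes,
  // falling back to "Rscript" when the key is not defined.
  def resolveRCommand(conf: Map[String, String]): String =
    conf.getOrElse("spark.sparkr.r.command", DefaultRCommand)

  def main(args: Array[String]): Unit = {
    // No setting: the default command is used.
    println(resolveRCommand(Map.empty))
    // Setting present: the full path to the alternate engine is used.
    println(resolveRCommand(
      Map("spark.sparkr.r.command" -> "/usr/local/TERR/bin/TERRscript")))
  }
}
```

Because the lookup falls back to the default, existing installations that never define the setting would behave exactly as before.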
This is a very low-risk change that could be generally useful to other
people.