Posted to issues@spark.apache.org by "Andrew Or (JIRA)" <ji...@apache.org> on 2015/06/30 02:29:04 UTC

[jira] [Closed] (SPARK-8019) [SparkR] Create worker R processes with a command other than Rscript

     [ https://issues.apache.org/jira/browse/SPARK-8019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Or closed SPARK-8019.
----------------------------
          Resolution: Fixed
       Fix Version/s: 1.5.0
    Target Version/s: 1.5.0

> [SparkR] Create worker R processes with a command other than Rscript
> --------------------------------------------------------------------
>
>                 Key: SPARK-8019
>                 URL: https://issues.apache.org/jira/browse/SPARK-8019
>             Project: Spark
>          Issue Type: New Feature
>          Components: SparkR
>            Reporter: Michael Sannella
>             Fix For: 1.5.0
>
>
> Currently, SparkR creates worker R processes by calling the command
> "Rscript", so it depends on an R installation whose Rscript command
> is globally visible on the PATH.
> This could be a problem if one wants to use an R engine that is not
> installed in this way.  For example, suppose that one has multiple
> versions of R on the worker machines, and wants to try a new version
> of R under SparkR before it has been formally installed.  Ideally, one
> could do this by running SparkR and specifying the full path name to
> the Rscript command (such as "/usr/local/R-alt/bin/Rscript").
> I faced this problem in a different situation: I am working on an
> alternative R engine (TERR), which provides its own version of the
> Rscript command (TERRscript).  I could make TERR work with SparkR by
> symlinking Rscript to TERRscript, but I'd rather not disable normal
> access to R.
> I finally dealt with this by making a one-line change to
> core/src/main/scala/org/apache/spark/api/r/RRDD.scala (which I will
> shortly submit as a pull request for this bug) that reads the new
> Spark configuration property "spark.sparkr.r.command" to get the
> command used to spawn R worker processes.  If this property is not
> set, it defaults to "Rscript", so the old behavior is preserved.
> With this change, I can start SparkR against TERR with a command
> such as:
> {noformat}
> sc <- sparkR.init(
>         sparkEnvir=list(spark.sparkr.use.daemon="false",
>                         spark.sparkr.r.command="/usr/local/TERR/bin/TERRscript"))
> {noformat}
> This is a very low-risk change that could be generally useful to other
> people.
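> For illustration, here is a minimal sketch of what such a lookup could
> look like on the Scala side.  The property name matches the one above,
> but the helper name and the worker-launch arguments are placeholders
> for this sketch, not the actual RRDD.scala code.
> {noformat}
> // Hypothetical sketch, not the actual patch: resolve the command used to
> // spawn R workers from the Spark configuration, defaulting to "Rscript".
> import java.util.Arrays
> import org.apache.spark.{SparkConf, SparkEnv}
>
> def rWorkerCommand(conf: SparkConf): String =
>   conf.get("spark.sparkr.r.command", "Rscript")  // falls back to the old behavior
>
> // Placeholder launch arguments ("--vanilla", worker script path) are for
> // illustration only.
> val rCommand = rWorkerCommand(SparkEnv.get.conf)
> val pb = new ProcessBuilder(Arrays.asList(rCommand, "--vanilla", "/path/to/worker.R"))
> val proc = pb.start()
> {noformat}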



