Posted to dev@zeppelin.apache.org by Eric Charles <er...@apache.org> on 2015/12/30 15:04:33 UTC

[DISCUSS] PR #208 - R Interpreter for Zeppelin

Hi,

I had a look at https://github.com/apache/incubator-zeppelin/pull/208 
(and related Github repo https://github.com/elbamos/Zeppelin-With-R [1])

Here are a few topics for discussion based on my experience developing 
https://github.com/datalayer/zeppelin-R [2].

1. rscala jar not in Maven Repository

[1] copies the source code (Scala and R) from the rscala repo and 
changes/extends/repackages it a bit. [2] declares the jar as a 
system-scoped library. I recently had incompatibility issues between 
1.0.8 (the one you get since 2015-12-10 when you install rscala in your 
R environment) and the 1.0.6 jar I am using as part of the zeppelin-R 
build. To avoid such issues, why not let the user choose the version via 
a property at build time, to fit the version they run on their host? 
This would also let us benefit from future rscala releases, which fix 
bugs, bring new features... It also means we don't have to copy the 
rscala code into the Zeppelin tree.
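
To illustrate the build-time property idea, here is a minimal sketch of a 
fail-fast check the interpreter could run at startup. It assumes the jar 
version and the installed R package version must match exactly (that is 
only my experience with 1.0.6 vs 1.0.8). The property name 
"rscala.version" and the class and method names are hypothetical, not 
taken from either repo.

```java
public class RscalaVersionCheck {
    // Hypothetical build property name; nothing in [1] or [2] defines it.
    static final String VERSION_PROPERTY = "rscala.version";

    /**
     * Compares the rscala jar version chosen at build time with the version
     * of the rscala package installed in the user's R environment.
     * Returns null when they are equal, otherwise a message for the user.
     * Assumption: an exact match is required.
     */
    static String mismatch(String jarVersion, String installedVersion) {
        if (jarVersion.equals(installedVersion)) {
            return null;
        }
        return "rscala jar " + jarVersion
            + " does not match the installed R package " + installedVersion
            + "; rebuild with -D" + VERSION_PROPERTY + "=" + installedVersion;
    }

    public static void main(String[] args) {
        // Versions taken from the situation described above.
        System.out.println(mismatch("1.0.6", "1.0.8"));
    }
}
```

How the installed version is obtained from R is left out on purpose; the 
point is only that the mismatch is reported with the fix, instead of 
failing later in an obscure way.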

2. Interpreters

[1] proposes 2 interpreters, %sparkr.r and %sparkr.knitr, which are 
implemented in their own module apart from the Spark one. To be aligned 
with the existing pyspark implementation, why not integrate the R code 
into the Spark one? Any reason to keep 2 versions which do basically the 
same thing? The unique magic keyword would then be %spark.r

3. Rendering TABLE plot when interpreter result is a dataframe

This may be confusing. What if I display a plot and simply want to print 
the first 10 rows at the end of my code? To keep the same behavior as 
the other interpreters, we could make this feature optional (disabled by 
default, enabled via property).
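
A minimal sketch of what "optional, disabled by default" could look like, 
assuming the result is emitted with Zeppelin's %table display prefix 
(tab-separated columns, one row per line) only when a property is set. 
The property name and the class are hypothetical and are not code from 
the PR.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class DataFrameRendering {
    // Hypothetical property name, used only for this sketch.
    static final String RENDER_TABLE_PROPERTY = "zeppelin.R.render.dataframe.as.table";

    /** Table rendering is off unless the property is explicitly "true". */
    static boolean tableRenderingEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(RENDER_TABLE_PROPERTY, "false"));
    }

    /**
     * Formats a dataframe-like result. With the property enabled the output
     * starts with "%table " so the front end draws a table; otherwise it
     * starts with "%text " and is shown as plain text.
     */
    static String render(List<String> header, List<List<String>> rows, Properties props) {
        String prefix = tableRenderingEnabled(props) ? "%table " : "%text ";
        StringBuilder out = new StringBuilder(prefix);
        out.append(String.join("\t", header));
        for (List<String> row : rows) {
            out.append('\n').append(String.join("\t", row));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> header = Arrays.asList("name", "age");
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("alice", "30"),
            Arrays.asList("bob", "25"));

        Properties props = new Properties();
        System.out.println(render(header, rows, props));   // default: plain text

        props.setProperty(RENDER_TABLE_PROPERTY, "true");
        System.out.println(render(header, rows, props));   // opt-in: table
    }
}
```

With the default off, printing the first 10 rows at the end of a 
paragraph behaves as in the other interpreters, and users who want the 
TABLE view opt in explicitly.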


Thx, Eric