Posted to issues@spark.apache.org by "Shivaram Venkataraman (JIRA)" <ji...@apache.org> on 2016/07/13 06:31:20 UTC

[jira] [Created] (SPARK-16519) Handle SparkR RDD generics that create warnings in R CMD check

Shivaram Venkataraman created SPARK-16519:
---------------------------------------------

             Summary: Handle SparkR RDD generics that create warnings in R CMD check
                 Key: SPARK-16519
                 URL: https://issues.apache.org/jira/browse/SPARK-16519
             Project: Spark
          Issue Type: Sub-task
          Components: SparkR
            Reporter: Shivaram Venkataraman


One of the warnings we get from R CMD check is that the RDD implementations of some generics are not documented. These generics are shared between RDDs and DataFrames in SparkR. The list includes:
{quote}
WARNING
Undocumented S4 methods:
  generic 'cache' and siglist 'RDD'
  generic 'collect' and siglist 'RDD'
  generic 'count' and siglist 'RDD'
  generic 'distinct' and siglist 'RDD'
  generic 'first' and siglist 'RDD'
  generic 'join' and siglist 'RDD,RDD'
  generic 'length' and siglist 'RDD'
  generic 'partitionBy' and siglist 'RDD'
  generic 'persist' and siglist 'RDD,character'
  generic 'repartition' and siglist 'RDD'
  generic 'show' and siglist 'RDD'
  generic 'take' and siglist 'RDD,numeric'
  generic 'unpersist' and siglist 'RDD'
{quote}

As described in https://stat.ethz.ch/pipermail/r-devel/2003-September/027490.html, this looks like a limitation of R: exporting a generic from a package also exports all the implementations of that generic.

One way to get around this is to remove the RDD API or rename the RDD methods in Spark 2.1.
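
A minimal sketch of the renaming option, assuming toy stand-in classes (ToyRDD, ToyDataFrame) and a hypothetical internal generic name (collectRDD); none of these names are taken from the SparkR code base. The idea is that once the RDD implementation lives under its own unexported generic, the exported generic no longer carries an RDD method for R CMD check to flag:

```r
library(methods)

# Toy stand-ins for illustration only; not SparkR's real class definitions.
setClass("ToyRDD", representation(data = "list"))
setClass("ToyDataFrame", representation(rows = "list"))

# Public generic: exported and documented, implemented only for the
# DataFrame-like class.
setGeneric("collect", function(x, ...) standardGeneric("collect"))
setMethod("collect", signature(x = "ToyDataFrame"), function(x, ...) x@rows)

# Internal generic under a distinct (hypothetical) name. Because it is a
# separate generic, it can be left out of the package exports entirely.
setGeneric("collectRDD", function(x, ...) standardGeneric("collectRDD"))
setMethod("collectRDD", signature(x = "ToyRDD"), function(x, ...) x@data)

# The shared generic no longer has a method with siglist 'ToyRDD'.
hasPublicRDDMethod <- existsMethod("collect", "ToyRDD")
hasInternalRDDMethod <- existsMethod("collectRDD", "ToyRDD")
```

In a real package the internal generic would simply not be listed in NAMESPACE, so only the documented DataFrame methods would be exported along with the shared generic.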


