Posted to reviews@spark.apache.org by olarayej <gi...@git.apache.org> on 2016/02/24 01:05:02 UTC

[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

GitHub user olarayej opened a pull request:

    https://github.com/apache/spark/pull/11336

    [SPARK-9325][SPARK-R] collect() head() and show() for Columns

    See attached design document
    
    [SparkR collect (JIRA doc).pdf](https://github.com/apache/spark/files/143656/SparkR.collect.JIRA.doc.pdf)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/olarayej/spark SPARK-9325

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11336.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11336
    
----
commit 6fc97e02975909cb72e27077aac97d4f90b332d5
Author: Oscar D. Lara Yejas <od...@oscars-mbp.usca.ibm.com>
Date:   2016-02-23T23:48:55Z

    Support for collect() on Columns

commit fbf9b02b478b8eb4845232e09932d068cb393fd8
Author: Oscar D. Lara Yejas <od...@oscars-mbp.usca.ibm.com>
Date:   2016-02-24T00:00:52Z

    Removed drop=F from other PR

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82907977
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1168,12 +1179,14 @@ setMethod("take",
     
     #' Head
     #'
    -#' Return the first \code{num} rows of a SparkDataFrame as a R data.frame. If \code{num} is not
    -#' specified, then head() returns the first 6 rows as with R data.frame.
    +#' Return the first elements of a dataset. If \code{x} is a SparkDataFrame, its first 
    +#' rows will be returned as a data.frame. If the dataset is a \code{Column}, its first 
    +#' elements will be returned as a vector. The number of elements to be returned
    +#' is given by parameter \code{num}. Default value for \code{num} is 6.
     #'
    -#' @param x a SparkDataFrame.
    -#' @param num the number of rows to return. Default is 6.
    -#' @return A data.frame.
    +#' @param x A SparkDataFrame or Column
    --- End diff --
    
    For something like this, our convention is to add the @param in generics.R; you can see other examples there.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-189013094
  
    The official [document](http://spark.apache.org/docs/latest/index.html) says Spark supports `R 3.1+`, so it is reasonable to assume R 3.2.2 should work with SparkR as well. I run R 3.2.2.
    
    Is there a way to make it work with 3.1.x and 3.2.x?


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-203668276
  
    **[Test build #54559 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54559/consoleFull)** for PR 11336 at commit [`e5659ee`](https://github.com/apache/spark/commit/e5659ee85391937cdf301bd1acc01487bc55c129).


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188214706
  
    The R version used by Jenkins seems to be 3.2.x (I can't remember exactly). There is a JIRA requesting that R 3.1.1 be used, https://issues.apache.org/jira/browse/SPARK-11255, but it has not been resolved.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188105798
  
    Merged build finished. Test FAILed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #67220 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67220/consoleFull)** for PR 11336 at commit [`76061ad`](https://github.com/apache/spark/commit/76061ad8320c0ab1a52da1d0a7ea96e2db5d8a68).


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-191399711
  
    Also, the fact that the size of a column depends on the join seems counter-intuitive for an R user:
    
    ```
    > dim(irisDF2)
    [1] 150   5
    
    > dim(irisDF)
    [1] 150   5
    
    > x <- irisDF$Sepal_Length + irisDF2$Sepal_Length
    ```
    In R, x will always have 150 elements. However:
    
    ```
    # Cartesian product
    > df3 <- join(irisDF, irisDF2)
    > dim(select(df3, x))
    [1] 22500     1
    
    # Inner join by Species
    > df4 <- merge(irisDF, irisDF2, by="Species")
    > dim(select(df4, x))
    [1] 7500    1
    
    ```
    I still think SparkR shouldn't allow operations between columns coming from different DataFrames. And, in the case of a join, operations can be performed on the joined DataFrame (e.g., df3) as opposed to the original ones (e.g., irisDF and irisDF2).
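
The row counts above follow from ordinary join arithmetic and can be reproduced without Spark. A minimal sketch in base R, assuming the built-in `iris` data.frame as a stand-in for both irisDF and irisDF2; the names `left`, `right`, `cartesian`, and `inner` are illustrative and do not come from the PR:

```r
# Base R only: the built-in iris data.frame stands in for irisDF and irisDF2.
left  <- iris
right <- iris

# Cartesian product: every row of `left` pairs with every row of `right`.
cartesian <- merge(left, right, by = NULL)
nrow(cartesian)    # 150 * 150 = 22500

# Inner join on Species: each of the 3 species has 50 rows on both sides.
inner <- merge(left, right, by = "Species")
nrow(inner)        # 3 * 50 * 50 = 7500

# The same count derived from the per-species group sizes.
per_species <- table(iris$Species)
sum(per_species * per_species)    # 7500
```

An expression built from columns of two different inputs therefore has no fixed length of its own: its length is whatever the chosen join produces.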


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-203672948
  
    Thanks @sun-rui @rxin @shivaram for your input. To alleviate the confusion about which columns can and cannot be collected, I propose the following (code already pushed):
    
    Currently there are 15 SparkR functions that return an ‘orphan’ Column with no parent DataFrame:
    ```
    rand, randn, unix_timestamp,
    struct, expr, column, lag, lead, lit, cume_dist, dense_rank,
    ntile, percent_rank, rank, row_number
    ```
    The first three (i.e., rand, randn, and unix_timestamp) can be nicely collected as single elements. For example:
    ```
    > rand()
    [1] 0.01483325
    ```
    The remaining ones don’t make sense unless there’s an associated DataFrame. Therefore, an empty vector will be returned:
    ```
    > column("Species")
    Species
    <Empty column>
    
    > collect(column("Species"))
    character(0)
    ```
    
    I think it makes sense: If you don’t associate a Column with a DataFrame, there’s nothing to be collected. Now, for Columns that do belong to a DataFrame, collecting columns SIGNIFICANTLY improves usability in 138 functions/operators (besides other issues in the design document), for example:
    
    > irisDF$Sepal_Length * 100
     [1] 510 490 470 460 500 540 460 500 440 490 540 480 480 430 580 570 540 510 570 510…
    
    versus:
    
    > head(select(irisDF, irisDF$Sepal_Length * 100), 20)[, 1]
     [1] 510 490 470 460 500 540 460 500 440 490 540 480 480 430 580 570 540 510 570 510
    
    @shivaram has a very valid point: this introduces discrepancies in the Spark APIs across languages. I believe this is not necessarily bad, as R in particular is a slightly different animal that already has specific behavior for columns (i.e., vectors).
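
The distinction between orphan Columns and Columns attached to a DataFrame can be sketched in plain R with S4 classes. This is a toy illustration only, not the code in the PR: `ToyColumn`, `toyColumn`, `toyCollect`, and `DataFrameOrNull` are hypothetical names, a local data.frame stands in for a SparkDataFrame, and the orphan case simply returns `character(0)` as in the output shown above.

```r
library(methods)

# Hypothetical stand-ins; none of these are SparkR classes.
# The class union lets the `df` slot hold either a data.frame or NULL.
setClassUnion("DataFrameOrNull", c("data.frame", "NULL"))

setClass("ToyColumn",
         slots = list(name = "character", df = "DataFrameOrNull"))

# Constructor: a column is an orphan unless a parent data.frame is supplied.
toyColumn <- function(name, df = NULL) {
  new("ToyColumn", name = name, df = df)
}

setGeneric("toyCollect", function(x) standardGeneric("toyCollect"))

setMethod("toyCollect", "ToyColumn", function(x) {
  if (is.null(x@df)) {
    # Orphan column: there is no parent, so there is nothing to collect.
    return(character(0))
  }
  # Attached column: return its values as a plain vector.
  x@df[[x@name]]
})

orphan <- toyColumn("Species")
length(toyCollect(orphan))          # 0

attached <- toyColumn("Sepal.Length", iris)
head(toyCollect(attached) * 100)    # 510 490 470 460 500 540
```

Keeping the parent in an optional slot means a single collect method can serve both cases, at the cost of every Column constructor having to decide whether a parent exists.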


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @shivaram: Have you reviewed this? If the intent is to merge it, I'll gladly update the code. @gatorsmile 


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82908811
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -2865,7 +2882,11 @@ setMethod("rand", signature(seed = "numeric"),
     setMethod("randn", signature(seed = "missing"),
               function(seed) {
                 jc <- callJStatic("org.apache.spark.sql.functions", "randn")
    -            column(jc)
    +
    +            # By assigning a one-row data.frame, the result of this function can be collected
    +            # returning a one-element Column
    +            df <- as.DataFrame(sparkRSQL.init(), data.frame(0))
    --- End diff --
    
    ditto


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66911 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66911/consoleFull)** for PR 11336 at commit [`445407c`](https://github.com/apache/spark/commit/445407caa11b0f555e32d51008fdf83d5c4ecd97).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66753 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66753/consoleFull)** for PR 11336 at commit [`8f906a2`](https://github.com/apache/spark/commit/8f906a2e8050c391355c3ddab811d990020728cb).


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @felixcheung @falaki I have addressed all your comments and tests pass now. Thank you! cc @aloknsingh 


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Any update on this PR?


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82918200
  
    --- Diff: R/pkg/R/column.R ---
    @@ -32,35 +34,57 @@ setOldClass("jobj")
     #' @export
     #' @note Column since 1.4.0
     setClass("Column",
    -         slots = list(jc = "jobj"))
    +         slots = list(jc = "jobj", df = "SparkDataFrameOrNull"))
     
     #' A set of operations working with SparkDataFrame columns
     #' @rdname columnfunctions
     #' @name columnfunctions
     NULL
    -
    -setMethod("initialize", "Column", function(.Object, jc) {
    +setMethod("initialize", "Column", function(.Object, jc, df) {
       .Object@jc <- jc
    +
    +  # Some Column objects don't have any referencing DataFrame. In such case, df will be NULL.
    +  if (missing(df)) {
    +    df <- NULL
    +  }
    +  .Object@df <- df
       .Object
     })
     
    +setMethod("show", signature = "Column", definition = function(object) {
    --- End diff --
    
    Sure


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Merged build finished. Test FAILed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66911 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66911/consoleFull)** for PR 11336 at commit [`445407c`](https://github.com/apache/spark/commit/445407caa11b0f555e32d51008fdf83d5c4ecd97).


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188141919
  
    **[Test build #51863 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51863/consoleFull)** for PR 11336 at commit [`24b6154`](https://github.com/apache/spark/commit/24b6154b127ca57e17fa79831b2fee1edcc38b1b).


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67220/
    Test PASSed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Merged build finished. Test FAILed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    ping @olarayej 
    
    cc @felixcheung 


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66973 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66973/consoleFull)** for PR 11336 at commit [`e0bba0a`](https://github.com/apache/spark/commit/e0bba0a0655f750751a6b9b05a7f6cf5eb807e05).


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r83057347
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -2836,7 +2845,11 @@ setMethod("lpad", signature(x = "Column", len = "numeric", pad = "character"),
     setMethod("rand", signature(seed = "missing"),
               function(seed) {
                 jc <- callJStatic("org.apache.spark.sql.functions", "rand")
    -            column(jc)
    +
    +            # By assigning a one-row data.frame, the result of this function can be collected
    +            # returning a one-element Column
    +            df <- as.DataFrame(sparkRSQL.init(), data.frame(0))
    --- End diff --
    
    @felixcheung That's a good idea. I have created a singleton accordingly.
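
Building a fresh one-row DataFrame on every call to rand() would repeat the same setup work, which is presumably what the singleton avoids. A minimal sketch of that pattern in base R, assuming an environment used as a private cache; `.toyCache` and `getOneRowFrame` are hypothetical names and a local data.frame stands in for the SparkDataFrame, so this is not the code pushed to the PR:

```r
# Hypothetical names throughout; a local data.frame stands in for the
# one-row SparkDataFrame created in the diff above.
.toyCache <- new.env(parent = emptyenv())
.toyCache$builds <- 0L

getOneRowFrame <- function() {
  if (is.null(.toyCache$oneRow)) {
    # First call only: build the frame and remember it in the cache.
    .toyCache$builds <- .toyCache$builds + 1L
    .toyCache$oneRow <- data.frame(0)
  }
  .toyCache$oneRow
}

first  <- getOneRowFrame()
second <- getOneRowFrame()

.toyCache$builds            # 1
identical(first, second)    # TRUE
```

Environments have reference semantics in R, so the cached object survives between calls without a global assignment operator.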


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #67536 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67536/consoleFull)** for PR 11336 at commit [`619f23b`](https://github.com/apache/spark/commit/619f23b367b94a47f3f63dea008ab4d674218b34).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #67351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67351/consoleFull)** for PR 11336 at commit [`1338d71`](https://github.com/apache/spark/commit/1338d71a8f8f1387e08326851e15fcc84d5092f9).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Merged build finished. Test FAILed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66921 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66921/consoleFull)** for PR 11336 at commit [`1ace2e5`](https://github.com/apache/spark/commit/1ace2e5785030df790b3565d2a0b24358229d655).


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82910308
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -2836,7 +2845,11 @@ setMethod("lpad", signature(x = "Column", len = "numeric", pad = "character"),
     setMethod("rand", signature(seed = "missing"),
               function(seed) {
                 jc <- callJStatic("org.apache.spark.sql.functions", "rand")
    -            column(jc)
    +
    +            # By assigning a one-row data.frame, the result of this function can be collected
    +            # returning a one-element Column
    +            df <- as.DataFrame(sparkRSQL.init(), data.frame(0))
    --- End diff --
    
    See my comment from March 30, which illustrates why this is needed. I'll change sparkRSQL.init() to sparkR.session(). Thanks for catching this!


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Build finished. Test FAILed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188135108
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188006975
  
    Merged build finished. Test FAILed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188092589
  
    Merged build finished. Test FAILed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66834 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66834/consoleFull)** for PR 11336 at commit [`266d5ff`](https://github.com/apache/spark/commit/266d5ffd87929adf081944de19ee36f892125977).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67351/
    Test PASSed.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188198216
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82907897
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1182,10 +1195,18 @@ setMethod("take",
     #' @export
     #' @examples
     #'\dontrun{
    -#' sparkR.session()
    -#' path <- "path/to/file.json"
    -#' df <- read.json(path)
    -#' head(df)
    +#' # Initialize Spark context and SQL context
    +#' sc <- sparkR.init()
    +#' sqlContext <- sparkRSQL.init(sc)
    --- End diff --
    
    ditto here


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @olarayej @felixcheung Sorry, I've missed the updates to this. I'll try to take a look at this later today.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66911/
    Test FAILed.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-190334385
  
    Can any of you folks please take a look at the code? Thanks!


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188484496
  
    Thanks, folks. Looks like all tests pass now! :-)
    
    However, in my environment (R 3.2.2), two tests don't pass. We should be careful whenever upgrading the R version:
    
    ```
    1. Failure (at test_sparkSQL.R#1052): column functions -------------------------
    result not equal to expected
    Names: 1 string mismatch
    
    2. Failure (at test_sparkSQL.R#1058): column functions -------------------------
    result not equal to expected
    Names: 1 string mismatch
    Error: Test failures
    ```



[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-187996142
  
    Thanks @falaki!


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-217585476
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @falaki Yeah, but those warnings are making the build fail (see below). Is that okay? I also see a new "Checks" section now. I may be out of date on the process, as it's been a while since I last committed :-). Thanks!
    
    ```
    Failed -------------------------------------------------------------------------
    1. Error: column binary mathfunctions (@test_sparkSQL.R#1256) ------------------
    error in evaluating the argument 'x' in selecting a method for function 'collect': 
      error in evaluating the argument 'col' in selecting a method for function 'select': (converted from warning) 'sparkRSQL.init' is deprecated.
    Use 'sparkR.session' instead.
    See help("Deprecated")
    1: expect_equal(class(collect(select(df, rand()))[2, 1]), "numeric") at /home/jenkins/workspace/SparkPullRequestBuilder/R/lib/SparkR/tests/testthat/test_sparkSQL.R:1256
    2: compare(object, expected, ...)
    3: collect(select(df, rand()))
    ```


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by falaki <gi...@git.apache.org>.
Github user falaki commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82888061
  
    --- Diff: R/pkg/R/column.R ---
    @@ -32,35 +34,65 @@ setOldClass("jobj")
     #' @export
     #' @note Column since 1.4.0
     setClass("Column",
    -         slots = list(jc = "jobj"))
    +         slots = list(jc = "jobj", df = "SparkDataFrameOrNull"))
     
     #' A set of operations working with SparkDataFrame columns
     #' @rdname columnfunctions
     #' @name columnfunctions
     NULL
    -
    -setMethod("initialize", "Column", function(.Object, jc) {
    +setMethod("initialize", "Column", function(.Object, jc, df) {
       .Object@jc <- jc
    +
    +  # Some Column objects don't have any referencing DataFrame. In such case, df will be NULL.
    +  if (missing(df)) {
    +    df <- NULL
    +  }
    +  .Object@df <- df
       .Object
     })
     
    +setMethod("show", signature = "Column", definition = function(object) {
    --- End diff --
    
    Maybe this can be implemented as `head` with default number of elements?
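
The suggestion above can be sketched in base R without Spark. This is only an illustration under stated assumptions: `ToyFrame`, `ToyFrameOrNull`, and `ToyColumn` are hypothetical stand-ins for `SparkDataFrame`, `SparkDataFrameOrNull`, and `Column`, and the default of 6 elements is assumed rather than taken from the PR. It shows a slot that may be NULL via a class union, and a `show` method that delegates to `head` when a parent frame is available.

```r
library(methods)

# Hypothetical stand-in for SparkDataFrame so the sketch runs without Spark.
setClass("ToyFrame", slots = list(values = "numeric"))

# A class union lets the df slot hold either a frame or NULL.
setClassUnion("ToyFrameOrNull", c("ToyFrame", "NULL"))

setClass("ToyColumn", slots = list(expr = "character", df = "ToyFrameOrNull"))

setMethod("initialize", "ToyColumn", function(.Object, expr, df) {
  .Object@expr <- expr
  # Some columns have no referencing frame; in that case df stays NULL.
  if (missing(df)) {
    df <- NULL
  }
  .Object@df <- df
  .Object
})

# head() returns the first n values of the parent frame (assumed default: 6).
setMethod("head", "ToyColumn", function(x, n = 6L, ...) {
  if (is.null(x@df)) {
    stop("column has no parent frame to collect from")
  }
  head(x@df@values, n)
})

# show() delegates to head() when possible, else prints the expression only.
setMethod("show", "ToyColumn", function(object) {
  if (is.null(object@df)) {
    cat("ToyColumn", object@expr, "\n")
  } else {
    print(head(object))
  }
})

bound <- new("ToyColumn", expr = "x", df = new("ToyFrame", values = as.numeric(1:10)))
unbound <- new("ToyColumn", expr = "rand()")
bound
unbound
```

Delegating `show` to `head` keeps a single code path for fetching data, so the two methods cannot drift apart; the unbound case still needs its own branch because there is nothing to collect.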


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @shivaram @falaki @felixcheung Any additional comments? Otherwise, are we ready to merge?


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188092590
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51846/
    Test FAILed.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-203665450
  
    Merged build finished. Test FAILed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66767 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66767/consoleFull)** for PR 11336 at commit [`ed0abf2`](https://github.com/apache/spark/commit/ed0abf24d7f65ad2381f6d664ba23e440013c97a).


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-195123603
  
    Yeah, I'm not sure what the right resolution is on this. On the one hand, I have to say that the Spark DataFrame model, where a column doesn't belong to one (or any) DataFrame, is more confusing from a user's perspective.
    
    However, the risk of adding wrappers that don't follow the underlying data model is that it might lead to inconsistencies where some parts of the API don't work like others.


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82908825
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -2876,7 +2897,8 @@ setMethod("randn", signature(seed = "missing"),
     setMethod("randn", signature(seed = "numeric"),
               function(seed) {
                 jc <- callJStatic("org.apache.spark.sql.functions", "randn", as.integer(seed))
    -            column(jc)
    +            df <- as.DataFrame(sparkRSQL.init(), data.frame(0))
    --- End diff --
    
    ditto


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66767/
    Test FAILed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66844/
    Test FAILed.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188056401
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51844/
    Test FAILed.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188198023
  
    **[Test build #51863 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51863/consoleFull)** for PR 11336 at commit [`24b6154`](https://github.com/apache/spark/commit/24b6154b127ca57e17fa79831b2fee1edcc38b1b).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188056400
  
    Merged build finished. Test FAILed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Just pointing out what might not be obvious for other observers.
    :)



[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-187988404
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51820/
    Test FAILed.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188198217
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51863/
    Test FAILed.




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188056381
  
    **[Test build #51844 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51844/consoleFull)** for PR 11336 at commit [`2d9ee18`](https://github.com/apache/spark/commit/2d9ee18dbdbcd25882a18e8fa0fa404d87c497c1).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-190867380
  
    SparkR doesn't support operations between columns from different DataFrame objects. Yet you can do:
    
    ```
    c1 <- df1$c1
    c2 <- df2$c2
    c3 <- c1 + c2
    ```
    c3 can't be used at all. See examples below:
    
    ```
    ## Create two DataFrames from Iris
    > irisDF <- createDataFrame(sqlContext, iris)
    > irisDF2 <- createDataFrame(sqlContext, iris)
    
    ## Create Column x by adding Columns from two different DataFrames
    > x <- irisDF$Sepal_Length + irisDF2$Sepal_Length
    
    ## You can't use Column x as a predicate
    > irisDF[x > 0, ]
    16/03/01 11:04:19 ERROR RBackendHandler: filter on 76 failed
    Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) : 
      org.apache.spark.sql.AnalysisException: resolved attribute(s) Sepal_Length#20 missing from Sepal_Length#15,Petal_Width#18,Sepal_Width#16,Petal_Length#17,Species#19 in operator !Filter ((Sepal_Length#15 + Sepal_Length#20) > 0.0);
    	
    ## You can't select Column x either
    > select(irisDF, x)
    16/03/01 11:04:43 ERROR RBackendHandler: select on 76 failed
    Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) : 
      org.apache.spark.sql.AnalysisException: resolved attribute(s) Sepal_Length#20 missing from Sepal_Length#15,Petal_Width#18,Sepal_Width#16,Petal_Length#17,Species#19 in operator !Project [(Sepal_Length#15 + Sepal_Length#20) AS (Sepal_Length + Sepal_Length)#25];
    
    > select(irisDF2, x)
    16/03/01 11:04:45 ERROR RBackendHandler: select on 91 failed
    Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) : 
      org.apache.spark.sql.AnalysisException: resolved attribute(s) Sepal_Length#15 missing from Sepal_Length#20,Sepal_Width#21,Species#24,Petal_Width#23,Petal_Length#22 in operator !Project [(Sepal_Length#15 + Sepal_Length#20) AS (Sepal_Length + Sepal_Length)#26];
    
    ```
    In my opinion, we should throw an error if the user is trying to operate on Columns coming from different DataFrames.
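
A minimal sketch of the check proposed above, written in plain R with a mock S4 class rather than SparkR's real `Column`. The class `MockColumn`, its `parent_id` slot, and `add_columns()` are hypothetical names used only for illustration. It assumes each column records which DataFrame it came from and refuses to combine columns whose parents differ:

```r
library(methods)

# Hypothetical stand-in for a Column that remembers its parent DataFrame.
setClass("MockColumn",
         representation(expr = "character", parent_id = "character"))

# Combine two columns, failing early when they come from different parents.
add_columns <- function(a, b) {
  if (!identical(a@parent_id, b@parent_id)) {
    stop("Cannot combine Columns from different DataFrames: ",
         a@parent_id, " vs ", b@parent_id)
  }
  new("MockColumn",
      expr = paste0("(", a@expr, " + ", b@expr, ")"),
      parent_id = a@parent_id)
}

c1 <- new("MockColumn", expr = "Sepal_Length", parent_id = "irisDF")
c2 <- new("MockColumn", expr = "Sepal_Width", parent_id = "irisDF")
c3 <- new("MockColumn", expr = "Sepal_Length", parent_id = "irisDF2")

# Same parent: the combined column is built.
same_parent <- add_columns(c1, c2)

# Different parents: the error is raised immediately instead of at analysis time.
cross_parent <- tryCatch(add_columns(c1, c3),
                         error = function(e) conditionMessage(e))
```

A real implementation would probably also need to handle Columns that have no parent DataFrame at all.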




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66905/
    Test FAILed.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] head() and show() for Colum...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r84169371
  
    --- Diff: R/pkg/R/column.R ---
    @@ -29,38 +31,66 @@ setOldClass("jobj")
     #' @rdname column
     #'
     #' @slot jc reference to JVM SparkDataFrame column
    +#' @slot df the parent SparkDataFrame of the Column object
     #' @export
     #' @note Column since 1.4.0
     setClass("Column",
    -         slots = list(jc = "jobj"))
    +         slots = list(jc = "jobj", df = "SparkDataFrameOrNull"))
     
     #' A set of operations working with SparkDataFrame columns
     #' @rdname columnfunctions
     #' @name columnfunctions
     NULL
    -
    -setMethod("initialize", "Column", function(.Object, jc) {
    +setMethod("initialize", "Column", function(.Object, jc, df) {
       .Object@jc <- jc
    +
    +  # Some Column objects don't have any referencing DataFrame. In such case, df will be NULL.
    +  if (missing(df)) {
    +    df <- NULL
    +  }
    +  .Object@df <- df
       .Object
     })
     
    +#' @rdname show
    +setMethod("show", signature = "Column", function(object) {
    +  MAX_ELEMENTS <- 20
    +  head.df <- head(object, MAX_ELEMENTS)
    +
    +  if (length(head.df) == 0) {
    +    colname <- callJMethod(object@jc, "toString")
    +    cat(paste0(colname, "\n"))
    +    cat(paste0("<Empty column>\n"))
    +  } else {
    +    show(head.df)
    +  }
    +  if (length(head.df) == MAX_ELEMENTS)  {
    +    cat(paste0("\b...\nDisplaying up to ", as.character(MAX_ELEMENTS), " elements only."))
    +  }
    +})
    +
    +#' @rdname head
    +setMethod("head",
    +          signature = "Column",
    +          function(x, num = 6L) {
    +            if (is.null(x@df)) {
    +              collect(x)
    --- End diff --
    
    so this means this here should be changed, or...?
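
For readers unfamiliar with the nullable `df` slot in the diff above, here is a rough sketch in plain R of how a slot can hold either an object or `NULL`. It assumes `SparkDataFrameOrNull` is an S4 class union, which the diff does not show. The names `MockDataFrame`, `MockDataFrameOrNull`, and `NullableColumn` are hypothetical stand-ins, not SparkR classes:

```r
library(methods)

# Hypothetical parent class and a union that also admits NULL.
setClass("MockDataFrame", representation(name = "character"))
setClassUnion("MockDataFrameOrNull", c("MockDataFrame", "NULL"))

# A column-like class whose parent slot may be empty.
setClass("NullableColumn",
         representation(expr = "character", df = "MockDataFrameOrNull"))

# Mirror the diff: when no parent is supplied, store NULL in the slot.
setMethod("initialize", "NullableColumn",
          function(.Object, expr = character(), df) {
            .Object@expr <- expr
            if (missing(df)) {
              df <- NULL
            }
            .Object@df <- df
            .Object
          })

orphan <- new("NullableColumn", expr = "rand()")
owned <- new("NullableColumn", expr = "Sepal_Length",
             df = new("MockDataFrame", name = "irisDF"))
```

Code that reads the slot can then branch on `is.null(x@df)`, as the `head` method in the diff does.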




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66844 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66844/consoleFull)** for PR 11336 at commit [`257fa86`](https://github.com/apache/spark/commit/257fa8669eb0520d4827c0bc3d52a07037bd38f8).




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #67351 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67351/consoleFull)** for PR 11336 at commit [`1338d71`](https://github.com/apache/spark/commit/1338d71a8f8f1387e08326851e15fcc84d5092f9).




[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by falaki <gi...@git.apache.org>.
Github user falaki commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82888215
  
    --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R ---
    @@ -2252,6 +2252,31 @@ test_that("Method str()", {
       expect_equal(capture.output(utils:::str(iris)), capture.output(str(iris)))
     })
     
    +test_that("collect/show/head on Columns", {
    --- End diff --
    
    I think if we remove `collect()` from this PR the test needs to be updated as well.




[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82908799
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -2847,7 +2860,11 @@ setMethod("rand", signature(seed = "missing"),
     setMethod("rand", signature(seed = "numeric"),
               function(seed) {
                 jc <- callJStatic("org.apache.spark.sql.functions", "rand", as.integer(seed))
    -            column(jc)
    +
    +            # By assigning a one-row data.frame, the result of this function can be collected
    +            # returning a one-element Column
    +            df <- as.DataFrame(sparkRSQL.init(), data.frame(0))
    --- End diff --
    
    ditto




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Folks?




[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] head() and show() for Colum...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r84150590
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -3321,3 +3328,11 @@ setMethod("randomSplit",
                 }
                 sapply(sdfs, dataFrame)
               })
    +
    +# A global singleton for an empty SparkR DataFrame.
    +getEmptySparkRDataFrame <- function() {
    +  if (is.null(.sparkREnv$EMPTY_DF)) {
    +    .sparkREnv$EMPTY_DF <- as.DataFrame(data.frame(0))
    +  }
    +  return(.sparkREnv$EMPTY_DF)
    --- End diff --
    
    get() would throw an error if the variable is not defined. I'll use exists()
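
The difference between `get()` and `exists()` mentioned above can be sketched in plain R as a lazily initialized singleton. This is only an illustration: `cache_env` stands in for SparkR's `.sparkREnv`, and `get_cached()` and `factory()` are hypothetical helpers, not part of the PR:

```r
# Hypothetical private environment standing in for .sparkREnv.
cache_env <- new.env()

# Return the cached value for `name`, creating it on first use.
get_cached <- function(name, make_value) {
  # exists() returns FALSE for an undefined name; get() would raise an error.
  if (!exists(name, envir = cache_env, inherits = FALSE)) {
    assign(name, make_value(), envir = cache_env)
  }
  get(name, envir = cache_env, inherits = FALSE)
}

# Count how often the (hypothetical) factory is actually invoked.
calls <- 0
factory <- function() {
  calls <<- calls + 1
  data.frame(x = 0)
}

first <- get_cached("EMPTY_DF", factory)
second <- get_cached("EMPTY_DF", factory)

# Calling get() directly on an undefined name fails, which is why the guard is needed.
missing_lookup <- tryCatch(get("NOT_DEFINED", envir = cache_env, inherits = FALSE),
                           error = function(e) "error")
```

Passing `inherits = FALSE` keeps the lookup from falling through to enclosing environments, so only the cache itself is consulted.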




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-203587396
  
    **[Test build #54546 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54546/consoleFull)** for PR 11336 at commit [`c86bebb`](https://github.com/apache/spark/commit/c86bebb9ce02e5623d328d33109fff1ee8eeca56).
     * This patch **fails R style tests**.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188006980
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51832/
    Test FAILed.




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-217583676
  
    **[Test build #58038 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58038/consoleFull)** for PR 11336 at commit [`9c1661f`](https://github.com/apache/spark/commit/9c1661f0fe892fd3dcd40e8e489a940c9b129747).




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by falaki <gi...@git.apache.org>.
Github user falaki commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    I did another pass. It looks good to me.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @falaki @shivaram Shall we merge this?




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-190996596
  
    @olarayej, 
    c3 can be used on a DataFrame that is joined between df1 & df2
    df3 <- join(df1, df2)





[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188367966
  
    Looks like the last one was just a flaky test in mllib. Let's try again.
    
    Jenkins, retest this please




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-218100918
  
    This tends to use a Column as a distributed vector :)  I don't object to adding R-like features to SparkR, but I am still not sure about this feature. I am neutral and will leave it to the other committers' vote.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67536/
    Test PASSed.




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-203664895
  
    **[Test build #54557 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54557/consoleFull)** for PR 11336 at commit [`bf6c456`](https://github.com/apache/spark/commit/bf6c45601a83d602853cce359d1d040cd36b1d8d).




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66832/
    Test FAILed.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @wangmiao1981 These are required for the recently added CRAN checks, to ensure we can release SparkR as an R package.
    1) - note in-line - they need to have `#' @rdname`
    2) - I'm not sure about "Column-class"; we should check the generated .Rd file
    As for "head", I think it's because there is an existing generic, and that has `...` in its signature.
    Did it work if you skip the generic here, `setGeneric("head")`? https://github.com/apache/spark/pull/11336/files#diff-8e3d61ff66c9ffcd6ffb7a8eedc08409R555
    If not, we need to match the existing generic: https://stat.ethz.ch/R-manual/R-devel/library/utils/html/head.html
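
As a rough illustration of matching the existing generic, the sketch below defines an S4 `head` method in plain R whose formals keep `x` and `...` from `utils::head`. The class `MockVec` is a hypothetical stand-in for `Column`; this is not the code from the PR:

```r
library(methods)
library(utils)

# Hypothetical vector-like class used only to demonstrate dispatch.
setClass("MockVec", representation(values = "numeric"))

# utils::head is declared as function(x, ...), so the method keeps both
# arguments and adds `n` between them.
setMethod("head", "MockVec", function(x, n = 6L, ...) {
  utils::head(x@values, n)
})

v <- new("MockVec", values = as.numeric(1:10))
default_head <- head(v)
short_head <- head(v, 3)
```

Because the generic already has `...`, extra arguments such as `n` can be added in the method without redefining the generic via `setGeneric`.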





[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66767 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66767/consoleFull)** for PR 11336 at commit [`ed0abf2`](https://github.com/apache/spark/commit/ed0abf24d7f65ad2381f6d664ba23e440013c97a).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66973 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66973/consoleFull)** for PR 11336 at commit [`e0bba0a`](https://github.com/apache/spark/commit/e0bba0a0655f750751a6b9b05a7f6cf5eb807e05).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    I know why tests fail - please see my comment.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-187988403
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Sorry this got dropped from my radar as I was caught up with some other stuff. I am more free for the next 2-3 days. Will review this




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188100679
  
    **[Test build #51848 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51848/consoleFull)** for PR 11336 at commit [`1e06d3c`](https://github.com/apache/spark/commit/1e06d3ccbb0cc441b9929f626812b8bc7dcbbda3).




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188138539
  
    @sun-rui @shivaram Do you know which versions of R and of SparkR's dependencies are being used in Jenkins? The tests run fine in my environment (I have reviewed my code and run the unit tests many times). I wonder whether the failure is due to a different version of R, testthat, devtools, rJava, etc.




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-209030229
  
    @shivaram @falaki @felixcheung @rxin @sun-rui Any thoughts on this?




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-187988332
  
    **[Test build #51820 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51820/consoleFull)** for PR 11336 at commit [`fbf9b02`](https://github.com/apache/spark/commit/fbf9b02b478b8eb4845232e09932d068cb393fd8).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66844 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66844/consoleFull)** for PR 11336 at commit [`257fa86`](https://github.com/apache/spark/commit/257fa8669eb0520d4827c0bc3d52a07037bd38f8).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-203585154
  
    **[Test build #54546 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54546/consoleFull)** for PR 11336 at commit [`c86bebb`](https://github.com/apache/spark/commit/c86bebb9ce02e5623d328d33109fff1ee8eeca56).




[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82908731
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -2836,7 +2845,11 @@ setMethod("lpad", signature(x = "Column", len = "numeric", pad = "character"),
     setMethod("rand", signature(seed = "missing"),
               function(seed) {
                 jc <- callJStatic("org.apache.spark.sql.functions", "rand")
    -            column(jc)
    +
    +            # By assigning a one-row data.frame, the result of this function can be collected
    +            # returning a one-element Column
    +            df <- as.DataFrame(sparkRSQL.init(), data.frame(0))
    --- End diff --
    
    I think this is why the test fails - do not use sparkRSQL.init()




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @felixcheung I'm done! Thanks for your comments!




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66834 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66834/consoleFull)** for PR 11336 at commit [`266d5ff`](https://github.com/apache/spark/commit/266d5ffd87929adf081944de19ee36f892125977).




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    one thing to highlight, previously we discussed a proposal to display a subset of values for a variable, such as when typing in the variable name in R shell or RStudio
    
    ```
    > a <- iris
    > a
        Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
    1            5.1         3.5          1.4         0.2     setosa
    2            4.9         3.0          1.4         0.2     setosa
    3            4.7         3.2          1.3         0.2     setosa
    ```
    
    This PR includes a change to how Column is presented in this case,
    
    before
    ```
    > a <- as.DataFrame(iris)
    > a
    SparkDataFrame[Sepal_Length:double, Sepal_Width:double, Petal_Length:double, Petal_Width:double, Species:string]
    > a$Sepal_Length
    Column Sepal_Length
    ```
    
    after
    ```
    > a$Sepal_Length
      [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1
     [19] 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0
    ```
    
    So instead of type and name we would be doing a collect.
    
    In the earlier discussion, it was concluded that we would not do this without an explicit method call (like `head(col)`), except perhaps when the data frame is local.





[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-191003387
  
    @sun-rui Yes. In that case, c3 will be only associated to df3.




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-217586684
  
    I just got this branch up to date. Any comments, folks? @shivaram @falaki @felixcheung @rxin @sun-rui @mengxr 




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-187981425
  
    **[Test build #51820 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51820/consoleFull)** for PR 11336 at commit [`fbf9b02`](https://github.com/apache/spark/commit/fbf9b02b478b8eb4845232e09932d068cb393fd8).




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #67348 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67348/consoleFull)** for PR 11336 at commit [`ed1b382`](https://github.com/apache/spark/commit/ed1b382524265340d0b692a89c6be388eddb549a).
     * This patch **fails R style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @felixcheung When a user types a variable name on the R shell, it triggers method showDefault() which, in turn, invokes show(). I wrote an implementation of show() for Column which, in turn, invokes head() (not collect), showing the first 20 elements of the dataset. This mimics R behavior and I think it also helps with usability. However, if the agreement is not to have that, I can just remove show method.
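    
    The dispatch chain described above can be sketched in plain R without a Spark session. This is only an illustration under stated assumptions: `ToyColumn` is a hypothetical stand-in that holds its values locally, not SparkR's `Column`, and the limit of 20 simply mirrors the number mentioned above.
    
    ```r
    library(methods)
    
    # Hypothetical stand-in for a Column; the values live in local memory here.
    setClass("ToyColumn", representation(name = "character", values = "numeric"))
    
    # Turn utils::head into an S4 generic so a ToyColumn method can be attached.
    setGeneric("head")
    
    # head() returns at most n values (default 6, as for data frames).
    setMethod("head", "ToyColumn", function(x, n = 6L, ...) {
      head(x@values, n = n)
    })
    
    # show() is what auto-printing calls for S4 objects; it delegates to head()
    # so that only a bounded prefix is ever printed.
    setMethod("show", "ToyColumn", function(object) {
      print(head(object, 20L))
    })
    
    col <- new("ToyColumn", name = "Sepal_Length", values = as.numeric(1:100))
    
    # Typing `col` at the prompt would call show(); capture it explicitly here.
    shown <- capture.output(show(col))
    ```
    
    Because show() goes through head() with a fixed limit, the amount of data printed stays bounded regardless of how many values the column holds.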




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #67220 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67220/consoleFull)** for PR 11336 at commit [`76061ad`](https://github.com/apache/spark/commit/76061ad8320c0ab1a52da1d0a7ea96e2db5d8a68).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-187996241
  
    Jenkins, retest please. All tests pass for me after checking out this branch.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by falaki <gi...@git.apache.org>.
Github user falaki commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    I think some of the old tests still rely on `sparkRSQL.init()`. I believe the warning is OK.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66753 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66753/consoleFull)** for PR 11336 at commit [`8f906a2`](https://github.com/apache/spark/commit/8f906a2e8050c391355c3ddab811d990020728cb).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-193573753
  
    I personally find it confusing having to reason about when we can "head"/"collect"/"show" and when we cannot, and that's why the Scala/Python version of the API didn't have this feature.





[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r60285650
  
    --- Diff: R/pkg/R/column.R ---
    @@ -14,6 +14,8 @@
     # See the License for the specific language governing permissions and
     # limitations under the License.
     #
    +setOldClass("DataFrame")
    --- End diff --
    
    Is this necessary - setClass has been called on DataFrame?




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188481153
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51904/
    Test PASSed.




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-203665451
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54557/
    Test FAILed.




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-190526619
  
    @olarayej, I am not sure it is conceptually correct to associate a Column with only one DataFrame. Conceptually, a Column could depend on 0, 1, 2 or more DataFrames. For example:
    c1 <- df1$c1
    c2 <- df2$c2
    c3 <- c1 + c2
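    
    To make the concern concrete, here is a minimal sketch in plain R. It assumes a hypothetical `ToyExpr` class (not SparkR's `Column`) that records, by name only, which data frames an expression refers to.
    
    ```r
    library(methods)
    
    # Hypothetical toy class: an expression plus the names of the data frames
    # it depends on. A literal would have parents = character(0).
    setClass("ToyExpr", representation(expr = "character", parents = "character"))
    
    # Adding two expressions yields one that depends on the union of their parents.
    setMethod("+", signature("ToyExpr", "ToyExpr"), function(e1, e2) {
      new("ToyExpr",
          expr = paste0("(", e1@expr, " + ", e2@expr, ")"),
          parents = union(e1@parents, e2@parents))
    })
    
    c1 <- new("ToyExpr", expr = "df1.c1", parents = "df1")
    c2 <- new("ToyExpr", expr = "df2.c2", parents = "df2")
    c3 <- c1 + c2
    
    # c3 now refers to both df1 and df2, which a single `df` slot cannot record.
    c3@parents
    ```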





[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    do you have tests for the case where the df is null, like https://github.com/apache/spark/pull/11336/files#diff-04c14efaae2b7b0f0a45038482f2590cR77?
    There are several other cases too.





[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66921/
    Test FAILed.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @falaki Sorry, I was out of town. Let me get back to this today. Thank you!




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66973/
    Test PASSed.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by falaki <gi...@git.apache.org>.
Github user falaki commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @olarayej any update on this? If you are busy I can start another PR from yours.




[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82908439
  
    --- Diff: R/pkg/R/column.R ---
    @@ -32,35 +34,65 @@ setOldClass("jobj")
     #' @export
     #' @note Column since 1.4.0
     setClass("Column",
    -         slots = list(jc = "jobj"))
    +         slots = list(jc = "jobj", df = "SparkDataFrameOrNull"))
     
     #' A set of operations working with SparkDataFrame columns
     #' @rdname columnfunctions
     #' @name columnfunctions
     NULL
    -
    -setMethod("initialize", "Column", function(.Object, jc) {
    +setMethod("initialize", "Column", function(.Object, jc, df) {
       .Object@jc <- jc
    +
    +  # Some Column objects don't have any referencing DataFrame. In such case, df will be NULL.
    +  if (missing(df)) {
    +    df <- NULL
    +  }
    +  .Object@df <- df
       .Object
     })
     
    +setMethod("show", signature = "Column", definition = function(object) {
    --- End diff --
    
    +1, default to 6 for consistency?




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-203672561
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] head() and show() for Colum...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r84189412
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -3321,3 +3328,11 @@ setMethod("randomSplit",
                 }
                 sapply(sdfs, dataFrame)
               })
    +
    +# A global singleton for an empty SparkR DataFrame.
    +getEmptySparkRDataFrame <- function() {
    +  if (is.null(.sparkREnv$EMPTY_DF)) {
    +    .sparkREnv$EMPTY_DF <- as.DataFrame(data.frame(0))
    +  }
    +  return(.sparkREnv$EMPTY_DF)
    --- End diff --
    
    Sure, there are different existing styles on that.
    
    I was referring to the naming though.
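
The lazily initialized singleton pattern in the quoted diff can be sketched in plain R without Spark. This is only an illustration under stated assumptions: `.cacheEnv`, `EMPTY_DF`, and `getPlaceholderDataFrame` are hypothetical names, and a local `data.frame` stands in for the SparkR DataFrame. Note that `data.frame(0)` has one row and one column, so the cached object is a placeholder rather than a truly empty data frame.

```r
# Private environment used as a cache; `.cacheEnv` is a hypothetical name
# standing in for SparkR's internal `.sparkREnv`.
.cacheEnv <- new.env(parent = emptyenv())

# Build the placeholder data.frame on first use, then return the cached copy.
getPlaceholderDataFrame <- function() {
  if (is.null(.cacheEnv$EMPTY_DF)) {
    .cacheEnv$EMPTY_DF <- data.frame(0)
  }
  .cacheEnv$EMPTY_DF
}

first <- getPlaceholderDataFrame()
second <- getPlaceholderDataFrame()  # served from the cache
dim(first)                           # one row, one column
```

Caching in an environment avoids rebuilding the object on every call, at the cost of holding a reference for the lifetime of the session.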





[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] head() and show() for Colum...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r84169566
  
    --- Diff: R/pkg/R/column.R ---
    @@ -29,38 +31,66 @@ setOldClass("jobj")
     #' @rdname column
     #'
     #' @slot jc reference to JVM SparkDataFrame column
    +#' @slot df the parent SparkDataFrame of the Column object
     #' @export
     #' @note Column since 1.4.0
     setClass("Column",
    -         slots = list(jc = "jobj"))
    +         slots = list(jc = "jobj", df = "SparkDataFrameOrNull"))
     
     #' A set of operations working with SparkDataFrame columns
     #' @rdname columnfunctions
     #' @name columnfunctions
     NULL
    -
    -setMethod("initialize", "Column", function(.Object, jc) {
    +setMethod("initialize", "Column", function(.Object, jc, df) {
       .Object@jc <- jc
    +
    +  # Some Column objects don't have any referencing DataFrame. In such case, df will be NULL.
    +  if (missing(df)) {
    +    df <- NULL
    +  }
    +  .Object@df <- df
       .Object
     })
     
    +#' @rdname show
    +setMethod("show", signature = "Column", function(object) {
    +  MAX_ELEMENTS <- 20
    +  head.df <- head(object, MAX_ELEMENTS)
    +
    +  if (length(head.df) == 0) {
    +    colname <- callJMethod(object@jc, "toString")
    +    cat(paste0(colname, "\n"))
    +    cat(paste0("<Empty column>\n"))
    +  } else {
    +    show(head.df)
    +  }
    +  if (length(head.df) == MAX_ELEMENTS)  {
    +    cat(paste0("\b...\nDisplaying up to ", as.character(MAX_ELEMENTS), " elements only."))
    +  }
    +})
    +
    +#' @rdname head
    +setMethod("head",
    +          signature = "Column",
    --- End diff --
    
    Also, this form is a bit implicit. Could you update these signatures to name the dispatch argument, something like `signature(x = "jobj")`?




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188052587
  
    **[Test build #51844 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51844/consoleFull)** for PR 11336 at commit [`2d9ee18`](https://github.com/apache/spark/commit/2d9ee18dbdbcd25882a18e8fa0fa404d87c497c1).




[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r83353715
  
    --- Diff: R/pkg/R/column.R ---
    @@ -70,11 +70,11 @@ setMethod("show", signature = "Column", function(object) {
     })
     
     #' @rdname head
    -setMethod("head", signature = "Column", definition = function(x, n=6) {
    +setMethod("head", signature = "Column", definition = function(x, num = 6) {
    --- End diff --
    
    As mentioned, could you update the format to match all the other functions, like this:
    ```
     setMethod("head",
               signature(x = "SparkDataFrame"),
               function(x, num = 6L) {
    ```
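
A minimal base-R sketch of the requested form, runnable without Spark. It assumes a toy `ToyColumn` class with a hypothetical `values` slot in place of SparkR's `Column`, and it uses `n` (the argument name of `utils::head`) where SparkR uses `num`.

```r
library(methods)

# Toy stand-in for SparkR's Column class; the `values` slot is hypothetical.
setClass("ToyColumn", slots = list(values = "numeric"))

# Promote utils::head to an S4 generic so methods can be registered on it.
setGeneric("head")

# Explicit form: the dispatch argument is named in the signature, and the
# default count is an integer literal.
setMethod("head",
          signature(x = "ToyColumn"),
          function(x, n = 6L, ...) {
            utils::head(x@values, n)
          })

col <- new("ToyColumn", values = c(10, 20, 30, 40, 50, 60, 70, 80))
firstSix <- head(col)      # default of six elements
firstTwo <- head(col, 2L)  # explicit count
```

Naming the argument in `signature(x = ...)` makes it clear which formal the method dispatches on, which matters more once a generic has several arguments.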




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188481149
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66905 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66905/consoleFull)** for PR 11336 at commit [`0691c32`](https://github.com/apache/spark/commit/0691c32cacc3218742c7f345b4bc498ba2f826e5).




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66753/
    Test FAILed.




[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r83927201
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1166,26 +1166,33 @@ setMethod("take",
                 collect(limited)
               })
     
    -#' Head
    -#'
    -#' Return the first \code{num} rows of a SparkDataFrame as a R data.frame. If \code{num} is not
    -#' specified, then head() returns the first 6 rows as with R data.frame.
    -#'
    -#' @param x a SparkDataFrame.
    +#' Return the first part of a SparkDataFrame or Column
    +#' 
    +#' If \code{x} is a SparkDataFrame, its first 
    +#' rows will be returned as a data.frame. If the dataset is a \code{Column}, its first 
    +#' elements will be returned as a vector. The number of elements to be returned
    +#' is given by parameter \code{num}. Default value for \code{num} is 6.
    +#' @param x a SparkDataFrame or Column
     #' @param num the number of rows to return. Default is 6.
     #' @return A data.frame.
     #'
     #' @family SparkDataFrame functions
     #' @aliases head,SparkDataFrame-method
    -#' @rdname head
     #' @name head
     #' @export
     #' @examples
     #'\dontrun{
    +#' # Initialize Spark context and SQL context
    --- End diff --
    
    I don't think we need this line. Also, we are using "Spark Session" as the terminology now.




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-203672431
  
    **[Test build #54559 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54559/consoleFull)** for PR 11336 at commit [`e5659ee`](https://github.com/apache/spark/commit/e5659ee85391937cdf301bd1acc01487bc55c129).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66968 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66968/consoleFull)** for PR 11336 at commit [`2bfb8a6`](https://github.com/apache/spark/commit/2bfb8a62adf81e6953e398147e5f8d9122bc47ef).




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Happy New Year, folks! Any updates on this? @shivaram @falaki




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #67536 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67536/consoleFull)** for PR 11336 at commit [`619f23b`](https://github.com/apache/spark/commit/619f23b367b94a47f3f63dea008ab4d674218b34).




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188105691
  
    **[Test build #51848 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51848/consoleFull)** for PR 11336 at commit [`1e06d3c`](https://github.com/apache/spark/commit/1e06d3ccbb0cc441b9929f626812b8bc7dcbbda3).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r83154700
  
    --- Diff: R/pkg/R/column.R ---
    @@ -29,38 +31,61 @@ setOldClass("jobj")
     #' @rdname column
     #'
     #' @slot jc reference to JVM SparkDataFrame column
    +#' @slot df the parent SparkDataFrame of the Column object
     #' @export
     #' @note Column since 1.4.0
     setClass("Column",
    -         slots = list(jc = "jobj"))
    +         slots = list(jc = "jobj", df = "SparkDataFrameOrNull"))
     
     #' A set of operations working with SparkDataFrame columns
     #' @rdname columnfunctions
     #' @name columnfunctions
     NULL
    -
    -setMethod("initialize", "Column", function(.Object, jc) {
    +setMethod("initialize", "Column", function(.Object, jc, df) {
       .Object@jc <- jc
    +
    +  # Some Column objects don't have any referencing DataFrame. In such case, df will be NULL.
    +  if (missing(df)) {
    +    df <- NULL
    +  }
    +  .Object@df <- df
       .Object
     })
     
    +setMethod("show", signature = "Column", function(object) {
    +  MAX_ELEMENTS <- 20
    +  head.df <- head(object, MAX_ELEMENTS)
    +
    +  if (length(head.df) == 0) {
    +    colname <- callJMethod(object@jc, "toString")
    +    cat(paste0(colname, "\n"))
    +    cat(paste0("<Empty column>\n"))
    +  } else {
    +    show(head.df)
    +  }
    +  if (length(head.df) == MAX_ELEMENTS)  {
    +    cat(paste0("\b...\nDisplaying up to ", as.character(MAX_ELEMENTS), " elements only."))
    +  }
    +})
    +
    +setMethod("head", signature = "Column", definition = function(x, n=6) {
    --- End diff --
    
    ditto here, `#' @rdname head`




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66921 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66921/consoleFull)** for PR 11336 at commit [`1ace2e5`](https://github.com/apache/spark/commit/1ace2e5785030df790b3565d2a0b24358229d655).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67348/
    Test FAILed.




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188365387
  
    It was changed to 3.1.1 actually but there is a request to change to 3.1.2 hence the JIRA was reopened.
    
    On Wed, Feb 24, 2016 at 3:43 AM -0800, "sun-rui" <no...@github.com> wrote:
    
    The R version used by Jenkins seems to be 3.2.x (can't remember clearly). There is a JIRA requesting using R 3.1.1, https://issues.apache.org/jira/browse/SPARK-11255, but it is not fixed.
    
    ---
    Reply to this email directly or view it on GitHub:
    https://github.com/apache/spark/pull/11336#issuecomment-188214706





[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-192400350
  
    @sun-rui Does that make sense to you? @shivaram @felixcheung Any comments?




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-193561063
  
    A column can be applied to different DataFrames.
    For example, if both df1 and df2 have a column named "col", then
    col <- column("col")
    collect(select(df1, col))
    collect(select(df2, col))
    both work.
    
    Take the join case above as an example:
    you can have different DataFrames resulting from different joins on df1 and df2,
    and applying c3 to each of those resulting DataFrames also works.
    
    So how do you know which DataFrame to associate with a column in such cases?
    
    @rxin, any comments on this issue?
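
The ambiguity can be illustrated with a base-R analogy that runs without Spark. Everything here is an assumption made for illustration: `ColRef`, `colRef`, `selectFrom`, and `collectRef` are hypothetical names, local `data.frame` objects stand in for SparkR DataFrames, and the class union only loosely mirrors the `df = "SparkDataFrameOrNull"` slot from the PR's diff.

```r
library(methods)

# A column reference optionally bound to a parent data.frame (NULL = unbound).
setClassUnion("DataFrameOrNull", c("data.frame", "NULL"))
setClass("ColRef", slots = list(name = "character", df = "DataFrameOrNull"))

colRef <- function(name, df = NULL) new("ColRef", name = name, df = df)

# Resolve a reference against an explicitly supplied data.frame.
selectFrom <- function(df, ref) df[[ref@name]]

# Resolve a reference against its own parent; only possible when it is bound.
collectRef <- function(ref) {
  if (is.null(ref@df)) {
    stop("column '", ref@name, "' is not bound to a data.frame")
  }
  ref@df[[ref@name]]
}

df1 <- data.frame(col = 1:3)
df2 <- data.frame(col = c(10, 20))

unbound <- colRef("col")
fromDf1 <- selectFrom(df1, unbound)        # resolves against df1
fromDf2 <- selectFrom(df2, unbound)        # same reference, resolves against df2
bound <- collectRef(colRef("col", df1))    # bound: the parent is unambiguous
```

In this sketch an unbound reference is valid for any data frame that has the named column, so it cannot be collected on its own; only a reference created with an explicit parent can.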




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188471076
  
    **[Test build #51904 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51904/consoleFull)** for PR 11336 at commit [`6a38a3c`](https://github.com/apache/spark/commit/6a38a3ca79087538dd7d3452a3cb487f9b0b6dd2).




[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82911244
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1035,10 +1035,16 @@ setMethod("dim",
                 c(count(x), ncol(x))
               })
     
    -#' Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.
    +#' Download Spark datasets into R
    --- End diff --
    
    Sure. Thanks!




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by aloknsingh <gi...@git.apache.org>.
Github user aloknsingh commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-194548514
  
    Hi @rxin,
    In general, R has more features than Python and other languages as far as statistics and ML are concerned. So it would be nice to take a more R-like approach rather than pursue the global goal of compatibility across all supported languages. As long as JavaDataFrame is compatible across all the languages, one can always add more functions to the language-specific DataFrame, i.e. in this case R_DataFrame. Even though these features are not the main goal of SparkR, I think R users will find them more R-like and appreciate them. Just some thoughts :)





[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188105847
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51848/
    Test FAILed.




[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @felixcheung @falaki I have addressed all your comments. Shall we merge?




[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-189053702
  
    @felixcheung It wasn't an R issue after all. The problem was that I hadn't been able to rebuild Spark in the last couple days due to SPARK-13431, and I needed changes on SPARK-12799. Now that it’s fixed, everything runs fine on R 3.2.2. Thank you!


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82908836
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -3026,7 +3048,11 @@ setMethod("translate",
     setMethod("unix_timestamp", signature(x = "missing", format = "missing"),
               function(x, format) {
                 jc <- callJStatic("org.apache.spark.sql.functions", "unix_timestamp")
    -            column(jc)
    +
    +            # By assigning a one-row data.frame, the result of this function can be collected
    +            # returning a one-element Column
    +            df <- as.DataFrame(sparkRSQL.init(), data.frame(0))
    --- End diff --
    
    ditto


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188092501
  
    **[Test build #51846 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51846/consoleFull)** for PR 11336 at commit [`dc3df19`](https://github.com/apache/spark/commit/dc3df1926bb3931459a0ea381e5695b7bb56ca15).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188135114
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51860/
    Test FAILed.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by falaki <gi...@git.apache.org>.
Github user falaki commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-187988882
  
    Nice! thanks for doing this.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @falaki Thanks for your comments. Yes, before removing collect/show, I just wanted to rebase onto the current upstream master. The build now fails on what is actually a warning, not even an R error:
    
    ```
    (converted from warning) 'sparkRSQL.init' is deprecated.
    Use 'sparkR.session' instead.
    ```
    
    I don't explicitly call sparkRSQL.init anywhere in my code, so I'm investigating. Any suggestions would be appreciated. Thanks!
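
    The "(converted from warning)" prefix is what R prints when a warning is promoted to an error, for instance under `options(warn = 2)`. A minimal sketch of that mechanism in plain R follows; `oldInit` and `newInit` are hypothetical stand-ins for the deprecated function and its replacement, and whether the SparkR build promotes warnings through exactly this option is an assumption:

    ```r
    # Hypothetical stand-ins for a deprecated function and its replacement.
    newInit <- function() "session"
    oldInit <- function() {
      .Deprecated("newInit")  # signals a deprecation warning naming 'oldInit'
      newInit()
    }

    # With warn = 2, R turns every warning into an error.
    old <- options(warn = 2)
    msg <- tryCatch({
      oldInit()
      "no error"
    }, error = function(e) conditionMessage(e))
    options(old)  # restore the previous warning level

    cat(msg, "\n")
    ```

    Because any call path that reaches the deprecated function raises the error, an indirect call (for example from a helper or a documentation example) would fail the build even when the changed code never names the function itself.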


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82911261
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1049,11 +1055,16 @@ setMethod("dim",
     #' @export
     #' @examples
     #'\dontrun{
    -#' sparkR.session()
    -#' path <- "path/to/file.json"
    -#' df <- read.json(path)
    -#' collected <- collect(df)
    -#' firstName <- collected[[1]]$name
    +#' # Initialize Spark context and SQL context
    +#' sc <- sparkR.init()
    +#' sqlContext <- sparkRSQL.init(sc)
    --- End diff --
    
    Sure. Thanks!


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66834/
    Test FAILed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @shivaram @felixcheung @falaki Any thoughts?


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by falaki <gi...@git.apache.org>.
Github user falaki commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82887849
  
    --- Diff: R/pkg/R/column.R ---
    @@ -32,35 +34,65 @@ setOldClass("jobj")
     #' @export
     #' @note Column since 1.4.0
     setClass("Column",
    -         slots = list(jc = "jobj"))
    +         slots = list(jc = "jobj", df = "SparkDataFrameOrNull"))
     
     #' A set of operations working with SparkDataFrame columns
     #' @rdname columnfunctions
     #' @name columnfunctions
     NULL
    -
    -setMethod("initialize", "Column", function(.Object, jc) {
    +setMethod("initialize", "Column", function(.Object, jc, df) {
       .Object@jc <- jc
    +
    +  # Some Column objects don't have any referencing DataFrame. In such case, df will be NULL.
    +  if (missing(df)) {
    +    df <- NULL
    +  }
    +  .Object@df <- df
       .Object
     })
     
    +setMethod("show", signature = "Column", definition = function(object) {
    +  MAX_ELEMENTS <- 20
    +  head.df <- head(object, MAX_ELEMENTS)
    +
    +  if (length(head.df) == 0) {
    +    colname <- callJMethod(object@jc, "toString")
    +    cat(paste0(colname, "\n"))
    +    cat(paste0("<Empty column>\n"))
    +  } else {
    +    show(head.df)
    +  }
    +  if (length(head.df) == MAX_ELEMENTS)  {
    +    cat(paste0("\b...\nDisplaying up to ", as.character(MAX_ELEMENTS), " elements only."))
    +  }
    +})
    +
    +setMethod("collect", signature = "Column", definition = function(x) {
    --- End diff --
    
    It seems the Spark committers are still not sure about this feature. Maybe we can remove it for now and, once the discussion is resolved, work on it in a separate PR?
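
    For readers following the diff above, the optional parent slot can be sketched in plain R without Spark. This is only an illustration under stated assumptions: `DemoColumn` and `DemoFrameOrNull` are hypothetical names, a base `data.frame` stands in for the parent SparkDataFrame, and the `show` logic is simplified relative to the patch:

    ```r
    library(methods)

    # Hypothetical stand-ins; a plain data.frame plays the role of the parent frame.
    setClassUnion("DemoFrameOrNull", c("data.frame", "NULL"))
    setClass("DemoColumn", slots = list(name = "character", df = "DemoFrameOrNull"))

    setMethod("initialize", "DemoColumn", function(.Object, name, df) {
      .Object@name <- name
      # A column created without a parent keeps df = NULL.
      if (missing(df)) {
        df <- NULL
      }
      .Object@df <- df
      .Object
    })

    setMethod("show", "DemoColumn", function(object) {
      maxElements <- 20
      if (is.null(object@df)) {
        cat(object@name, "\n<Empty column>\n", sep = "")
        return(invisible(NULL))
      }
      values <- utils::head(object@df[[object@name]], maxElements)
      print(values)
      # Tell the reader when the output may have been truncated.
      if (length(values) == maxElements) {
        cat("...\nDisplaying up to ", maxElements, " elements only.\n", sep = "")
      }
    })

    orphan <- new("DemoColumn", name = "x")
    bound <- new("DemoColumn", name = "speed", df = cars)
    ```

    Printing `bound` shows the first 20 speeds followed by the truncation notice, while `orphan` has no parent to read from and prints only its name and the empty marker.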


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66905 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66905/consoleFull)** for PR 11336 at commit [`0691c32`](https://github.com/apache/spark/commit/0691c32cacc3218742c7f345b4bc498ba2f826e5).
     * This patch **fails R style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188480948
  
    **[Test build #51904 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51904/consoleFull)** for PR 11336 at commit [`6a38a3c`](https://github.com/apache/spark/commit/6a38a3ca79087538dd7d3452a3cb487f9b0b6dd2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r83926993
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1166,26 +1166,33 @@ setMethod("take",
                 collect(limited)
               })
     
    -#' Head
    -#'
    -#' Return the first \code{num} rows of a SparkDataFrame as a R data.frame. If \code{num} is not
    -#' specified, then head() returns the first 6 rows as with R data.frame.
    -#'
    -#' @param x a SparkDataFrame.
    +#' Return the first part of a SparkDataFrame or Column
    +#' 
    +#' If \code{x} is a SparkDataFrame, its first 
    +#' rows will be returned as a data.frame. If the dataset is a \code{Column}, its first 
    +#' elements will be returned as a vector. The number of elements to be returned
    +#' is given by parameter \code{num}. Default value for \code{num} is 6.
    +#' @param x a SparkDataFrame or Column
     #' @param num the number of rows to return. Default is 6.
     #' @return A data.frame.
     #'
     #' @family SparkDataFrame functions
     #' @aliases head,SparkDataFrame-method
    -#' @rdname head
    --- End diff --
    
    why is this removed?


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r83927478
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1166,26 +1166,33 @@ setMethod("take",
                 collect(limited)
               })
     
    -#' Head
    -#'
    -#' Return the first \code{num} rows of a SparkDataFrame as a R data.frame. If \code{num} is not
    -#' specified, then head() returns the first 6 rows as with R data.frame.
    -#'
    -#' @param x a SparkDataFrame.
    +#' Return the first part of a SparkDataFrame or Column
    +#' 
    +#' If \code{x} is a SparkDataFrame, its first 
    +#' rows will be returned as a data.frame. If the dataset is a \code{Column}, its first 
    +#' elements will be returned as a vector. The number of elements to be returned
    +#' is given by parameter \code{num}. Default value for \code{num} is 6.
    +#' @param x a SparkDataFrame or Column
     #' @param num the number of rows to return. Default is 6.
     #' @return A data.frame.
     #'
     #' @family SparkDataFrame functions
     #' @aliases head,SparkDataFrame-method
    -#' @rdname head
     #' @name head
     #' @export
     #' @examples
     #'\dontrun{
    +#' # Initialize Spark context and SQL context
     #' sparkR.session()
    -#' path <- "path/to/file.json"
    -#' df <- read.json(path)
    -#' head(df)
    +#' 
    +#' # Create a DataFrame from the Iris dataset
    +#' irisDF <- as.DataFrame(iris)
    --- End diff --
    
    You might want to avoid using `iris` in the example: its column names contain `.`, which causes a warning, and if we later add support for `.` in column names, the example will need to be updated.
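
    To make the concern concrete, here is a small plain-R sketch (no Spark required) that checks which built-in datasets have dots in their column names and shows one way to rename them beforehand. The helper name `hasDots` and the choice of `faithful` as an alternative dataset are illustrative assumptions, not part of the PR:

    ```r
    # Hypothetical helper: TRUE if any column name contains a literal dot.
    hasDots <- function(df) any(grepl(".", names(df), fixed = TRUE))

    hasDots(iris)      # names such as Sepal.Length contain dots
    hasDots(faithful)  # names are "eruptions" and "waiting"

    # Alternatively, replace the dots before handing the data to Spark.
    cleanNames <- gsub(".", "_", names(iris), fixed = TRUE)
    ```

    Picking a dataset whose names are already dot-free keeps the example independent of how dots in column names are handled.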


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @falaki @felixcheung I have addressed all your comments. I'm getting two documentation warnings which seem to be making the build fail:
    
    1)
    ```
    Undocumented S4 methods:
      generic 'head' and siglist 'Column'
      generic 'show' and siglist 'Column'
    ```
    The documentation for these is in DataFrame.R. I don't see a need to duplicate the docs in column.R.
    
    
    2)
    ```
    Undocumented arguments in documentation object 'Column-class'
      'df'
    
    Undocumented arguments in documentation object 'head'
      '...'
    ```
    I do document the slot df in class Column, and ... is not part of the signature of my head method, so I'm not sure why these warnings come up.
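
    One possible explanation for the `...` warning, offered as a guess rather than a diagnosis: `utils::head` is declared as `function(x, ...)`, so an S4 generic derived from it carries `...` even when the method itself does not use it. The sketch below uses a hypothetical `DemoColumn` class to show a method attached to the shared help page with `@rdname` and `@aliases`, with `...` documented explicitly; whether this resolves the exact check output above is untested:

    ```r
    library(methods)

    # Hypothetical stand-in class; not the SparkR Column class.
    setClass("DemoColumn", slots = list(values = "numeric"))

    # The implicit generic created from utils::head has formals (x, ...).
    setGeneric("head")

    #' Return the first elements
    #'
    #' @param x a DemoColumn
    #' @param num number of elements to return
    #' @param ... further arguments inherited from the generic (unused here)
    #' @rdname head
    #' @aliases head,DemoColumn-method
    setMethod("head", signature(x = "DemoColumn"), function(x, num = 6L, ...) {
      utils::head(x@values, num)
    })

    genericArgs <- names(formals(utils::head))
    firstTwo <- head(new("DemoColumn", values = c(1, 2, 3)), 2)
    ```

    Here `genericArgs` contains both `x` and `...`, which is why a documentation check can ask for `...` to be described even though the method body never reads it.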


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by falaki <gi...@git.apache.org>.
Github user falaki commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @olarayej are you interested in rebasing this PR and limiting its user impact to just `head` for now? We can continue the discussion on support for a distributed vector (a.k.a. `Column`) in JIRA and, once it is resolved, create follow-up PRs.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #67348 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67348/consoleFull)** for PR 11336 at commit [`ed1b382`](https://github.com/apache/spark/commit/ed1b382524265340d0b692a89c6be388eddb549a).


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] head() and show() for Colum...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r84189253
  
    --- Diff: R/pkg/R/column.R ---
    @@ -29,38 +31,66 @@ setOldClass("jobj")
     #' @rdname column
     #'
     #' @slot jc reference to JVM SparkDataFrame column
    +#' @slot df the parent SparkDataFrame of the Column object
     #' @export
     #' @note Column since 1.4.0
     setClass("Column",
    -         slots = list(jc = "jobj"))
    +         slots = list(jc = "jobj", df = "SparkDataFrameOrNull"))
     
     #' A set of operations working with SparkDataFrame columns
     #' @rdname columnfunctions
     #' @name columnfunctions
     NULL
    -
    -setMethod("initialize", "Column", function(.Object, jc) {
    +setMethod("initialize", "Column", function(.Object, jc, df) {
       .Object@jc <- jc
    +
    +  # Some Column objects don't have any referencing DataFrame. In such case, df will be NULL.
    +  if (missing(df)) {
    +    df <- NULL
    +  }
    +  .Object@df <- df
       .Object
     })
     
    +#' @rdname show
    +setMethod("show", signature = "Column", function(object) {
    +  MAX_ELEMENTS <- 20
    +  head.df <- head(object, MAX_ELEMENTS)
    +
    +  if (length(head.df) == 0) {
    +    colname <- callJMethod(object@jc, "toString")
    +    cat(paste0(colname, "\n"))
    +    cat(paste0("<Empty column>\n"))
    +  } else {
    +    show(head.df)
    +  }
    +  if (length(head.df) == MAX_ELEMENTS)  {
    +    cat(paste0("\b...\nDisplaying up to ", as.character(MAX_ELEMENTS), " elements only."))
    +  }
    +})
    +
    +#' @rdname head
    +setMethod("head",
    +          signature = "Column",
    --- End diff --
    
    There is more than one in the file...


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82919122
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1168,12 +1179,14 @@ setMethod("take",
     
     #' Head
     #'
    -#' Return the first \code{num} rows of a SparkDataFrame as a R data.frame. If \code{num} is not
    -#' specified, then head() returns the first 6 rows as with R data.frame.
    +#' Return the first elements of a dataset. If \code{x} is a SparkDataFrame, its first 
    +#' rows will be returned as a data.frame. If the dataset is a \code{Column}, its first 
    +#' elements will be returned as a vector. The number of elements to be returned
    +#' is given by parameter \code{num}. Default value for \code{num} is 6.
     #'
    -#' @param x a SparkDataFrame.
    -#' @param num the number of rows to return. Default is 6.
    -#' @return A data.frame.
    +#' @param x A SparkDataFrame or Column
    --- End diff --
    
    Not sure I follow here. Could you point to the specific example?


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-217585477
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58038/
    Test PASSed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    I think @shivaram should review this.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188084110
  
    **[Test build #51846 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51846/consoleFull)** for PR 11336 at commit [`dc3df19`](https://github.com/apache/spark/commit/dc3df1926bb3931459a0ea381e5695b7bb56ca15).


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188046181
  
    @AmplabJenkins Jenkins, could you retest please? I see ERROR: Error fetching remote repo 'origin'


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-203672562
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54559/
    Test PASSed.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188131261
  
    **[Test build #51860 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51860/consoleFull)** for PR 11336 at commit [`d697d44`](https://github.com/apache/spark/commit/d697d447ab8467eedade5d6e4c130dcf9a251e5e).


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-203587409
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54546/


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66832 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66832/consoleFull)** for PR 11336 at commit [`20e53e8`](https://github.com/apache/spark/commit/20e53e83a4d564c17d0b180ae5eab8ca9d6c1410).


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66968/


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-203587405
  
    Build finished. Test FAILed.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-203665448
  
    **[Test build #54557 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54557/consoleFull)** for PR 11336 at commit [`bf6c456`](https://github.com/apache/spark/commit/bf6c45601a83d602853cce359d1d040cd36b1d8d).
     * This patch **fails R style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-212054765
  
    Thanks @olarayej for the clarifications. Just broadening the discussion a bit:
    @mengxr @thunterdb, could you comment on whether this would be useful in some of the ML use cases you are working on?


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-188135069
  
    **[Test build #51860 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51860/consoleFull)** for PR 11336 at commit [`d697d44`](https://github.com/apache/spark/commit/d697d447ab8467eedade5d6e4c130dcf9a251e5e).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Could you update the JIRA and PR to reflect what's being added here?


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @felixcheung @falaki Folks?


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] head() and show() for Colum...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r84993459
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1166,12 +1166,13 @@ setMethod("take",
                 collect(limited)
               })
     
    -#' Head
    -#'
    -#' Return the first \code{num} rows of a SparkDataFrame as a R data.frame. If \code{num} is not
    -#' specified, then head() returns the first 6 rows as with R data.frame.
    -#'
    -#' @param x a SparkDataFrame.
    +#' Return the first part of a SparkDataFrame or Column
    +#' 
    +#' If \code{x} is a SparkDataFrame, its first 
    +#' rows will be returned as a data.frame. If the dataset is a \code{Column}, its first 
    +#' elements will be returned as a vector. The number of elements to be returned
    +#' is given by parameter \code{num}. Default value for \code{num} is 6.
    +#' @param x a SparkDataFrame or Column
    --- End diff --
    
    The same goes for "show", I believe: https://github.com/olarayej/spark/blob/1338d71a8f8f1387e08326851e15fcc84d5092f9/R/pkg/R/DataFrame.R#L230


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #97866 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97866/consoleFull)** for PR 11336 at commit [`619f23b`](https://github.com/apache/spark/commit/619f23b367b94a47f3f63dea008ab4d674218b34).


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by falaki <gi...@git.apache.org>.
Github user falaki commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82887445
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1035,10 +1035,17 @@ setMethod("dim",
                 c(count(x), ncol(x))
               })
     
    -#' Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.
    +#' Download Spark datasets into R
     #'
    -#' @param x a SparkDataFrame.
    -#' @param stringsAsFactors (Optional) a logical indicating whether or not string columns
    +#' If applied to a SparkDataFrame, \code{collect} returns a data.frame. If applied to a 
    --- End diff --
    
    Do we still want this documentation?


[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11336#issuecomment-217585428
  
    **[Test build #58038 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58038/consoleFull)** for PR 11336 at commit [`9c1661f`](https://github.com/apache/spark/commit/9c1661f0fe892fd3dcd40e8e489a940c9b129747).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66968 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66968/consoleFull)** for PR 11336 at commit [`2bfb8a6`](https://github.com/apache/spark/commit/2bfb8a62adf81e6953e398147e5f8d9122bc47ef).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82911304
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -2836,7 +2845,11 @@ setMethod("lpad", signature(x = "Column", len = "numeric", pad = "character"),
     setMethod("rand", signature(seed = "missing"),
               function(seed) {
                 jc <- callJStatic("org.apache.spark.sql.functions", "rand")
    -            column(jc)
    +
    +            # By assigning a one-row data.frame, the result of this function can be collected
    +            # returning a one-element Column
    +            df <- as.DataFrame(sparkRSQL.init(), data.frame(0))
    --- End diff --
    
    Actually, just change it to `as.DataFrame(data.frame(0))`.



[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    **[Test build #66832 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66832/consoleFull)** for PR 11336 at commit [`20e53e8`](https://github.com/apache/spark/commit/20e53e83a4d564c17d0b180ae5eab8ca9d6c1410).
     * This patch **fails some tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by olarayej <gi...@git.apache.org>.
Github user olarayej commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    @falaki Absolutely. Let me do that. Thank you!


[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/11336
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request #11336: [SPARK-9325][SPARK-R] collect() head() and show()...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11336#discussion_r82907729
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1035,10 +1035,16 @@ setMethod("dim",
                 c(count(x), ncol(x))
               })
     
    -#' Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.
    +#' Download Spark datasets into R
    --- End diff --
    
    I'm not sure this should say "datasets"; we don't use this term elsewhere.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org