Posted to reviews@spark.apache.org by sun-rui <gi...@git.apache.org> on 2015/12/09 08:15:29 UTC

[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

GitHub user sun-rui opened a pull request:

    https://github.com/apache/spark/pull/10220

    [SPARK-12235][SPARKR] Enhance mutate() to support replace existing columns.

    Make the behavior of mutate() more consistent with dplyr's, in addition to supporting replacement of existing columns:
    1. Throw an error when the DataFrame being mutated has duplicated column names.
    2. When the columns specified as arguments contain duplicated names, the last column of a given name takes effect.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sun-rui/spark SPARK-12235

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10220.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10220
    
----
commit ac34da1e8385161fc50ce21a0595437a0854d78b
Author: Sun Rui <ru...@intel.com>
Date:   2015-12-09T07:14:33Z

    [SPARK-12235][SPARKR] Enhance mutate() to support replace existing columns.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-163142357
  
    **[Test build #47414 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47414/consoleFull)** for PR 10220 at commit [`ac34da1`](https://github.com/apache/spark/commit/ac34da1e8385161fc50ce21a0595437a0854d78b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r61459532
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1451,17 +1451,54 @@ setMethod("mutate",
               function(.data, ...) {
                 x <- .data
                 cols <- list(...)
    -            stopifnot(length(cols) > 0)
    -            stopifnot(class(cols[[1]]) == "Column")
    +            if (length(cols) <= 0) {
    +              return(x)
    +            }
    +
    +            lapply(cols, function(col) {
    +              stopifnot(class(col) == "Column")
    +            })
    +
    +            # Check if there is any duplicated column name in the DataFrame
    +            dfCols <- columns(x)
    +            if (length(unique(dfCols)) != length(dfCols)) {
    +              stop("Error: found duplicated column name in the DataFrame")
    +            }
    +
    +            # TODO: simplify the implementation of this method after SPARK-12225 is resolved.
    +
    +            # For named arguments, use the names for arguments as the column names
    +            # For unnamed arguments, use the argument symbols as the column names
    +            args <- sapply(substitute(list(...))[-1], deparse)
    --- End diff --
    
    Hmm I see - it might be because the parsing is done at the point when `list(...)` is called. This is fine for now
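The idiom in the quoted line, `sapply(substitute(list(...))[-1], deparse)`, can be tried outside SparkR. The following is a minimal base-R sketch, not code from the pull request; `arg_labels` and `value_labels` are hypothetical helper names, and plain numbers stand in for `Column` objects. It shows that `substitute(list(...))` recovers the text of each argument expression, whereas working from the already evaluated `list(...)` only yields values.

```r
# Hypothetical helper (not SparkR code): recover the source text of each
# argument passed through `...`, following the idiom in the quoted diff.
arg_labels <- function(...) {
  # substitute(list(...)) is the unevaluated call list(<arg1>, <arg2>, ...);
  # dropping the first element removes the `list` symbol itself.
  exprs <- substitute(list(...))[-1]
  vapply(exprs, function(e) paste(deparse(e), collapse = " "), character(1))
}

# Contrast: once the arguments are evaluated, only their values remain,
# so the original expression text cannot be recovered.
value_labels <- function(...) {
  cols <- list(...)
  vapply(cols, function(v) paste(deparse(v), collapse = " "), character(1))
}

a <- 1
b <- 2
labels <- arg_labels(a + b, total = a * 2)
values <- value_labels(a + b, total = a * 2)
```

Here `labels` holds the expression text ("a + b" and "a * 2"), while `values` holds only the evaluated results ("3" and "2"). Named arguments keep their names in both cases.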




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-215320650
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-215487241
  
    LGTM. Thanks @sun-rui - Merging this.




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-163139798
  
    **[Test build #47414 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47414/consoleFull)** for PR 10220 at commit [`ac34da1`](https://github.com/apache/spark/commit/ac34da1e8385161fc50ce21a0595437a0854d78b).




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-204768113
  
    looks good to me!




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-202784857
  
    Jenkins, retest this please




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by 3ourroom <gi...@git.apache.org>.
Github user 3ourroom commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r47612640
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1357,17 +1357,46 @@ setMethod("mutate",
               function(.data, ...) {
                 x <- .data
                 cols <- list(...)
    -            stopifnot(length(cols) > 0)
    +            if (length(cols) <= 0) {
    +              return(x)
    +            }
    +
                 stopifnot(class(cols[[1]]) == "Column")
    +
    +            # Check if there is any duplicated column name in the DataFrame
    +            dfCols <- columns(x)
    +            if (length(unique(dfCols)) != length(dfCols)) {
    --- End diff --
    
    
    NAVER - http://www.naver.com/
    --------------------------------------------
    
    The mail <Re: [spark] [SPARK-12235][SPARKR] Enhance mutate() to support replace existing columns. (#10220)> that you sent to 3ourroom@naver.com could not be delivered for the following reason.
    
    --------------------------------------------
    
    The recipient has blocked your mail.
    
    
    --------------------------------------------





[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-202796428
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-163706535
  
    @felixcheung Could you check whether this satisfies the requirements in https://issues.apache.org/jira/browse/SPARK-10346 ? The only other thing we had in mind was matching the signature of `mutate` in dplyr.




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-214754654
  
    **[Test build #56999 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56999/consoleFull)** for PR 10220 at commit [`3057817`](https://github.com/apache/spark/commit/30578175bf45f1f9e0a729c561281cd14185cbcb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r61376989
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1426,11 +1426,11 @@ setMethod("withColumn",
     
     #' Mutate
     #'
    -#' Return a new SparkDataFrame with the specified columns added.
    +#' Return a new SparkDataFrame with the specified columns added or replaced.
     #'
     #' @param .data A SparkDataFrame
     #' @param col a named argument of the form name = col
    -#' @return A new SparkDataFrame with the new columns added.
    +#' @return A new SparkDataFrame with the new columns added or replaced.
    --- End diff --
    
    added




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-202796223
  
    **[Test build #54423 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54423/consoleFull)** for PR 10220 at commit [`ac34da1`](https://github.com/apache/spark/commit/ac34da1e8385161fc50ce21a0595437a0854d78b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r58297375
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1357,17 +1357,46 @@ setMethod("mutate",
               function(.data, ...) {
                 x <- .data
                 cols <- list(...)
    -            stopifnot(length(cols) > 0)
    +            if (length(cols) <= 0) {
    +              return(x)
    +            }
    +
                 stopifnot(class(cols[[1]]) == "Column")
    +
    +            # Check if there is any duplicated column name in the DataFrame
    +            dfCols <- columns(x)
    +            if (length(unique(dfCols)) != length(dfCols)) {
    +              stop("Error: found duplicated column name in the DataFrame")
    +            }
    +
    +            # TODO: simplify the implementation of this method after SPARK-12225 is resolved.
    +
    +            # The last column of the same name in the specific columns takes effect
                 ns <- names(cols)
    -            if (!is.null(ns)) {
    -              for (n in ns) {
    -                if (n != "") {
    -                  cols[[n]] <- alias(cols[[n]], n)
    -                }
    +            deDupCols <- list()
    +            for (i in 1:length(cols)) {
    +              if (!is.null(ns) && ns[[i]] != "") {
    +                deDupCols[[ns[[i]]]] <- alias(cols[[i]], ns[[i]])
    +              } else {
    +                # TODO: how to check if there are columns of the same name in unnamed Columns.
    +                deDupCols[[length(deDupCols) + 1]] <- cols[[i]]
                   }
                 }
    -            do.call(select, c(x, x$"*", cols))
    +
    +            # Construct the column list for projection
    +            ns <- names(deDupCols)
    --- End diff --
    
    nit: use a different name since `ns` is defined above?




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-202784704
  
    @felixcheung, @shivaram, could we continue the review?




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r47620558
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1357,17 +1357,46 @@ setMethod("mutate",
               function(.data, ...) {
                 x <- .data
                 cols <- list(...)
    -            stopifnot(length(cols) > 0)
    +            if (length(cols) <= 0) {
    +              return(x)
    +            }
    +
                 stopifnot(class(cols[[1]]) == "Column")
    +
    +            # Check if there is any duplicated column name in the DataFrame
    +            dfCols <- columns(x)
    +            if (length(unique(dfCols)) != length(dfCols)) {
    --- End diff --
    
    A DataFrame may have multiple columns with the same name. This mimics the behavior of mutate() in dplyr: an error is thrown when the DataFrame being mutated has duplicated column names, even if the mutation is unrelated to those columns.
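The duplicate-name check in the quoted diff can be sketched in base R without a Spark session. This is an illustration under stated assumptions, not the code from the pull request: a plain character vector stands in for the result of `columns(x)`, and `has_duplicated_names` and `check_no_duplicates` are hypothetical helper names.

```r
# Hypothetical stand-in for columns(x): a plain character vector of names.
has_duplicated_names <- function(col_names) {
  # Same test as in the diff: duplicates make the unique set shorter.
  length(unique(col_names)) != length(col_names)
}

check_no_duplicates <- function(col_names) {
  if (has_duplicated_names(col_names)) {
    # Report which names repeat (an addition for illustration only).
    dups <- unique(col_names[duplicated(col_names)])
    stop("found duplicated column name(s): ", paste(dups, collapse = ", "))
  }
  invisible(col_names)
}

ok <- check_no_duplicates(c("name", "age"))
err <- tryCatch(check_no_duplicates(c("name", "age", "name")),
                error = function(e) conditionMessage(e))
```

Because the check looks only at the existing column names, it fails regardless of which columns the caller intends to add or replace, which matches the behavior described in the comment above.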




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/10220




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-163142450
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47414/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-213551422
  
    Sorry for the delay @sun-rui - It mostly looks good, but I had one more question about this I left inline. 




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r61374122
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1451,17 +1451,54 @@ setMethod("mutate",
               function(.data, ...) {
                 x <- .data
                 cols <- list(...)
    -            stopifnot(length(cols) > 0)
    -            stopifnot(class(cols[[1]]) == "Column")
    +            if (length(cols) <= 0) {
    +              return(x)
    +            }
    +
    +            lapply(cols, function(col) {
    +              stopifnot(class(col) == "Column")
    +            })
    +
    +            # Check if there is any duplicated column name in the DataFrame
    +            dfCols <- columns(x)
    +            if (length(unique(dfCols)) != length(dfCols)) {
    +              stop("Error: found duplicated column name in the DataFrame")
    +            }
    +
    +            # TODO: simplify the implementation of this method after SPARK-12225 is resolved.
    +
    +            # For named arguments, use the names for arguments as the column names
    +            # For unnamed arguments, use the argument symbols as the column names
    +            args <- sapply(substitute(list(...))[-1], deparse)
    --- End diff --
    
    I did try using cols, but the result was not correct, so I have to use list(...).




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-215320437
  
    **[Test build #57223 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57223/consoleFull)** for PR 10220 at commit [`74ba7e8`](https://github.com/apache/spark/commit/74ba7e88e36f9f1393a91a168af95392bdbd2f53).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-163142448
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-202786837
  
    **[Test build #54423 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54423/consoleFull)** for PR 10220 at commit [`ac34da1`](https://github.com/apache/spark/commit/ac34da1e8385161fc50ce21a0595437a0854d78b).




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r58296947
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1357,17 +1357,46 @@ setMethod("mutate",
               function(.data, ...) {
                 x <- .data
                 cols <- list(...)
    -            stopifnot(length(cols) > 0)
    +            if (length(cols) <= 0) {
    +              return(x)
    +            }
    +
                 stopifnot(class(cols[[1]]) == "Column")
    +
    +            # Check if there is any duplicated column name in the DataFrame
    +            dfCols <- columns(x)
    +            if (length(unique(dfCols)) != length(dfCols)) {
    +              stop("Error: found duplicated column name in the DataFrame")
    +            }
    +
    +            # TODO: simplify the implementation of this method after SPARK-12225 is resolved.
    +
    +            # The last column of the same name in the specific columns takes effect
                 ns <- names(cols)
    --- End diff --
    
    isn't `ns` the same as `dfCols` above?




[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r60786195
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1357,17 +1357,46 @@ setMethod("mutate",
               function(.data, ...) {
                 x <- .data
                 cols <- list(...)
    -            stopifnot(length(cols) > 0)
    +            if (length(cols) <= 0) {
    +              return(x)
    +            }
    +
                 stopifnot(class(cols[[1]]) == "Column")
    +
    +            # Check if there is any duplicated column name in the DataFrame
    +            dfCols <- columns(x)
    +            if (length(unique(dfCols)) != length(dfCols)) {
    +              stop("Error: found duplicated column name in the DataFrame")
    +            }
    +
    +            # TODO: simplify the implementation of this method after SPARK-12225 is resolved.
    +
    +            # The last column of the same name in the specific columns takes effect
                 ns <- names(cols)
    -            if (!is.null(ns)) {
    -              for (n in ns) {
    -                if (n != "") {
    -                  cols[[n]] <- alias(cols[[n]], n)
    -                }
    +            deDupCols <- list()
    +            for (i in 1:length(cols)) {
    +              if (!is.null(ns) && ns[[i]] != "") {
    +                deDupCols[[ns[[i]]]] <- alias(cols[[i]], ns[[i]])
    +              } else {
    +                # TODO: how to check if there are columns of the same name in unnamed Columns.
    +                deDupCols[[length(deDupCols) + 1]] <- cols[[i]]
    --- End diff --
    
    I'm not sure I quite understand when we go into the else case (compared to what we had before). If I'm not mistaken, before we were just dropping the non-empty column names and now we want to include them?


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r58296904
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1357,17 +1357,46 @@ setMethod("mutate",
               function(.data, ...) {
                 x <- .data
                 cols <- list(...)
    -            stopifnot(length(cols) > 0)
    +            if (length(cols) <= 0) {
    +              return(x)
    +            }
    +
                 stopifnot(class(cols[[1]]) == "Column")
    +
    +            # Check if there is any duplicated column name in the DataFrame
    +            dfCols <- columns(x)
    +            if (length(unique(dfCols)) != length(dfCols)) {
    --- End diff --
    
    Perhaps add that to the doc, or change the error message to say something like "Error: cannot mutate DataFrame with duplicated column names".
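
A minimal base-R sketch of the check with the suggested wording. It assumes a plain character vector stands in for `columns(x)`, and `checkNoDuplicatedColumns` is a hypothetical helper name, not part of SparkR. Since `stop()` already prefixes its output with "Error", the literal "Error:" is left out of the message here.

```r
# Hypothetical helper: fail early when a DataFrame has repeated column names.
# `dfCols` stands in for the result of columns(x) in the diff above.
checkNoDuplicatedColumns <- function(dfCols) {
  dups <- unique(dfCols[duplicated(dfCols)])
  if (length(dups) > 0) {
    stop("cannot mutate DataFrame with duplicated column names: ",
         paste(dups, collapse = ", "))
  }
  invisible(TRUE)
}

# Unique names pass silently
checkNoDuplicatedColumns(c("a", "b", "c"))

# Repeated names raise an error that lists the offending names
msg <- tryCatch(checkNoDuplicatedColumns(c("a", "b", "a")),
                error = function(e) conditionMessage(e))
```

Listing the duplicated names in the message is an extra that the thread does not ask for; it may make the failure easier to act on.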


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r61361037
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1426,11 +1426,11 @@ setMethod("withColumn",
     
     #' Mutate
     #'
    -#' Return a new SparkDataFrame with the specified columns added.
    +#' Return a new SparkDataFrame with the specified columns added or replaced.
     #'
     #' @param .data A SparkDataFrame
     #' @param col a named argument of the form name = col
    -#' @return A new SparkDataFrame with the new columns added.
    +#' @return A new SparkDataFrame with the new columns added or replaced.
    --- End diff --
    
    It would be good to add an example of how columns can be replaced in the `@examples` roxygen2 doc.


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r61071579
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1357,17 +1357,46 @@ setMethod("mutate",
               function(.data, ...) {
                 x <- .data
                 cols <- list(...)
    -            stopifnot(length(cols) > 0)
    +            if (length(cols) <= 0) {
    +              return(x)
    +            }
    +
                 stopifnot(class(cols[[1]]) == "Column")
    +
    +            # Check if there is any duplicated column name in the DataFrame
    +            dfCols <- columns(x)
    +            if (length(unique(dfCols)) != length(dfCols)) {
    +              stop("Error: found duplicated column name in the DataFrame")
    +            }
    +
    +            # TODO: simplify the implementation of this method after SPARK-12225 is resolved.
    +
    +            # The last column of the same name in the specific columns takes effect
                 ns <- names(cols)
    -            if (!is.null(ns)) {
    -              for (n in ns) {
    -                if (n != "") {
    -                  cols[[n]] <- alias(cols[[n]], n)
    -                }
    +            deDupCols <- list()
    +            for (i in 1:length(cols)) {
    +              if (!is.null(ns) && ns[[i]] != "") {
    +                deDupCols[[ns[[i]]]] <- alias(cols[[i]], ns[[i]])
    +              } else {
    +                # TODO: how to check if there are columns of the same name in unnamed Columns.
    +                deDupCols[[length(deDupCols) + 1]] <- cols[[i]]
    --- End diff --
    
    Will refactor the code to match dplyr.


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-214750615
  
    **[Test build #56999 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56999/consoleFull)** for PR 10220 at commit [`3057817`](https://github.com/apache/spark/commit/30578175bf45f1f9e0a729c561281cd14185cbcb).


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r47738248
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1357,17 +1357,46 @@ setMethod("mutate",
               function(.data, ...) {
                 x <- .data
                 cols <- list(...)
    -            stopifnot(length(cols) > 0)
    +            if (length(cols) <= 0) {
    +              return(x)
    +            }
    +
                 stopifnot(class(cols[[1]]) == "Column")
    +
    +            # Check if there is any duplicated column name in the DataFrame
    +            dfCols <- columns(x)
    +            if (length(unique(dfCols)) != length(dfCols)) {
    --- End diff --
    
    Right. To clarify: any given DataFrame today could have columns with the same name. Isn't it overly restrictive to say that one cannot call mutate on such a DataFrame?


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r47612556
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1357,17 +1357,46 @@ setMethod("mutate",
               function(.data, ...) {
                 x <- .data
                 cols <- list(...)
    -            stopifnot(length(cols) > 0)
    +            if (length(cols) <= 0) {
    +              return(x)
    +            }
    +
                 stopifnot(class(cols[[1]]) == "Column")
    +
    +            # Check if there is any duplicated column name in the DataFrame
    +            dfCols <- columns(x)
    +            if (length(unique(dfCols)) != length(dfCols)) {
    --- End diff --
    
    Isn't this supported with DataFrame in Scala or Python?


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-163732757
  
    Sure, I'll check. We were discussing this a bit in SPARK-12235.


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-214754795
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56999/
    Test PASSed.


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r61360923
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1451,17 +1451,54 @@ setMethod("mutate",
               function(.data, ...) {
                 x <- .data
                 cols <- list(...)
    -            stopifnot(length(cols) > 0)
    -            stopifnot(class(cols[[1]]) == "Column")
    +            if (length(cols) <= 0) {
    +              return(x)
    +            }
    +
    +            lapply(cols, function(col) {
    +              stopifnot(class(col) == "Column")
    +            })
    +
    +            # Check if there is any duplicated column name in the DataFrame
    +            dfCols <- columns(x)
    +            if (length(unique(dfCols)) != length(dfCols)) {
    +              stop("Error: found duplicated column name in the DataFrame")
    +            }
    +
    +            # TODO: simplify the implementation of this method after SPARK-12225 is resolved.
    +
    +            # For named arguments, use the names for arguments as the column names
    +            # For unnamed arguments, use the argument symbols as the column names
    +            args <- sapply(substitute(list(...))[-1], deparse)
    --- End diff --
    
    Can we use `cols` here instead of `list(...)`, as we have already extracted it?
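
One possible reason the diff keeps `list(...)` inside `substitute()`: `cols` holds arguments that have already been evaluated, so the original argument expressions can no longer be recovered from it. The base-R sketch below contrasts the two. `argSymbols` and `argValues` are hypothetical names used only for illustration, and plain vectors stand in for SparkR `Column` objects.

```r
# Captures the unevaluated argument expressions and turns them into text.
argSymbols <- function(...) {
  sapply(substitute(list(...))[-1], deparse)
}

# Works on the evaluated list, as `cols <- list(...)` does in the diff.
argValues <- function(...) {
  cols <- list(...)       # arguments are evaluated here
  sapply(cols, deparse)   # only the resulting values are left to deparse
}

x <- 1:3
fromSymbols <- argSymbols(x, y = x * 2)  # text of the expressions as written
fromValues <- argValues(x, y = x * 2)    # text of the computed values
```

Here `fromSymbols` contains the strings `"x"` and `"x * 2"`, while `fromValues` describes the computed vectors, so the symbol `x` is lost once the arguments are evaluated.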


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-215320655
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57223/
    Test PASSed.


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-215318404
  
    **[Test build #57223 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57223/consoleFull)** for PR 10220 at commit [`74ba7e8`](https://github.com/apache/spark/commit/74ba7e88e36f9f1393a91a168af95392bdbd2f53).


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r47743134
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1357,17 +1357,46 @@ setMethod("mutate",
               function(.data, ...) {
                 x <- .data
                 cols <- list(...)
    -            stopifnot(length(cols) > 0)
    +            if (length(cols) <= 0) {
    +              return(x)
    +            }
    +
                 stopifnot(class(cols[[1]]) == "Column")
    +
    +            # Check if there is any duplicated column name in the DataFrame
    +            dfCols <- columns(x)
    +            if (length(unique(dfCols)) != length(dfCols)) {
    --- End diff --
    
    Yes, per the current implementation, one cannot call mutate on a DataFrame that has multiple columns with the same name.


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-202796433
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54423/
    Test PASSed.


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10220#discussion_r61071492
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1357,17 +1357,46 @@ setMethod("mutate",
               function(.data, ...) {
                 x <- .data
                 cols <- list(...)
    -            stopifnot(length(cols) > 0)
    +            if (length(cols) <= 0) {
    +              return(x)
    +            }
    +
                 stopifnot(class(cols[[1]]) == "Column")
    +
    +            # Check if there is any duplicated column name in the DataFrame
    +            dfCols <- columns(x)
    +            if (length(unique(dfCols)) != length(dfCols)) {
    +              stop("Error: found duplicated column name in the DataFrame")
    +            }
    +
    +            # TODO: simplify the implementation of this method after SPARK-12225 is resolved.
    +
    +            # The last column of the same name in the specific columns takes effect
                 ns <- names(cols)
    --- End diff --
    
    No. `dfCols` are the column names of the DataFrame; `ns` are the names of the columns to be added or replaced.
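
The distinction can be sketched in base R, assuming plain character strings stand in for SparkR `Column` objects and a character vector stands in for `columns(x)`. The loop mirrors the "last column of the same name takes effect" rule from the diff; the `replaced` and `added` variables are illustrative only.

```r
# Names already present in the DataFrame (stand-in for columns(x))
dfCols <- c("a", "b", "c")

# Columns passed to mutate(); note that "b" is specified twice
cols <- list(b = "b * 2", d = "a + c", b = "b * 3")
ns <- names(cols)

# Assigning by name overwrites earlier entries, so the last "b" wins
deDupCols <- list()
for (i in seq_along(cols)) {
  deDupCols[[ns[[i]]]] <- cols[[i]]
}

# Names in both sets are replacements; names only in deDupCols are additions
replaced <- intersect(names(deDupCols), dfCols)
added <- setdiff(names(deDupCols), dfCols)
```

With these inputs, `deDupCols` keeps `"b * 3"` for `b`, `replaced` is `"b"`, and `added` is `"d"`.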


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-214754793
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on the pull request:

    https://github.com/apache/spark/pull/10220#issuecomment-214750238
  
    @shivaram, rebased to master and refactored the code so that, for unnamed arguments, the argument symbols are used as the column names. This is the same behavior as dplyr.
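
A rough base-R sketch of that naming rule, not the code from the PR: named arguments keep their names, and unnamed arguments fall back to the text of the argument expression. `resolveNames` is a hypothetical helper, and the arguments are never evaluated, so the symbols used below do not need to exist.

```r
# Hypothetical helper: derive one output column name per argument.
resolveNames <- function(...) {
  # Text of each unevaluated argument expression
  exprs <- sapply(substitute(list(...))[-1], deparse)
  ns <- names(exprs)
  if (is.null(ns)) {
    # No argument was named at all
    ns <- rep("", length(exprs))
  }
  # Prefer the explicit name; otherwise use the expression text
  ifelse(ns != "", ns, unname(exprs))
}

mixed <- resolveNames(a, total = a + b, a * 2)
unnamedOnly <- resolveNames(a, b)
```

Here `mixed` is `"a"`, `"total"`, `"a * 2"` and `unnamedOnly` is `"a"`, `"b"`. Very long expressions can make `deparse()` return several strings, which this sketch does not handle.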

