You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by felixcheung <gi...@git.apache.org> on 2015/11/05 09:31:08 UTC

[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

GitHub user felixcheung opened a pull request:

    https://github.com/apache/spark/pull/9489

    [SPARK-11468][SPARKR] add stddev/variance agg functions for Column

    Checked names, none of them should conflict with anything in base
    
    @shivaram @davies @rxin 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/felixcheung/spark rstddev

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9489.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9489
    
----
commit 680b4759e04c7d371fc555e220730e0a4da251f5
Author: felixcheung <fe...@hotmail.com>
Date:   2015-11-05T02:37:04Z

    Add stddev and friends

commit f63608e3d36bae0544281a94c732760bfebfcc6d
Author: felixcheung <fe...@hotmail.com>
Date:   2015-11-05T05:05:57Z

    add aggFunction by name and on Column

commit e0fda371d3691a21672a94e9e0b383890cc745a6
Author: felixcheung <fe...@hotmail.com>
Date:   2015-11-05T08:27:15Z

    Add tests, checked supported functions for GroupedData

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-154202141
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9489#discussion_r44482223
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -1168,6 +1248,54 @@ setMethod("upper",
                 column(jc)
               })
     
    +#' variance
    +#'
    +#' Aggregate function: alias for \link{var_samp}.
    +#'
    +#' @rdname variance
    +#' @name variance
    +#' @family agg_funcs
    +#' @export
    +#' @examples \dontrun{variance(df$c)}
    +setMethod("variance",
    +          signature(x = "Column"),
    +          function(x) {
    --- End diff --
    
    Same here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9489#discussion_r44383124
  
    --- Diff: R/pkg/NAMESPACE ---
    @@ -228,6 +233,9 @@ exportMethods("%in%",
                   "unhex",
                   "unix_timestamp",
                   "upper",
    +              "variance",
    +              "var_samp",
    +              "var_pop",
    --- End diff --
    
    "var_pop" before "var_samp",


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9489#discussion_r44482198
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -974,6 +1006,54 @@ setMethod("soundex",
                 column(jc)
               })
     
    +#' stddev
    +#'
    +#' Aggregate function: alias for \link{stddev_samp}
    +#'
    +#' @rdname stddev
    +#' @name stddev
    +#' @family agg_funcs
    +#' @export
    +#' @examples \dontrun{stddev(df$c)}
    +setMethod("stddev",
    +          signature(x = "Column"),
    +          function(x) {
    --- End diff --
    
    This is an alias on the Scala/Spark side


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155649088
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-153988731
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-154210733
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45148/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-154202755
  
    **[Test build #45148 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45148/consoleFull)** for PR 9489 at commit [`7a2a904`](https://github.com/apache/spark/commit/7a2a90468623814c8bfc46c22a5058b350951a64).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9489#discussion_r44383102
  
    --- Diff: R/pkg/NAMESPACE ---
    @@ -209,7 +210,11 @@ exportMethods("%in%",
                   "sin",
                   "sinh",
                   "size",
    +              "skewness",
                   "soundex",
    +              "stddev",
    +              "stddev_samp",
    +              "stddev_pop",
    --- End diff --
    
    "stddev_pop" before "stddev_samp",


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155643453
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155681439
  
    LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/9489


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-153991209
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45107/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155246119
  
    Looks like dply has aggregate functions for variance, standard deviations ? Can we just match the syntax there ? https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-153998372
  
    **[Test build #45109 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45109/consoleFull)** for PR 9489 at commit [`58d0fd9`](https://github.com/apache/spark/commit/58d0fd91843215cf05cd524bcfd75b9922fd59b8).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155250719
  
    This does work
    ```
    > head(agg(df, stddev(df$age)))    # same as head(summarize(df, stddev(df$age)))
      stddev_samp(age)
    1         7.778175
    ```
    Also these work
    ```
    > gd <- groupBy(df, "name")
    > head(agg(gd, stddev(df$age)))    # same as head(agg(gd, age = "stddev"))
         name stddev_samp(age)
    1    Andy                0
    2 Michael               NA
    3  Justin                0
    ```
    I think they match dplyr, so I could name them `sd` and `var`, but just as I mention, doesn't work as one would expect on data.frame column vector.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-153991118
  
    **[Test build #45109 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45109/consoleFull)** for PR 9489 at commit [`58d0fd9`](https://github.com/apache/spark/commit/58d0fd91843215cf05cd524bcfd75b9922fd59b8).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155633355
  
    **[Test build #45581 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45581/consoleFull)** for PR 9489 at commit [`9243fa4`](https://github.com/apache/spark/commit/9243fa4b3e508a4be2932da52696db05de7180bf).
     * This patch **fails some tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155516128
  
    Yeah in that case lets add aliases for `sd` and `var` as well and match the population/sample behavior.
    
    @felixcheung I think similar functions like `mean` don't match base R data frame but they are useful when used inside an expression and so its fine to have aliases for this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-153987317
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9489#discussion_r44383281
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -1168,6 +1248,54 @@ setMethod("upper",
                 column(jc)
               })
     
    +#' variance
    +#'
    +#' Aggregate function: alias for \link{var_samp}.
    +#'
    +#' @rdname variance
    +#' @name variance
    +#' @family agg_funcs
    +#' @export
    +#' @examples \dontrun{variance(df$c)}
    +setMethod("variance",
    +          signature(x = "Column"),
    +          function(x) {
    --- End diff --
    
    call var_samp() directly


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-153998474
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155649010
  
    **[Test build #45585 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45585/consoleFull)** for PR 9489 at commit [`7498e39`](https://github.com/apache/spark/commit/7498e39bc34449251aabbf1ad2f37b3fb56acf4e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155643433
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-154210731
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155649089
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45585/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155683568
  
    Merging this into master and 1.6 branch, thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155633360
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45581/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155229852
  
    Should we add aliases `var` and `sd` in SparkR, which are more familiar to R users?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155643803
  
    **[Test build #45585 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45585/consoleFull)** for PR 9489 at commit [`7498e39`](https://github.com/apache/spark/commit/7498e39bc34449251aabbf1ad2f37b3fb56acf4e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9489#discussion_r44383218
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -974,6 +1006,54 @@ setMethod("soundex",
                 column(jc)
               })
     
    +#' stddev
    +#'
    +#' Aggregate function: alias for \link{stddev_samp}
    +#'
    +#' @rdname stddev
    +#' @name stddev
    +#' @family agg_funcs
    +#' @export
    +#' @examples \dontrun{stddev(df$c)}
    +setMethod("stddev",
    +          signature(x = "Column"),
    +          function(x) {
    --- End diff --
    
    call stddev_samp directly


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155632388
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155148903
  
    cc @sun-rui 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-153988698
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-154210580
  
    **[Test build #45148 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45148/consoleFull)** for PR 9489 at commit [`7a2a904`](https://github.com/apache/spark/commit/7a2a90468623814c8bfc46c22a5058b350951a64).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-153991207
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155154971
  
    LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155244366
  
    @mengxr possibly.. though there are usage difference with SparkR DataFrame/Column as compared to  R data.frame (eg. `agg` vs [`aggregate`](https://stat.ethz.ch/R-manual/R-devel/library/stats/html/aggregate.html), `var(faithful$eruptions)` would have been a Column with DataFrame)
    It might be more confusing.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155633335
  
    **[Test build #45581 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45581/consoleFull)** for PR 9489 at commit [`9243fa4`](https://github.com/apache/spark/commit/9243fa4b3e508a4be2932da52696db05de7180bf).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155362135
  
    @shivaram, I looked at dplyr package, there is no "sd", "var" defined in the package, seems the page is referencing "sd" "var" in R base.
    
    Yes, stddev/variance agg functions we are adding have similar syntax as "sd" and "var" in R, as you can think of a Column as a vector of same type. So we'd better add aliases for them.
    
    Note that var() in R computes sample variance (I think "sd" also uses sample variance, but not verified) instead of population variance.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9489#discussion_r44383333
  
    --- Diff: R/pkg/inst/tests/test_sparkSQL.R ---
    @@ -997,20 +1004,42 @@ test_that("group by", {
       expect_is(df_summarized, "DataFrame")
       expect_equal(3, count(df_summarized))
     
    -  df3 <- agg(gd, age = "sum")
    +  df3 <- agg(gd, age = "stddev")
       expect_is(df3, "DataFrame")
    -  expect_equal(3, count(df3))
    +  df3_local <- collect(df3)
    +  expect_equal(0, df3_local[df3_local$name == "Andy",][1, 2])
     
    -  df3 <- agg(gd, age = sum(df$age))
    -  expect_is(df3, "DataFrame")
    -  expect_equal(3, count(df3))
    -  expect_equal(columns(df3), c("name", "age"))
    -
    -  df4 <- sum(gd, "age")
    +  df4 <- agg(gd, sumAge = sum(df$age))
       expect_is(df4, "DataFrame")
       expect_equal(3, count(df4))
    -  expect_equal(3, count(mean(gd, "age")))
    -  expect_equal(3, count(max(gd, "age")))
    +  expect_equal(columns(df4), c("name", "sumAge"))
    +
    +  df5 <- sum(gd, "age")
    +  expect_is(df5, "DataFrame")
    +  expect_equal(3, count(df5))
    +
    +  expect_equal(3, count(mean(gd)))
    +  expect_equal(3, count(max(gd)))
    +  expect_equal(30, collect(max(gd))[1,2])
    +  expect_equal(1, collect(count(gd))[1,2])
    +
    +  mockLines2 <- c("{\"name\":\"ID1\", \"value\": \"10\"}",
    +                  "{\"name\":\"ID1\", \"value\": \"10\"}",
    +                  "{\"name\":\"ID1\", \"value\": \"22\"}",
    +                  "{\"name\":\"ID2\", \"value\": \"-3\"}")
    +  jsonPath2 <- tempfile(pattern="sparkr-test", fileext=".tmp")
    --- End diff --
    
    delete this temp file at the end of the test case


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-153987351
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-154202114
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155632410
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-153998475
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45109/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9489#issuecomment-155633357
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org