Posted to reviews@spark.apache.org by vectorijk <gi...@git.apache.org> on 2016/05/29 10:50:16 UTC

[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

GitHub user vectorijk opened a pull request:

    https://github.com/apache/spark/pull/13394

    [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API docs for non-MLlib changes

    ## What changes were proposed in this pull request?
    R doc changes, including typo fixes, formatting, and layout.
    ## How was this patch tested?
    Tested locally.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vectorijk/spark spark-15490

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13394.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13394
    
----
commit 7961bbe8b346ae47a70fa324b18219070197ded8
Author: Kai Jiang <ji...@gmail.com>
Date:   2016-05-29T02:09:22Z

    QA for non-MLlib changes

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    @felixcheung Check out the comment above from @vectorijk about putting multiple predict methods in a single page.  Is there a better way to organize these?




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65162041
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1069,7 +1080,10 @@ setMethod("first",
     #'
     #' @param x A SparkDataFrame
     #'
    -#' @noRd
    +#' @family SparkDataFrame functions
    +#' @rdname toRDD
    --- End diff --
    
    ok, I will change this.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13394#issuecomment-222386772
  
    **[Test build #59600 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59600/consoleFull)** for PR 13394 at commit [`7961bbe`](https://github.com/apache/spark/commit/7961bbe8b346ae47a70fa324b18219070197ded8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66030853
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -249,6 +249,10 @@ col <- function(x) {
     #'
     #' Returns a Column based on the given column name.
     #'
    +#' Though scala functions has "col" function, we don't expose it in SparkR
    --- End diff --
    
    Let's move this comment back to L241?




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65483594
  
    --- Diff: R/pkg/R/stats.R ---
    @@ -135,13 +136,13 @@ setMethod("freqItems", signature(x = "SparkDataFrame", cols = "character"),
     #' Calculates the approximate quantiles of a numerical column of a SparkDataFrame.
     #'
     #' The result of this algorithm has the following deterministic bound:
    -#' If the SparkDataFrame has N elements and if we request the quantile at probability `p` up to
    -#' error `err`, then the algorithm will return a sample `x` from the SparkDataFrame so that the
    -#' *exact* rank of `x` is close to (p * N). More precisely,
    -#'   floor((p - err) * N) <= rank(x) <= ceil((p + err) * N).
    -#' This method implements a variation of the Greenwald-Khanna algorithm (with some speed
    -#' optimizations). The algorithm was first present in [[http://dx.doi.org/10.1145/375663.375670
    -#' Space-efficient Online Computation of Quantile Summaries]] by Greenwald and Khanna.
    +#' If the SparkDataFrame has N elements and if we request the quantile at probability \strong{p} up
    --- End diff --
    
    I think it would be best to leave this out, and I will fix it in #13109. What do you think?
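
As a worked illustration of the rank bound quoted in the removed doc lines above, take the values N = 1000, p = 0.5, and err = 0.01. These numbers are assumptions chosen only for this example and do not come from the thread.

```latex
\[
\lfloor (p - \mathit{err})\,N \rfloor \;\le\; \mathrm{rank}(x) \;\le\; \lceil (p + \mathit{err})\,N \rceil
\]
\[
\lfloor 0.49 \times 1000 \rfloor = 490 \;\le\; \mathrm{rank}(x) \;\le\; 510 = \lceil 0.51 \times 1000 \rceil
\]
```

Under these assumed values, the sample returned for the median would have an exact rank between 490 and 510.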




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13394#issuecomment-222386047
  
    **[Test build #59600 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59600/consoleFull)** for PR 13394 at commit [`7961bbe`](https://github.com/apache/spark/commit/7961bbe8b346ae47a70fa324b18219070197ded8).




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66363113
  
    --- Diff: R/pkg/R/mllib.R ---
    @@ -197,11 +197,10 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) {
       invisible(x)
       }
     
    -#' Make predictions from a generalized linear model
    -#'
     #' Makes predictions from a generalized linear model produced by glm() or spark.glm(),
     #' similarly to R's predict().
     #'
    +#' @title predict
    --- End diff --
    
    Ok - then I'd say let's use the first line as the title convention for all our documentation.




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66177097
  
    --- Diff: R/pkg/R/mllib.R ---
    @@ -197,11 +197,10 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) {
       invisible(x)
       }
     
    -#' Make predictions from a generalized linear model
    -#'
     #' Makes predictions from a generalized linear model produced by glm() or spark.glm(),
     #' similarly to R's predict().
     #'
    +#' @title predict
    --- End diff --
    
    Shouldn't we follow a single convention for defining the title, either using ```@title``` or using the first sentence in the description?  @shivaram do you have a preference?  Looking at the GitHub history, I see that SparkR contributors use both.
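
The two conventions in question can be sketched as follows. This is a minimal illustration that assumes standard roxygen2 behavior, where the first paragraph of a block is taken as the title unless `@title` is given explicitly. The functions `add_implicit` and `add_explicit` are hypothetical and are not part of SparkR.

```r
# Convention 1: the first line of the roxygen block is the title,
# and the following paragraph is the description.

#' Add two numbers
#'
#' Returns the sum of x and y.
#'
#' @param x A numeric value.
#' @param y A numeric value.
#' @return The sum of x and y.
add_implicit <- function(x, y) {
  x + y
}

# Convention 2: the title and description are declared with explicit tags.

#' @title Add two numbers
#' @description Returns the sum of x and y.
#' @param x A numeric value.
#' @param y A numeric value.
#' @return The sum of x and y.
add_explicit <- function(x, y) {
  x + y
}
```

Both blocks should yield the same title and description in the generated Rd file, so the choice is mainly about keeping the sources consistent.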




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    This LGTM now.  @shivaram @felixcheung let me know if I missed something; I'm still getting used to R doc syntax & conventions.
    I'll merge after rerunning tests




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/13394#issuecomment-222560187
  
    @vectorijk Thanks for the PR. Changes look pretty good to me. 
    We also need to update the programming guide (the one at http://spark.apache.org/docs/latest/sparkr.html) to cover the major new features. This will include:
    (a) UDFs with dapply and dapplyCollect,
    (b) spark.lapply for running parallel R functions, and
    (c) the change to not require `sqlContext`.
    
    We can do that in a separate JIRA/PR or if you wish we can also do it in this PR.
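
For orientation, the three features listed above might be exercised roughly as in the sketch below. It assumes a local Spark installation with the SparkR package from Spark 2.0 or later on the library path. The choice of R's built-in `faithful` dataset and the derived column name `waiting_secs` are illustrative assumptions, not something taken from this thread.

```r
library(SparkR)

# (c) Start a session; no sqlContext argument is passed to the calls below.
sparkR.session()

df <- createDataFrame(faithful)

# (a) dapply: apply an R function to each partition of a SparkDataFrame.
# The schema of the result must be declared up front.
schema <- structType(
  structField("eruptions", "double"),
  structField("waiting", "double"),
  structField("waiting_secs", "double")
)
with_secs <- dapply(df, function(x) {
  cbind(x, x$waiting * 60)
}, schema)
head(collect(with_secs))

# (a) dapplyCollect: same idea, but the result comes back as a local
# R data.frame, so no schema is required.
local_df <- dapplyCollect(df, function(x) {
  cbind(x, waiting_secs = x$waiting * 60)
})
head(local_df)

# (b) spark.lapply: run an R function over the elements of a local list
# in parallel and return the results as a list.
squares <- spark.lapply(1:4, function(i) i^2)
print(squares)

sparkR.session.stop()
```

`dapply` returns a SparkDataFrame and therefore needs the output schema, while `dapplyCollect` brings the rows back to the driver, which is only suitable when the result fits in local memory.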




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60473/
    Test PASSed.




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65410850
  
    --- Diff: R/pkg/R/stats.R ---
    @@ -135,13 +136,13 @@ setMethod("freqItems", signature(x = "SparkDataFrame", cols = "character"),
     #' Calculates the approximate quantiles of a numerical column of a SparkDataFrame.
     #'
     #' The result of this algorithm has the following deterministic bound:
    -#' If the SparkDataFrame has N elements and if we request the quantile at probability `p` up to
    -#' error `err`, then the algorithm will return a sample `x` from the SparkDataFrame so that the
    -#' *exact* rank of `x` is close to (p * N). More precisely,
    -#'   floor((p - err) * N) <= rank(x) <= ceil((p + err) * N).
    -#' This method implements a variation of the Greenwald-Khanna algorithm (with some speed
    -#' optimizations). The algorithm was first present in [[http://dx.doi.org/10.1145/375663.375670
    -#' Space-efficient Online Computation of Quantile Summaries]] by Greenwald and Khanna.
    +#' If the SparkDataFrame has N elements and if we request the quantile at probability \strong{p} up
    --- End diff --
    
    @vectorijk Are you able to separate them?  If you can't find a good way easily, ping & I can try to figure it out.  Thanks!




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60362/
    Test PASSed.




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66741566
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -249,10 +249,7 @@ col <- function(x) {
     #'
     #' Returns a Column based on the given column name.
     #'
    -#' @rdname col
    -#' @name column
     #' @family normal_funcs
    -#' @export
    --- End diff --
    
    This function is exported, right?  It's ```col``` which is not exported.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #3109 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3109/consoleFull)** for PR 13394 at commit [`2537b8f`](https://github.com/apache/spark/commit/2537b8ff114f9d33d0228d7aaeabafdb454609aa).
     * This patch passes all tests.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65946137
  
    --- Diff: R/pkg/R/stats.R ---
    @@ -135,13 +136,13 @@ setMethod("freqItems", signature(x = "SparkDataFrame", cols = "character"),
     #' Calculates the approximate quantiles of a numerical column of a SparkDataFrame.
     #'
     #' The result of this algorithm has the following deterministic bound:
    -#' If the SparkDataFrame has N elements and if we request the quantile at probability `p` up to
    -#' error `err`, then the algorithm will return a sample `x` from the SparkDataFrame so that the
    -#' *exact* rank of `x` is close to (p * N). More precisely,
    -#'   floor((p - err) * N) <= rank(x) <= ceil((p + err) * N).
    -#' This method implements a variation of the Greenwald-Khanna algorithm (with some speed
    -#' optimizations). The algorithm was first present in [[http://dx.doi.org/10.1145/375663.375670
    -#' Space-efficient Online Computation of Quantile Summaries]] by Greenwald and Khanna.
    +#' If the SparkDataFrame has N elements and if we request the quantile at probability \strong{p} up
    --- End diff --
    
    SGTM




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13394#issuecomment-222354732
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on the pull request:

    https://github.com/apache/spark/pull/13394#issuecomment-222354451
  
    cc @felixcheung @shivaram @sun-rui 




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66029225
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -2046,6 +2054,7 @@ setMethod("merge",
                 joinRes
               })
     
    +#' generateAliasesForIntersectedCols
    --- End diff --
    
    this really shouldn't have a "title" - it's not an exported function.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #60611 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60611/consoleFull)** for PR 13394 at commit [`84bf2aa`](https://github.com/apache/spark/commit/84bf2aadba4d3cf92ca4803cfb608283e8237ab1).




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #59909 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59909/consoleFull)** for PR 13394 at commit [`432710e`](https://github.com/apache/spark/commit/432710e695a64bdca6c2b3d8d3866973f1662be7).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13394#issuecomment-222354733
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59589/
    Test FAILed.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #60362 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60362/consoleFull)** for PR 13394 at commit [`1d76a7f`](https://github.com/apache/spark/commit/1d76a7fe5bf53e1477a66ce493190efb5e2a2aef).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65434586
  
    --- Diff: R/pkg/R/stats.R ---
    @@ -135,13 +136,13 @@ setMethod("freqItems", signature(x = "SparkDataFrame", cols = "character"),
     #' Calculates the approximate quantiles of a numerical column of a SparkDataFrame.
     #'
     #' The result of this algorithm has the following deterministic bound:
    -#' If the SparkDataFrame has N elements and if we request the quantile at probability `p` up to
    -#' error `err`, then the algorithm will return a sample `x` from the SparkDataFrame so that the
    -#' *exact* rank of `x` is close to (p * N). More precisely,
    -#'   floor((p - err) * N) <= rank(x) <= ceil((p + err) * N).
    -#' This method implements a variation of the Greenwald-Khanna algorithm (with some speed
    -#' optimizations). The algorithm was first present in [[http://dx.doi.org/10.1145/375663.375670
    -#' Space-efficient Online Computation of Quantile Summaries]] by Greenwald and Khanna.
    +#' If the SparkDataFrame has N elements and if we request the quantile at probability \strong{p} up
    --- End diff --
    
    cc @felixcheung (we were partially discussing this in #13109)




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65849144
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -628,8 +628,6 @@ setMethod("repartition",
     #'
     #' @param x A SparkDataFrame
     #' @return A StringRRDD of JSON objects
    -#' @family SparkDataFrame functions
    --- End diff --
    
    @felixcheung I removed these two lines in the toJSON part. Correct me if I am wrong.




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #59648 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59648/consoleFull)** for PR 13394 at commit [`294cadd`](https://github.com/apache/spark/commit/294cadda4f790dd0e6df18501de363ce9aad0071).




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66741568
  
    --- Diff: R/pkg/R/mllib.R ---
    @@ -197,7 +201,7 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) {
       invisible(x)
       }
     
    -#' Make predictions from a generalized linear model
    +#' predict
    --- End diff --
    
    No need for this change.  We can keep the longer title.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65969103
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -249,6 +249,10 @@ col <- function(x) {
     #'
     #' Returns a Column based on the given column name.
     #'
    +#' Though scala functions has "col" function, we don't expose it in SparkR
    --- End diff --
    
    Is this out of date?  Isn't ```col``` exposed now?




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66917105
  
    --- Diff: R/pkg/R/mllib.R ---
    @@ -197,7 +201,7 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) {
       invisible(x)
       }
     
    -#' Make predictions from a generalized linear model
    +#' predict
    --- End diff --
    
    @jkbradley The reason I want to add a title here is that the title of the current documentation is `Make predictions from a generalized linear model`, not `predict`.
    ![qq20160613-0 2x](https://cloud.githubusercontent.com/assets/3419881/16033524/4ab5e288-31c1-11e6-892c-c9c15258cc05.png)
    But after adding the title `predict`, it looks like this:
    ![qq20160613-1 2x](https://cloud.githubusercontent.com/assets/3419881/16033671/24ab8fce-31c2-11e6-86a4-7b7771c10451.png)
    So which one do you think is better?





[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #60030 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60030/consoleFull)** for PR 13394 at commit [`c6d516a`](https://github.com/apache/spark/commit/c6d516ada3400eecd97766552e14be60aaf4eedb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #60365 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60365/consoleFull)** for PR 13394 at commit [`72bce54`](https://github.com/apache/spark/commit/72bce546f73ff42e7ebf4fe1358472b7cbcb8937).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59909/
    Test PASSed.




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/13394
  
    Thanks @vectorijk - I created https://issues.apache.org/jira/browse/SPARK-15672 for that




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66718189
  
    --- Diff: R/pkg/R/mllib.R ---
    @@ -197,11 +197,10 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) {
       invisible(x)
       }
     
    -#' Make predictions from a generalized linear model
    -#'
     #' Makes predictions from a generalized linear model produced by glm() or spark.glm(),
     #' similarly to R's predict().
     #'
    +#' @title predict
    --- End diff --
    
    Ok, I will do this in this PR.




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65267172
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -2514,7 +2529,9 @@ setMethod("attach",
     #' environment. Then, the given expression is evaluated in this new
     #' environment.
     #'
    +#' @title with
    --- End diff --
    
    @shivaram Is this supposed to be a long-form title, or just the name of the method? Looking at other examples, it looks like it should be a short description.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #60362 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60362/consoleFull)** for PR 13394 at commit [`1d76a7f`](https://github.com/apache/spark/commit/1d76a7fe5bf53e1477a66ce493190efb5e2a2aef).




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65410753
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -2514,7 +2529,9 @@ setMethod("attach",
     #' environment. Then, the given expression is evaluated in this new
     #' environment.
     #'
    +#' @title with
    --- End diff --
    
    @vectorijk Could you please update the PR this way?




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #3109 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3109/consoleFull)** for PR 13394 at commit [`2537b8f`](https://github.com/apache/spark/commit/2537b8ff114f9d33d0228d7aaeabafdb454609aa).




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65284357
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -2514,7 +2529,9 @@ setMethod("attach",
     #' environment. Then, the given expression is evaluated in this new
     #' environment.
     #'
    +#' @title with
    --- End diff --
    
    @shivaram Yes, I also noticed that the titles of other examples are not consistent. Which one should we use: a short description or just the name of the method?




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on the pull request:

    https://github.com/apache/spark/pull/13394
  
    @shivaram For updating the programming guide, I'd love to do this in a separate PR.




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66719273
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -851,6 +849,8 @@ setMethod("nrow",
                 count(x)
               })
     
    +#' ncol
    --- End diff --
    
    Yes, this doesn't seem to be consistent. I reverted these since the description is sufficient to explain.




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66741575
  
    --- Diff: R/pkg/R/mllib.R ---
    @@ -357,9 +361,9 @@ setMethod("summary", signature(object = "KMeansModel"),
                        cluster = cluster, is.loaded = is.loaded))
               })
     
    -#' Make predictions from a k-means model
    +#' predict
    --- End diff --
    
    ditto: keep long title




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65163283
  
    --- Diff: R/pkg/R/stats.R ---
    @@ -19,12 +19,11 @@
     
     setOldClass("jobj")
     
    -#' crosstab
    -#'
     #' Computes a pair-wise frequency table of the given columns. Also known as a contingency
     #' table. The number of distinct values for each column should be less than 1e4. At most 1e6
     #' non-zero pair frequencies will be returned.
     #'
    +#' @title Statistic functions for SparkDataFrames
    --- End diff --
    
    I will remove the title here. Meanwhile, I will leave the link revisions in place.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60361/
    Test PASSed.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #59909 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59909/consoleFull)** for PR 13394 at commit [`432710e`](https://github.com/apache/spark/commit/432710e695a64bdca6c2b3d8d3866973f1662be7).




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #60030 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60030/consoleFull)** for PR 13394 at commit [`c6d516a`](https://github.com/apache/spark/commit/c6d516ada3400eecd97766552e14be60aaf4eedb).




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13394




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66741560
  
    --- Diff: R/pkg/R/column.R ---
    @@ -170,6 +172,8 @@ setMethod("between", signature(x = "Column"),
                 }
               })
     
    +#' cast
    +#'
     #' Casts the column to a different data type.
    --- End diff --
    
    This can remain the title, right?




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66029121
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -851,6 +849,8 @@ setMethod("nrow",
                 count(x)
               })
     
    +#' ncol
    --- End diff --
    
    This doesn't seem consistent. At L713 and L675 we are changing the doc title to something descriptive, yet here and below we are changing the title to just the function name?




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60365/
    Test PASSed.




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66357796
  
    --- Diff: R/pkg/R/mllib.R ---
    @@ -197,11 +197,10 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) {
       invisible(x)
       }
     
    -#' Make predictions from a generalized linear model
    -#'
     #' Makes predictions from a generalized linear model produced by glm() or spark.glm(),
     #' similarly to R's predict().
     #'
    +#' @title predict
    --- End diff --
    
    I'm pretty sure roxygen2 only picks up the first title defined for that doc/rdname, so that probably won't help us much in that case...
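
A minimal roxygen2 sketch of the situation discussed in this comment, assuming two documentation blocks that share one `@rdname`. The function names and bodies are hypothetical placeholders, not the SparkR sources. According to the comment, the merged `predict` help page would be expected to show only the first title; that behaviour is not verified here against a specific roxygen2 version.

```r
# Two documentation blocks merged into one help page via a shared @rdname.
# The function names below are hypothetical placeholders for illustration.

#' Make predictions from a generalized linear model
#'
#' @param object A fitted model (placeholder).
#' @param newData Data to predict on (placeholder).
#' @rdname predict
predict_glm_sketch <- function(object, newData) {
  invisible(NULL)  # placeholder body; no prediction is performed
}

#' Make predictions from a k-means model
#'
#' @rdname predict
predict_kmeans_sketch <- function(object, newData) {
  invisible(NULL)  # placeholder body; no prediction is performed
}
```

If the comment is right, the second title would not appear on the generated page, which is why a per-method `@title` would not distinguish the methods on a shared page.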




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66029758
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -2766,18 +2780,21 @@ setMethod("histogram",
                 return(histStats)
               })
     
    -#' Saves the content of the SparkDataFrame to an external database table via JDBC
    +#' Save the content of DataFrame to an external database table via JDBC.
    --- End diff --
    
    ditto - "SparkDataFrame"




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the pull request:

    https://github.com/apache/spark/pull/13394#issuecomment-222572865
  
    > (c) the change to not require sqlContext
    This was in the earlier PR, under the migration guide section.




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66741577
  
    --- Diff: R/pkg/R/mllib.R ---
    @@ -582,9 +586,9 @@ setMethod("summary", signature(object = "AFTSurvivalRegressionModel"),
                 return(list(coefficients = coefficients))
               })
     
    -#' Make predictions from an AFT survival regression model
    +#' predict
    --- End diff --
    
    ditto: keep long title




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65285481
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -2514,7 +2529,9 @@ setMethod("attach",
     #' environment. Then, the given expression is evaluated in this new
     #' environment.
     #'
    +#' @title with
    --- End diff --
    
    I think we should follow the example of existing R packages and use the long form as the title. For example, if you look at https://stat.ethz.ch/R-manual/R-devel/library/stats/html/glm.html, the title of the page is "Fitting Generalized Linear Models".
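
As a rough sketch of the long-form convention: in roxygen2 the first paragraph of a block is used as the title when no `@title` tag is given, so a descriptive first sentence already produces a long-form title. The function `with_demo` below is a hypothetical stand-in, not the SparkR `with` method.

```r
#' Evaluate an expression in an environment built from a data frame
#'
#' The paragraph above becomes the page title, much like "Fitting Generalized
#' Linear Models" for glm. This second paragraph becomes the description.
#'
#' @param data a data.frame whose columns are made visible to `expr`
#' @param expr an expression to evaluate
#' @return the value of `expr`
with_demo <- function(data, expr) {
  # Evaluate the unevaluated expression with the columns of `data` in scope.
  eval(substitute(expr), data, parent.frame())
}
```

With this layout an explicit `@title with` line is unnecessary, and the short function name still appears as the page name and in the usage section.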




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    How about ```Predicted values based on model``` (no "object")?
    
    ```Compute histogram statistics for given column``` sounds good to me.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Yeah I think the approach used by @vectorijk is fine. We could have the title as `Model Predictions` instead of `predict` (this is what R uses when you do `?predict`) 




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66029614
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -2173,20 +2182,22 @@ setMethod("except",
                 dataFrame(excepted)
               })
     
    -#' Save the contents of the SparkDataFrame to a data source
    +#' Save the contents of DataFrame to a data source.
     #'
    -#' The data source is specified by the `source` and a set of options (...).
    -#' If `source` is not specified, the default data source configured by
    -#' spark.sql.sources.default will be used.
    +#' Save the contents of the SparkDataFrame to a data source. The data source is specified by the
    +#' `source` and a set of options (...). If `source` is not specified, the default data source
    --- End diff --
    
    hmm,
    ```The data source is specified by the
     `source` and a set of options (...).
    ```
    that doesn't sound quite right? Data source should be in the source parameter.




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13394
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    LGTM, thanks! Only a minor comment: in the cases where you have a short title (e.g. "predict", "Histogram"), can you think of a longer, more descriptive title? That would help make it consistent with everything else.





[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #60365 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60365/consoleFull)** for PR 13394 at commit [`72bce54`](https://github.com/apache/spark/commit/72bce546f73ff42e7ebf4fe1358472b7cbcb8937).




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13394#issuecomment-222354730
  
    **[Test build #59589 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59589/consoleFull)** for PR 13394 at commit [`7961bbe`](https://github.com/apache/spark/commit/7961bbe8b346ae47a70fa324b18219070197ded8).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65113846
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -629,8 +629,9 @@ setMethod("repartition",
     #' @param x A SparkDataFrame
     #' @return A StringRRDD of JSON objects
     #' @family SparkDataFrame functions
    -#' @rdname tojson
    -#' @noRd
    +#' @rdname toJSON
    +#' @name toJSON
    --- End diff --
    
    wait, why are we changing this from `@noRd`? this is not exported from SparkR and should not be documented.
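
For reference, a minimal sketch of how an internal, non-exported helper is usually annotated: the roxygen block stays in the source for developers, while `@noRd` suppresses the generated Rd file and no `@export` is given. The helper `quote_name` is hypothetical and unrelated to the real `toJSON` method.

```r
#' Wrap column names in backticks for display
#'
#' Internal helper. It is neither exported nor documented in the package
#' manual, so the block carries @noRd and has no @rdname, @name or @export.
#'
#' @param name a character vector of column names
#' @return a character vector of the same length
#' @noRd
quote_name <- function(name) paste0("`", name, "`")
```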




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/13394#issuecomment-222559900
  
    @felixcheung Could you take a look at this PR ?




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #60611 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60611/consoleFull)** for PR 13394 at commit [`84bf2aa`](https://github.com/apache/spark/commit/84bf2aadba4d3cf92ca4803cfb608283e8237ab1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    LGTM
    Merging with master
    Thanks!




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66719192
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -2766,18 +2780,21 @@ setMethod("histogram",
                 return(histStats)
               })
     
    -#' Saves the content of the SparkDataFrame to an external database table via JDBC
    +#' Save the content of DataFrame to an external database table via JDBC.
     #'
    -#' Additional JDBC database connection properties can be set (...)
    +#' Saves the content of the SparkDataFrame to an external database table via JDBC. Additional JDBC
    --- End diff --
    
    I also think we should make the leading verb consistent in number (e.g. "Save" vs. "Saves") across the descriptions.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #60361 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60361/consoleFull)** for PR 13394 at commit [`7216ba1`](https://github.com/apache/spark/commit/7216ba1a2b72d680a14f280079f7a08cbce75a32).




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13394#issuecomment-222386805
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59600/
    Test PASSed.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Thanks! @jkbradley @felixcheung @shivaram Sure. How about using the title `Predicted values based on model object` instead of `predict` (like [https://stat.ethz.ch/R-manual/R-devel/library/stats/html/predict.lm.html](https://stat.ethz.ch/R-manual/R-devel/library/stats/html/predict.lm.html)),
    and the title `Compute histogram statistics for given column` instead of `Histogram`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65268646
  
    --- Diff: R/pkg/R/stats.R ---
    @@ -135,13 +136,13 @@ setMethod("freqItems", signature(x = "SparkDataFrame", cols = "character"),
     #' Calculates the approximate quantiles of a numerical column of a SparkDataFrame.
     #'
     #' The result of this algorithm has the following deterministic bound:
    -#' If the SparkDataFrame has N elements and if we request the quantile at probability `p` up to
    -#' error `err`, then the algorithm will return a sample `x` from the SparkDataFrame so that the
    -#' *exact* rank of `x` is close to (p * N). More precisely,
    -#'   floor((p - err) * N) <= rank(x) <= ceil((p + err) * N).
    -#' This method implements a variation of the Greenwald-Khanna algorithm (with some speed
    -#' optimizations). The algorithm was first present in [[http://dx.doi.org/10.1145/375663.375670
    -#' Space-efficient Online Computation of Quantile Summaries]] by Greenwald and Khanna.
    +#' If the SparkDataFrame has N elements and if we request the quantile at probability \strong{p} up
    --- End diff --
    
    @shivaram  Looking at the doc page for statfunctions, a lot of functions are being mushed together.  E.g., "col1" and "col2" appear under the "Arguments" section many times.  What is the best way to separate these methods?
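
One possible way to separate such methods, sketched under the assumption that the merging comes from a shared `@rdname`: give each function its own `@rdname` so that its arguments are listed on its own page. The functions `crosstab_demo` and `cov_demo` are hypothetical base-R stand-ins, not the SparkR implementations.

```r
#' Compute a pair-wise frequency table of two columns
#'
#' @param x a data.frame
#' @param col1 name of the first column
#' @param col2 name of the second column
#' @return a contingency table
#' @rdname crosstab_demo
crosstab_demo <- function(x, col1, col2) table(x[[col1]], x[[col2]])

#' Compute the sample covariance of two numeric columns
#'
#' @param x a data.frame
#' @param col1 name of the first numeric column
#' @param col2 name of the second numeric column
#' @return a single numeric value
#' @rdname cov_demo
cov_demo <- function(x, col1, col2) stats::cov(x[[col1]], x[[col2]])
```

Because the two blocks no longer share a page, `col1` and `col2` appear once per page instead of being repeated under a single "Arguments" section. The trade-off is more pages and a less obvious grouping, which `@family` or `@seealso` links can partly restore.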




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #60364 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60364/consoleFull)** for PR 13394 at commit [`e096d58`](https://github.com/apache/spark/commit/e096d588ccb81a182bbad43c8010336c86afccd4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66182476
  
    --- Diff: R/pkg/R/mllib.R ---
    @@ -197,11 +197,10 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) {
       invisible(x)
       }
     
    -#' Make predictions from a generalized linear model
    -#'
     #' Makes predictions from a generalized linear model produced by glm() or spark.glm(),
     #' similarly to R's predict().
     #'
    +#' @title predict
    --- End diff --
    
    I think people have been trying to be consistent within the given R source file, but I totally agree that there are many different forms in use and it would be nice to have a single format for all R sources.




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13394
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59648/
    Test PASSed.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13394#issuecomment-222386802
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65113954
  
    --- Diff: R/pkg/R/stats.R ---
    @@ -19,12 +19,11 @@
     
     setOldClass("jobj")
     
    -#' crosstab
    -#'
     #' Computes a pair-wise frequency table of the given columns. Also known as a contingency
     #' table. The number of distinct values for each column should be less than 1e4. At most 1e6
     #' non-zero pair frequencies will be returned.
     #'
    +#' @title Statistic functions for SparkDataFrames
    --- End diff --
    
    please see PR #13109


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66028863
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -238,9 +238,9 @@ setMethod("ceil",
                 column(jc)
               })
     
    -#' Though scala functions has "col" function, we don't expose it in SparkR
    -#' because we don't want to conflict with the "col" function in the R base
    -#' package and we also have "column" function exported which is an alias of "col".
    +#' @rdname col
    +#' @name column
    +#' @export
    --- End diff --
    
    this is not exported - it should not have `@rdname`, `@name` or `@export` here.




[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66299402
  
    --- Diff: R/pkg/R/mllib.R ---
    @@ -197,11 +197,10 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) {
       invisible(x)
       }
     
    -#' Make predictions from a generalized linear model
    -#'
     #' Makes predictions from a generalized linear model produced by glm() or spark.glm(),
     #' similarly to R's predict().
     #'
    +#' @title predict
    --- End diff --
    
    I personally prefer using the first sentence as the convention to define the title. One thing though is I'm not sure how the titles get collated when we put multiple functions in the same doc page. @felixcheung do you know if using `@title` would make it easier to have a canonical title in that case ?


[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r67223092
  
    --- Diff: R/pkg/R/mllib.R ---
    @@ -197,7 +201,7 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) {
       invisible(x)
       }
     
    -#' Make predictions from a generalized linear model
    +#' predict
    --- End diff --
    
    Ohh, I see.  Thanks, yes, I think it's fine as you have it.


[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #59648 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59648/consoleFull)** for PR 13394 at commit [`294cadd`](https://github.com/apache/spark/commit/294cadda4f790dd0e6df18501de363ce9aad0071).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66718939
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -2766,18 +2780,21 @@ setMethod("histogram",
                 return(histStats)
               })
     
    -#' Saves the content of the SparkDataFrame to an external database table via JDBC
    +#' Save the content of DataFrame to an external database table via JDBC.
     #'
    -#' Additional JDBC database connection properties can be set (...)
    +#' Saves the content of the SparkDataFrame to an external database table via JDBC. Additional JDBC
    --- End diff --
    
    They should be the same, so I changed both to the plural form.


[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #60364 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60364/consoleFull)** for PR 13394 at commit [`e096d58`](https://github.com/apache/spark/commit/e096d588ccb81a182bbad43c8010336c86afccd4).


[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66019351
  
    --- Diff: R/pkg/R/functions.R ---
    @@ -249,6 +249,10 @@ col <- function(x) {
     #'
     #' Returns a Column based on the given column name.
     #'
    +#' Though scala functions has "col" function, we don't expose it in SparkR
    --- End diff --
    
    No. "col" is not exposed in SparkR.


[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65418448
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -2514,7 +2529,9 @@ setMethod("attach",
     #' environment. Then, the given expression is evaluated in this new
     #' environment.
     #'
    +#' @title with
    --- End diff --
    
    @jkbradley I will do it ASAP.


[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60364/
    Test PASSed.


[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65113870
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1069,7 +1080,10 @@ setMethod("first",
     #'
     #' @param x A SparkDataFrame
     #'
    -#' @noRd
    +#' @family SparkDataFrame functions
    +#' @rdname toRDD
    --- End diff --
    
    same here. RDD functions are not exported and not supported.
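
For context, a small sketch of what keeping `@noRd` means in practice, using a hypothetical internal helper (`to_rdd_demo` is an illustrative name, not a SparkR function). roxygen2 still parses the block, but `@noRd` tells it not to generate an Rd file, which suits functions that are neither exported nor supported.

```r
# Hypothetical internal helper, for illustration only.

#' Convert a named list of rows to an unnamed list (internal)
#'
#' @param x a list of rows
#' @return the same rows as an unnamed list
#' @noRd
to_rdd_demo <- function(x) {
  # No @export tag and no Rd page: the helper stays internal and undocumented
  # in the generated reference.
  unname(as.list(x))
}
```

Adding `@family` and `@rdname` to such a block would instead produce a help page and cross-links for a function users cannot call.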


[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65691008
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1069,6 +1079,8 @@ setMethod("first",
     #'
     #' @param x A SparkDataFrame
     #'
    +#' @family SparkDataFrame functions
    +#' @rdname tordd
    --- End diff --
    
    @felixcheung OK, should we also remove these two lines in the `toJSON` part at line 631?


[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66030066
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -2766,18 +2780,21 @@ setMethod("histogram",
                 return(histStats)
               })
     
    -#' Saves the content of the SparkDataFrame to an external database table via JDBC
    +#' Save the content of DataFrame to an external database table via JDBC.
     #'
    -#' Additional JDBC database connection properties can be set (...)
    +#' Saves the content of the SparkDataFrame to an external database table via JDBC. Additional JDBC
    --- End diff --
    
    I think we have had similar discussions in another PR, but I'm not sure I follow the reasoning for L2783 to be "Save the content of" and L2785 to be "Saves the content of". Why is one singular and the other plural?


[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #60473 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60473/consoleFull)** for PR 13394 at commit [`2537b8f`](https://github.com/apache/spark/commit/2537b8ff114f9d33d0228d7aaeabafdb454609aa).


[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60611/
    Test PASSed.


[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on the pull request:

    https://github.com/apache/spark/pull/13394#issuecomment-222378580
  
    Jenkins test this again please.


[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on the pull request:

    https://github.com/apache/spark/pull/13394#issuecomment-222385824
  
    Jenkins test this please


[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60030/
    Test PASSed.


[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r66741569
  
    --- Diff: R/pkg/R/mllib.R ---
    @@ -218,9 +222,9 @@ setMethod("predict", signature(object = "GeneralizedLinearRegressionModel"),
                 return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
               })
     
    -#' Make predictions from a naive Bayes model
    +#' predict
    --- End diff --
    
    Same here: keep longer title


[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #60473 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60473/consoleFull)** for PR 13394 at commit [`2537b8f`](https://github.com/apache/spark/commit/2537b8ff114f9d33d0228d7aaeabafdb454609aa).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13394
  
    **[Test build #60361 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60361/consoleFull)** for PR 13394 at commit [`7216ba1`](https://github.com/apache/spark/commit/7216ba1a2b72d680a14f280079f7a08cbce75a32).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13394#issuecomment-222354455
  
    **[Test build #59589 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59589/consoleFull)** for PR 13394 at commit [`7961bbe`](https://github.com/apache/spark/commit/7961bbe8b346ae47a70fa324b18219070197ded8).


[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65959598
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -647,7 +645,7 @@ setMethod("toJSON",
                 RDD(jrdd, serializedMode = "string")
               })
     
    -#' write.json
    +#' Save the contents of DataFrame as a JSON file
    --- End diff --
    
    Can we use SparkDataFrame as opposed to DataFrame? (See https://issues.apache.org/jira/browse/SPARK-12148 for some more details.)


[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65656790
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1069,6 +1079,8 @@ setMethod("first",
     #'
     #' @param x A SparkDataFrame
     #'
    +#' @family SparkDataFrame functions
    +#' @rdname tordd
    --- End diff --
    
    let's revert these 2 lines as well? thanks


[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

Posted by vectorijk <gi...@git.apache.org>.
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r67259794
  
    --- Diff: R/pkg/R/mllib.R ---
    @@ -402,6 +406,8 @@ setMethod("spark.naiveBayes", signature(data = "SparkDataFrame", formula = "form
             return(new("NaiveBayesModel", jobj = jobj))
         })
     
    +#' Save fitted MLlib model to the input path
    --- End diff --
    
    @jkbradley Likewise, I changed the title from `write.ml` to `Save fitted MLlib model to the input path`, rather than to `Save the Bernoulli naive Bayes model to the input path.`, for all four models.

