Posted to reviews@spark.apache.org by zhengruifeng <gi...@git.apache.org> on 2017/05/23 05:47:47 UTC

[GitHub] spark pull request #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTre...

GitHub user zhengruifeng opened a pull request:

    https://github.com/apache/spark/pull/18067

    [SPARK-20849][DOC][SPARKR]  Document R DecisionTree

    ## What changes were proposed in this pull request?
    1. Add an example for SparkR `decisionTree`.
    2. Document it in the user guide.
    
    ## How was this patch tested?
    Tested by submitting locally.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zhengruifeng/spark dt_example

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18067.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18067
    
----
commit 3d8172f98f0994fec9ff359dfca4e6fcddd85863
Author: Zheng RuiFeng <ru...@foxmail.com>
Date:   2017-05-23T03:56:20Z

    create pr

commit def3ef4635094955c20c7e9511ce681378794d34
Author: Zheng RuiFeng <ru...@foxmail.com>
Date:   2017-05-23T04:33:33Z

    update vignettes

commit f43ebe03115b0b22ed01b76925312dfbc7a2c8c0
Author: Zheng RuiFeng <ru...@foxmail.com>
Date:   2017-05-23T05:44:44Z

    update sparkr.md

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTre...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18067#discussion_r118411586
  
    --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd ---
    @@ -430,7 +430,7 @@ We use `svm` in package `e1071` as an example. We use all default settings excep
     costs <- exp(seq(from = log(1), to = log(1000), length.out = 5))
     train <- function(cost) {
       stopifnot(requireNamespace("e1071", quietly = TRUE))
    -  model <- e1071::svm(Species ~ ., data = iris, cost = cost)
    +  model <- e1071::svm(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = iris, cost = cost)
    --- End diff --
    
    this isn't reverted?




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    **[Test build #77325 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77325/testReport)** for PR 18067 at commit [`f413081`](https://github.com/apache/spark/commit/f4130816f9fc2a3771e080c3811312eabfe5bc81).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    why change to classification for trees?




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77324/
    Test PASSed.




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    **[Test build #77349 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77349/testReport)** for PR 18067 at commit [`48a9686`](https://github.com/apache/spark/commit/48a968667cdce59e3d8713220160a3d96b20afcd).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by zhengruifeng <gi...@git.apache.org>.
Github user zhengruifeng commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    @felixcheung Only because the `Titanic` dataset is often used to illustrate classification, as in the Kaggle contest (https://www.kaggle.com/c/titanic#evaluation).





[GitHub] spark pull request #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTre...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18067#discussion_r117906523
  
    --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd ---
    @@ -776,6 +778,19 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2)))
     head(predict(isoregModel, newDF))
     ```
     
    +#### Decision Tree
    +
    +`spark.decisionTree` fits a [decision tree](https://en.wikipedia.org/wiki/Decision_tree_learning) classification or regression model on a `SparkDataFrame`.
    +Users can call `summary` to get a summary of the fitted model, `predict` to make predictions, and `write.ml`/`read.ml` to save/load fitted models.
    +
    +We use the `longley` dataset to train a decision tree and make predictions:
    +
    +```{r, warning=FALSE}
    +df <- createDataFrame(longley)
    --- End diff --
    
    I'd suggest using a dataset without `.` in its column names if you can.
    It would probably be confusing if the examples emit warnings when users run them.




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    **[Test build #77351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77351/testReport)** for PR 18067 at commit [`1a97e42`](https://github.com/apache/spark/commit/1a97e42ea9305a043eccead7013ad35e9aa89f91).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by zhengruifeng <gi...@git.apache.org>.
Github user zhengruifeng commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Jenkins, retest this please




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by zhengruifeng <gi...@git.apache.org>.
Github user zhengruifeng commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Jenkins, retests this please




[GitHub] spark pull request #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTre...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18067#discussion_r118411791
  
    --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd ---
    @@ -776,6 +778,20 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2)))
     head(predict(isoregModel, newDF))
     ```
     
    +#### Decision Tree
    +
    +`spark.decisionTree` fits a [decision tree](https://en.wikipedia.org/wiki/Decision_tree_learning) classification or regression model on a `SparkDataFrame`.
    +Users can call `summary` to get a summary of the fitted model, `predict` to make predictions, and `write.ml`/`read.ml` to save/load fitted models.
    +
    +We use the `longley` dataset to train a decision tree and make predictions:
    +
    +```{r}
    +df <- createDataFrame(longley)
    --- End diff --
    
    As commented before, please check: I'm pretty sure `createDataFrame(longley)` will cause a warning.
    ```
    longley
         GNP.deflator     GNP Unemployed Armed.Forces Population Year Employed
    1947         83.0 234.289      235.6        159.0    107.608 1947   60.323
    1948         88.5 259.426      232.5        145.6    108.632 1948   61.122
    ```
    so our options are:
    - don't use longley (my earlier suggestion)
    - use longley but keep `warning=FALSE`




[GitHub] spark pull request #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTre...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18067#discussion_r118315143
  
    --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd ---
    @@ -776,6 +778,19 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2)))
     head(predict(isoregModel, newDF))
     ```
     
    +#### Decision Tree
    +
    +`spark.decisionTree` fits a [decision tree](https://en.wikipedia.org/wiki/Decision_tree_learning) classification or regression model on a `SparkDataFrame`.
    +Users can call `summary` to get a summary of the fitted model, `predict` to make predictions, and `write.ml`/`read.ml` to save/load fitted models.
    +
    +We use the `longley` dataset to train a decision tree and make predictions:
    +
    +```{r, warning=FALSE}
    +df <- createDataFrame(longley)
    --- End diff --
    
    Actually, I think there's some confusion: I didn't mean that you should stop using `.` in the formula.
    I meant that the reason we have `warning=FALSE` here is that `createDataFrame(longley)` causes a warning, because the dataset has columns with `.` in their names. We should avoid that if we can.
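
The warning described above can also be sidestepped by renaming the dotted columns before the data reaches `createDataFrame`. The following is a minimal base-R sketch, assuming (as the comment states) that the `.` in the column names is what triggers the warning. The name `longley_clean` is hypothetical, and the SparkR call is left commented out because it needs a running Spark session.

```r
# Copy the built-in longley data so the original data frame stays untouched.
longley_clean <- longley

# Replace every literal "." in the column names with "_".
# fixed = TRUE makes gsub treat "." as a plain character, not a regex wildcard.
names(longley_clean) <- gsub(".", "_", names(longley_clean), fixed = TRUE)

# Assumption: with no "." left in the column names, this call would not warn.
# df <- createDataFrame(longley_clean)
```

Only `GNP.deflator` and `Armed.Forces` change; the other five column names contain no `.` and are left as they are.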




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    **[Test build #77226 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77226/testReport)** for PR 18067 at commit [`f43ebe0`](https://github.com/apache/spark/commit/f43ebe03115b0b22ed01b76925312dfbc7a2c8c0).




[GitHub] spark pull request #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTre...

Posted by zhengruifeng <gi...@git.apache.org>.
Github user zhengruifeng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18067#discussion_r118423872
  
    --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd ---
    @@ -776,6 +778,20 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2)))
     head(predict(isoregModel, newDF))
     ```
     
    +#### Decision Tree
    +
    +`spark.decisionTree` fits a [decision tree](https://en.wikipedia.org/wiki/Decision_tree_learning) classification or regression model on a `SparkDataFrame`.
    +Users can call `summary` to get a summary of the fitted model, `predict` to make predictions, and `write.ml`/`read.ml` to save/load fitted models.
    +
    +We use the `longley` dataset to train a decision tree and make predictions:
    +
    +```{r}
    +df <- createDataFrame(longley)
    --- End diff --
    
    For option 2, do you mean using the `{r, warning=FALSE}` chunk header like the other examples?
    I think both are OK.
    Which do you prefer?




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    **[Test build #77230 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77230/testReport)** for PR 18067 at commit [`65cf494`](https://github.com/apache/spark/commit/65cf494a0f432c23ea83bc532942bb9c84febaaa).




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77226/
    Test FAILed.




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    **[Test build #77351 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77351/testReport)** for PR 18067 at commit [`1a97e42`](https://github.com/apache/spark/commit/1a97e42ea9305a043eccead7013ad35e9aa89f91).




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    **[Test build #77324 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77324/testReport)** for PR 18067 at commit [`76a9726`](https://github.com/apache/spark/commit/76a97260c7db9ce3aeff7e1c2b120a38f7d7fb22).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by zhengruifeng <gi...@git.apache.org>.
Github user zhengruifeng commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    @felixcheung Updated. I also updated the other formulas in `sparkr-vignettes.Rmd`.





[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77351/
    Test PASSed.




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77230/
    Test FAILed.




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    **[Test build #77239 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77239/testReport)** for PR 18067 at commit [`65cf494`](https://github.com/apache/spark/commit/65cf494a0f432c23ea83bc532942bb9c84febaaa).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    **[Test build #77325 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77325/testReport)** for PR 18067 at commit [`f413081`](https://github.com/apache/spark/commit/f4130816f9fc2a3771e080c3811312eabfe5bc81).




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77349/
    Test PASSed.




[GitHub] spark pull request #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTre...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18067




[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77239/



[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    **[Test build #77239 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77239/testReport)** for PR 18067 at commit [`65cf494`](https://github.com/apache/spark/commit/65cf494a0f432c23ea83bc532942bb9c84febaaa).



[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    **[Test build #77349 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77349/testReport)** for PR 18067 at commit [`48a9686`](https://github.com/apache/spark/commit/48a968667cdce59e3d8713220160a3d96b20afcd).



[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Merged build finished. Test FAILed.



[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    **[Test build #77226 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77226/testReport)** for PR 18067 at commit [`f43ebe0`](https://github.com/apache/spark/commit/f43ebe03115b0b22ed01b76925312dfbc7a2c8c0).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77325/



[GitHub] spark pull request #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTre...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18067#discussion_r118425359
  
    --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd ---
    @@ -776,6 +778,20 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2)))
     head(predict(isoregModel, newDF))
     ```
     
    +#### Decision Tree
    +
    +`spark.decisionTree` fits a [decision tree](https://en.wikipedia.org/wiki/Decision_tree_learning) classification or regression model on a `SparkDataFrame`.
    +Users can call `summary` to get a summary of the fitted model, `predict` to make predictions, and `write.ml`/`read.ml` to save/load fitted models.
    +
    +We use the `longley` dataset to train a decision tree and make predictions:
    +
    +```{r}
    +df <- createDataFrame(longley)
    --- End diff --
    
    Yes, but as mentioned, it would be better to use a data set that doesn't have a dot in its column names, such as `as.data.frame(Titanic)`.
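
For illustration, the suggestion above might look roughly like the following. This is only a sketch under stated assumptions, not necessarily the code that ended up in the vignette: it assumes a local Spark installation with a SparkR version that provides `spark.decisionTree`, and the `Survived ~ .` formula and `maxDepth = 2` setting are illustrative choices rather than anything stated in this thread. Unlike `longley`, whose columns include names such as `GNP.deflator` and `Armed.Forces`, the flattened `Titanic` table has no dots in its column names.

```r
library(SparkR)
sparkR.session()  # assumption: a local Spark installation is available

# Titanic is a contingency table; as.data.frame flattens it into the columns
# Class, Sex, Age, Survived and Freq, none of which contains a dot.
t <- as.data.frame(Titanic)
df <- createDataFrame(t)

# Illustrative choice: classify Survived from the remaining columns.
dtModel <- spark.decisionTree(df, Survived ~ ., type = "classification", maxDepth = 2)
summary(dtModel)

# Predict on the training data and inspect the first few rows.
predictions <- predict(dtModel, df)
head(predictions)

sparkR.session.stop()
```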



[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    **[Test build #77324 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77324/testReport)** for PR 18067 at commit [`76a9726`](https://github.com/apache/spark/commit/76a97260c7db9ce3aeff7e1c2b120a38f7d7fb22).



[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    merged to master



[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18067
  
    Merged build finished. Test PASSed.

