Posted to reviews@spark.apache.org by hhbyyh <gi...@git.apache.org> on 2017/02/16 22:45:22 UTC

[GitHub] spark pull request #16968: [SPARK-19337] [ML] [Dcoc] Documentation and examp...

GitHub user hhbyyh opened a pull request:

    https://github.com/apache/spark/pull/16968

    [SPARK-19337] [ML] [Dcoc] Documentation and examples for LinearSVC

    ## What changes were proposed in this pull request?
    
    Documentation and examples (Java, Scala, Python, R) for LinearSVC
    
    ## How was this patch tested?
    local doc generation


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hhbyyh/spark mlsvmdoc

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16968.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16968
    
----
commit 7a0829f99fa9f7f261362a182caecad171d5ab78
Author: Yuhao Yang <yu...@intel.com>
Date:   2017-02-16T22:38:30Z

    linearsvc doc and example

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #16968: [SPARK-19337] [ML] [Doc] Documentation and exampl...

Posted by hhbyyh <gi...@git.apache.org>.
Github user hhbyyh commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16968#discussion_r101840341
  
    --- Diff: docs/ml-classification-regression.md ---
    @@ -363,6 +363,51 @@ Refer to the [R API docs](api/R/spark.mlp.html) for more details.
     
     </div>
     
    +## Linear Support Vector Machine
    +
    +A [support vector machine](https://en.wikipedia.org/wiki/Support_vector_machine) constructs a hyperplane
    +or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification,
    +regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has
    +the largest distance to the nearest training-data point of any class (so-called functional margin),
    --- End diff --
    
    Thanks for the comment. I think both "large" and "long" can be used to describe distance, whereas "large" is more suitable for describing the numeric margin. Please let me know if you have a strong preference.
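
To make the "largest distance to the nearest training-data point" wording concrete, here is a minimal sketch in plain Python. It is not taken from the pull request and does not use Spark; the helper name `geometric_margin`, the toy points, and the two candidate hyperplanes are all assumptions made for illustration. It computes the smallest signed distance from any labelled point to a hyperplane `w.x + b = 0`, which is the geometric distance the sentence describes (the quoted text labels it "functional margin"; that label is left as written).

```python
import math


def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any labelled point to the hyperplane w.x + b = 0.

    Labels are +1 or -1. A positive result means every point lies on the
    correct side of the hyperplane.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )


# Hypothetical, linearly separable toy data.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [1, 1, -1, -1]

# Two candidate separating hyperplanes: x + y - 2 = 0 and x - 1 = 0.
margin_a = geometric_margin((1.0, 1.0), -2.0, points, labels)
margin_b = geometric_margin((1.0, 0.0), -1.0, points, labels)

# Both separate the classes; the one with the larger margin is preferred.
preferred = "A" if margin_a > margin_b else "B"
print(margin_a, margin_b, preferred)
```

Both hyperplanes classify the toy data correctly, so the comparison only shows which one keeps the nearest point farther away, which is the property the documentation sentence is describing.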




[GitHub] spark pull request #16968: [SPARK-19337] [ML] [Doc] Documentation and exampl...

Posted by hhbyyh <gi...@git.apache.org>.
Github user hhbyyh commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16968#discussion_r101872717
  
    --- Diff: docs/ml-classification-regression.md ---
    @@ -363,6 +363,44 @@ Refer to the [R API docs](api/R/spark.mlp.html) for more details.
     
     </div>
     
    +## Linear Support Vector Machine
    +
    +A [support vector machine](https://en.wikipedia.org/wiki/Support_vector_machine) constructs a hyperplane
    +or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification,
    +regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has
    +the largest distance to the nearest training-data points of any class (so-called functional margin),
    +since in general the larger the margin the lower the generalization error of the classifier. LinearSVC
    +in Spark ML supports binomial classification with linear SVM. Internally, it optimizes the 
    --- End diff --
    
    Just to be consistent with LR (logistic regression), but I'm not sure it's the common expression.




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Doc] Documentation and examples for ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73132/
    Test PASSed.




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Doc] Documentation and examples for ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    **[Test build #73071 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73071/testReport)** for PR 16968 at commit [`b888f35`](https://github.com/apache/spark/commit/b888f35372532e0766839068e0827454afed10aa).




[GitHub] spark pull request #16968: [SPARK-19337] [ML] [Doc] Documentation and exampl...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16968#discussion_r101851724
  
    --- Diff: docs/ml-classification-regression.md ---
    @@ -363,6 +363,44 @@ Refer to the [R API docs](api/R/spark.mlp.html) for more details.
     
     </div>
     
    +## Linear Support Vector Machine
    +
    +A [support vector machine](https://en.wikipedia.org/wiki/Support_vector_machine) constructs a hyperplane
    +or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification,
    +regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has
    +the largest distance to the nearest training-data points of any class (so-called functional margin),
    +since in general the larger the margin the lower the generalization error of the classifier. LinearSVC
    +in Spark ML supports binomial classification with linear SVM. Internally, it optimizes the 
    --- End diff --
    
    Actually, is there a reason you changed this to `binomial classification`?




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Doc] Documentation and examples for ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Doc] Documentation and examples for ...

Posted by hhbyyh <gi...@git.apache.org>.
Github user hhbyyh commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    Thanks for the comment @felixcheung 




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Dcoc] Documentation and examples for...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #16968: [SPARK-19337] [ML] [Doc] Documentation and exampl...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16968#discussion_r101876999
  
    --- Diff: docs/ml-classification-regression.md ---
    @@ -363,6 +363,44 @@ Refer to the [R API docs](api/R/spark.mlp.html) for more details.
     
     </div>
     
    +## Linear Support Vector Machine
    +
    +A [support vector machine](https://en.wikipedia.org/wiki/Support_vector_machine) constructs a hyperplane
    +or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification,
    +regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has
    +the largest distance to the nearest training-data points of any class (so-called functional margin),
    +since in general the larger the margin the lower the generalization error of the classifier. LinearSVC
    +in Spark ML supports binomial classification with linear SVM. Internally, it optimizes the 
    --- End diff --
    
    Do you have a link? I think "binary classification" is more commonly used.




[GitHub] spark pull request #16968: [SPARK-19337] [ML] [Doc] Documentation and exampl...

Posted by hhbyyh <gi...@git.apache.org>.
Github user hhbyyh commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16968#discussion_r101840357
  
    --- Diff: docs/ml-classification-regression.md ---
    @@ -363,6 +363,51 @@ Refer to the [R API docs](api/R/spark.mlp.html) for more details.
     
     </div>
     
    +## Linear Support Vector Machine
    +
    +A [support vector machine](https://en.wikipedia.org/wiki/Support_vector_machine) constructs a hyperplane
    +or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification,
    +regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has
    +the largest distance to the nearest training-data point of any class (so-called functional margin),
    +since in general the larger the margin the lower the generalization error of the classifier. LinearSVC
    +in Spark ML supports binary calssification with linear SVM. Internally, it optimizes the 
    --- End diff --
    
    Thanks




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Doc] Documentation and examples for ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Doc] Documentation and examples for ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73071/
    Test PASSed.




[GitHub] spark pull request #16968: [SPARK-19337] [ML] [Doc] Documentation and exampl...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16968#discussion_r101890081
  
    --- Diff: docs/ml-classification-regression.md ---
    @@ -363,6 +363,44 @@ Refer to the [R API docs](api/R/spark.mlp.html) for more details.
     
     </div>
     
    +## Linear Support Vector Machine
    +
    +A [support vector machine](https://en.wikipedia.org/wiki/Support_vector_machine) constructs a hyperplane
    +or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification,
    +regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has
    +the largest distance to the nearest training-data points of any class (so-called functional margin),
    +since in general the larger the margin the lower the generalization error of the classifier. LinearSVC
    +in Spark ML supports binomial classification with linear SVM. Internally, it optimizes the 
    --- End diff --
    
    FWIW I have never heard the term "binomial classification" and it doesn't show up in a Google search. I think it was a typo.




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Doc] Documentation and examples for ...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    merged to master.




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Dcoc] Documentation and examples for...

Posted by hhbyyh <gi...@git.apache.org>.
Github user hhbyyh commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    I see. I will drop the R example here; whichever PR goes in later can finish the document update.




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Doc] Documentation and examples for ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    **[Test build #73132 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73132/testReport)** for PR 16968 at commit [`165fbe4`](https://github.com/apache/spark/commit/165fbe430e691124a87dde9862df7b3ba3e3d4a2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #16968: [SPARK-19337] [ML] [Dcoc] Documentation and examp...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16968#discussion_r101683048
  
    --- Diff: docs/ml-classification-regression.md ---
    @@ -363,6 +363,51 @@ Refer to the [R API docs](api/R/spark.mlp.html) for more details.
     
     </div>
     
    +## Linear Support Vector Machine
    +
    +A [support vector machine](https://en.wikipedia.org/wiki/Support_vector_machine) constructs a hyperplane
    +or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification,
    +regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has
    +the largest distance to the nearest training-data point of any class (so-called functional margin),
    +since in general the larger the margin the lower the generalization error of the classifier. LinearSVC
    +in Spark ML supports binary calssification with linear SVM. Internally, it optimizes the 
    --- End diff --
    
    calssification -> classification




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Doc] Documentation and examples for ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    **[Test build #73071 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73071/testReport)** for PR 16968 at commit [`b888f35`](https://github.com/apache/spark/commit/b888f35372532e0766839068e0827454afed10aa).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #16968: [SPARK-19337] [ML] [Doc] Documentation and exampl...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/16968




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Dcoc] Documentation and examples for...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73020/
    Test PASSed.




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Dcoc] Documentation and examples for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    **[Test build #73020 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73020/testReport)** for PR 16968 at commit [`7a0829f`](https://github.com/apache/spark/commit/7a0829f99fa9f7f261362a182caecad171d5ab78).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `public class JavaLinearSVCExample `




[GitHub] spark pull request #16968: [SPARK-19337] [ML] [Dcoc] Documentation and examp...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16968#discussion_r101683098
  
    --- Diff: docs/ml-classification-regression.md ---
    @@ -363,6 +363,51 @@ Refer to the [R API docs](api/R/spark.mlp.html) for more details.
     
     </div>
     
    +## Linear Support Vector Machine
    +
    +A [support vector machine](https://en.wikipedia.org/wiki/Support_vector_machine) constructs a hyperplane
    +or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification,
    +regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has
    +the largest distance to the nearest training-data point of any class (so-called functional margin),
    --- End diff --
    
    "largest distance" -> "longest distance"? I think?




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Dcoc] Documentation and examples for...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    title should say
    `[SPARK-19337] [ML] [Dcoc]`
    ->
    `[SPARK-19337] [ML] [Doc]`




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Doc] Documentation and examples for ...

Posted by hhbyyh <gi...@git.apache.org>.
Github user hhbyyh commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    Thanks for the review. Updated to "binary".
    Also added the reference to the R example.




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Dcoc] Documentation and examples for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    **[Test build #73020 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73020/testReport)** for PR 16968 at commit [`7a0829f`](https://github.com/apache/spark/commit/7a0829f99fa9f7f261362a182caecad171d5ab78).




[GitHub] spark issue #16968: [SPARK-19337] [ML] [Doc] Documentation and examples for ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16968
  
    **[Test build #73132 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73132/testReport)** for PR 16968 at commit [`165fbe4`](https://github.com/apache/spark/commit/165fbe430e691124a87dde9862df7b3ba3e3d4a2).




[GitHub] spark pull request #16968: [SPARK-19337] [ML] [Doc] Documentation and exampl...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16968#discussion_r101924719
  
    --- Diff: docs/ml-classification-regression.md ---
    @@ -363,6 +363,44 @@ Refer to the [R API docs](api/R/spark.mlp.html) for more details.
     
     </div>
     
    +## Linear Support Vector Machine
    +
    +A [support vector machine](https://en.wikipedia.org/wiki/Support_vector_machine) constructs a hyperplane
    +or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification,
    +regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has
    +the largest distance to the nearest training-data points of any class (so-called functional margin),
    +since in general the larger the margin the lower the generalization error of the classifier. LinearSVC
    +in Spark ML supports binomial classification with linear SVM. Internally, it optimizes the 
    --- End diff --
    
    yes, let's fix that

