Posted to reviews@spark.apache.org by yanboliang <gi...@git.apache.org> on 2016/05/23 13:40:58 UTC

[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

GitHub user yanboliang opened a pull request:

    https://github.com/apache/spark/pull/13262

    [SPARK-11959] [SPARK-15484] [Doc] [ML] Document WLS and IRLS

    ## What changes were proposed in this pull request?
    Document WLS and IRLS.
    
    ## How was this patch tested?
    Document update, no tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yanboliang/spark spark-15484

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13262.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13262
    
----
commit 47107b2f25f3c805e3c10d0d99dfe29359e76c5e
Author: Yanbo Liang <yb...@gmail.com>
Date:   2016-05-23T13:39:20Z

    Document WLS and IRLS

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64601181
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares requires that the number of features be no more than 4096. For larger problems, use L-BFGS instead.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +MLlib implements [iteratively reweighted least squares (IRLS)](https://en.wikipedia.org/wiki/Iteratively_reweighted_least_squares) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    +It can be used to find the maximum likelihood estimates of a generalized linear model (GLM), find M-estimator in robust regression and other optimization problems.
    --- End diff --
    
    I had understood "developer" to designate developers of Spark applications, not developers who contribute to the core Apache Spark project. For example, we designate some public APIs as "developer API". At any rate, I'm fine leaving this. 
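
    For reference, the analytic solution mentioned in the quoted diff follows from setting the gradient of the objective to zero. A standard derivation (not part of the PR text), writing `A` for the matrix with rows $a_i^T$, $W = diag(w_1, \dots, w_n)$, and $s = \sum_i w_i$:

    `\[
    \frac{1}{s} A^T W (Ax - b) + \frac{\lambda}{\delta}\,diag(\sigma_j^2)\, x = 0
    \quad\Rightarrow\quad
    \left(A^T W A + s\,\frac{\lambda}{\delta}\,diag(\sigma_j^2)\right) x = A^T W b
    \]`

    This is the kind of linear system that gets assembled from one pass over the data and factorized on the driver.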




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by BenFradet <gi...@git.apache.org>.
Github user BenFradet commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64229161
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    +other first-order optimization.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS was used as solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +`spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass to collect necessary statistics to solve this function.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory of a single machine, and then we can solve the objective function through Cholesky factorization on driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares only supports the number of features is no more than 4096.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +`spark.ml` implements the method of iteratively reweighted least squares (IRLS) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    +It can be used to find maximum likelihood estimates of a generalized linear model (GLM), find M-estimator in robust regression and other optimization problems.
    --- End diff --
    
    The maximum likelihood...
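
    As background for the GLM use mentioned in the quoted diff: for a GLM with link function $g$, current linear predictor $\eta_i = a_i^T x$ and mean $\mu_i = g^{-1}(\eta_i)$, the textbook IRLS step (not quoted from the PR) forms a working label and working weight per observation,

    `\[
    z_i = \eta_i + (b_i - \mu_i)\, g'(\mu_i), \qquad
    \tilde{w}_i = \frac{w_i}{Var(\mu_i)\, g'(\mu_i)^2},
    \]`

    and then solves an ordinary weighted least squares problem in $(\tilde{w}_i, a_i, z_i)$.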




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64540348
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    --- End diff --
    
    We will switch the RDD-based MLlib APIs to maintenance mode in Spark 2.0. "MLlib" will mainly refer to the DataFrame-based API from then on, so I think it's OK to use MLlib.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13262




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-220996815
  
    @BenFradet Thanks for your comments, updated.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221219662
  
    **[Test build #59198 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59198/consoleFull)** for PR 13262 at commit [`7f80d78`](https://github.com/apache/spark/commit/7f80d781ab2a4094f46f6eb322879ef4506f8ac0).




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64446975
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    --- End diff --
    
    nit: "L1 and elastic net regularization"




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by BenFradet <gi...@git.apache.org>.
Github user BenFradet commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64228725
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    +other first-order optimization.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS was used as solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +`spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass to collect necessary statistics to solve this function.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory of a single machine, and then we can solve the objective function through Cholesky factorization on driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares only supports the number of features is no more than 4096.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +`spark.ml` implements the method of iteratively reweighted least squares (IRLS) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    +It can be used to find maximum likelihood estimates of a generalized linear model (GLM), find M-estimator in robust regression and other optimization problems.
    +Refer to [Iteratively Reweighted Least Squares for Maximum Likelihood Estimation, and some Robust and Resistant Alternatives](http://www.jstor.org/stable/2345503) for more information.
    +
    +It solves certain optimization problems by an iterative way:
    --- End diff --
    
    By an iterative way -> iteratively




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64447607
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    --- End diff --
    
    minor: "Quasi-Newton"




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64361895
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +The `spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares only supports the number of features is no more than 4096.
    --- End diff --
    
    I think limiting communication is only one of the reasons; the most important one is that we cannot solve the Cholesky factorization on a single machine if the number of features is more than 4096.
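
    To make that concrete, a single-machine sketch of the solve step in breeze on toy data (the real WeightedLeastSquares aggregates these statistics across the cluster in one pass, and only the factorization happens on the driver):

        import breeze.linalg.{cholesky, diag, DenseMatrix, DenseVector}

        // Toy data: n = 3 observations, m = 2 features.
        val a = DenseMatrix((1.0, 2.0), (3.0, 4.0), (5.0, 6.0))
        val b = DenseVector(1.0, 2.0, 3.0)
        val w = DenseVector(1.0, 2.0, 1.0)

        // The sufficient statistics are m x m and m x 1: their size depends
        // only on the number of features m, never on the number of rows n.
        val W   = diag(w)
        val ata = a.t * W * a   // A^T W A
        val atb = a.t * W * b   // A^T W b

        // Dense Cholesky costs O(m^3) time and O(m^2) memory; at m = 4096 the
        // Gram matrix alone holds ~16.8M doubles (~128 MiB), hence the cap.
        val chol = cholesky(ata)  // lower-triangular factor of the SPD system
        val x    = ata \ atb      // solve the normal equations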




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64602162
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    --- End diff --
    
    So this is supposed to be the actual `String` value they should set the solver param to if they want to use it? I don't see the point of including that here, since we document it in the API param docs, and it could change and leave the user guide out of sync. Further, I don't think most people will realize that is what those mean. It might be better to put them in quotes (e.g. "normal"), but I'd vote for not having them at all.
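
    For context, the word in the heading matches the `solver` param value; a hedged usage sketch (`training` is an assumed DataFrame):

        import org.apache.spark.ml.regression.LinearRegression

        // "normal" requests the WeightedLeastSquares path, "l-bfgs" the
        // iterative solver, and "auto" lets Spark choose between them.
        val lr = new LinearRegression()
          .setSolver("normal")
          .setRegParam(0.1)            // the normal path supports L2 only
        val model = lr.fit(training)   // `training` is assumed, not defined here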




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by BenFradet <gi...@git.apache.org>.
Github user BenFradet commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64226849
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    --- End diff --
    
    Rapider -> faster




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64361255
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +The `spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    --- End diff --
    
    There are no docs for breeze LBFGS; the Wikipedia page for L-BFGS is linked above. I think it makes sense to link the code here because most of the audience for the optimization section is developers or expert users.
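
    For readers following that link, a minimal sketch of the breeze API the Spark solvers call into, on a toy objective (not Spark code):

        import breeze.linalg.DenseVector
        import breeze.optimize.{DiffFunction, LBFGS}

        // Minimize f(w) = ||w - 3||^2. `calculate` returns (value, gradient),
        // which is the contract the Spark cost functions implement.
        val f = new DiffFunction[DenseVector[Double]] {
          def calculate(wv: DenseVector[Double]): (Double, DenseVector[Double]) = {
            val d = wv - 3.0
            (d dot d, d * 2.0)
          }
        }

        val lbfgs = new LBFGS[DenseVector[Double]](maxIter = 100, m = 10) // m = history size
        val wOpt  = lbfgs.minimize(f, DenseVector.zeros[Double](5))       // ~ all-3.0 vector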




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64252074
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +The `spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares only supports the number of features is no more than 4096.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +`spark.ml` implements iteratively reweighted least squares (IRLS) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    +It can be used to find the maximum likelihood estimates of a generalized linear model (GLM), find M-estimator in robust regression and other optimization problems.
    +Refer to [Iteratively Reweighted Least Squares for Maximum Likelihood Estimation, and some Robust and Resistant Alternatives](http://www.jstor.org/stable/2345503) for more information.
    +
    +It solves certain optimization problems iteratively:
    +
    +* linearize the objective at current solution and update corresponding weight.
    +* solve a weighted least squares (WLS) problem by WeightedLeastSquares.
    +* repeat above steps until convergence.
    +
    +Due to it involves solving a weighted least squares (WLS) problem by WeightedLeastSquares in each step of the iteration,
    +it also only supports the number of features is no more than 4096.
    --- End diff --
    
    "also only supports the number of features is no more than 4096" --> "also requires the number of features to be no more than 4096"




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64450283
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    --- End diff --
    
    Ah, well I see this was suggested above. I think it's a bit confusing to say "MLlib", but I will defer to others.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64423822
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares requires that the number of features be no more than 4096. For larger problems, use L-BFGS instead.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +MLlib implements [iteratively reweighted least squares (IRLS)](https://en.wikipedia.org/wiki/Iteratively_reweighted_least_squares) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    +It can be used to find the maximum likelihood estimates of a generalized linear model (GLM), find M-estimator in robust regression and other optimization problems.
    +Refer to [Iteratively Reweighted Least Squares for Maximum Likelihood Estimation, and some Robust and Resistant Alternatives](http://www.jstor.org/stable/2345503) for more information.
    +
    +It solves certain optimization problems iteratively:
    +
    +* linearize the objective at current solution and update corresponding weight.
    +* solve a weighted least squares (WLS) problem by WeightedLeastSquares.
    +* repeat above steps until convergence.
    +
    +Since it involves solving a weighted least squares (WLS) problem by WeightedLeastSquares in each iteration,
    --- End diff --
    
    While this is true, it does not provide any sort of explanation as to _why_ that restriction exists. I like the idea of explaining that the covariance matrix can fit into main memory with < 4096 features (usually). 
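    To make that explanation concrete, a back-of-the-envelope sketch (assuming dense double-precision storage of the m x m statistics; a packed triangular layout would roughly halve these numbers):

    ```scala
    // The aggregated A^T A statistics grow as numFeatures^2.
    def gramBytes(numFeatures: Long): Long = numFeatures * numFeatures * 8L

    gramBytes(4096) / (1024 * 1024)            // = 128 MB: fits in driver memory
    gramBytes(100000) / (1024L * 1024 * 1024)  // ~ 74 GB: does not
    ```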




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221228531
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59198/
    Test PASSed.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64757853
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
    -QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +Quasi-Newton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 and elastic net regularization.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{k=1}^n w_k} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    --- End diff --
    
    Using `\min_x` is probably more appropriate notation. 
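    For reference, the quoted objective with the suggested notation (a sketch of the proposed change, using the definitions already in the diff):

    ```latex
    \min_{x}\ \frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x - b_i)^2}{\sum_{k=1}^n w_k}
      + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    ```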




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221524773
  
    **[Test build #59271 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59271/consoleFull)** for PR 13262 at commit [`1364fc7`](https://github.com/apache/spark/commit/1364fc75290e23ed1191b66236734482e1dd3bcc).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221354484
  
    @yanboliang Thanks for the updates.  I responded above about a tiny fix.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64447478
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    --- End diff --
    
    Also, this sentence seems out of place. We don't mention that OWL-QN is also used in some Spark algorithms or where it is used. Perhaps we can give it its own section, or otherwise just make a small mention that it is a variant of L-BFGS that Spark uses for L1-regularized algorithms.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-222243031
  
    LGTM
    Merging with master and branch-2.0
    Thanks @yanboliang @BenFradet @sethah !




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64443848
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +The `spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    --- End diff --
    
    OK, sounds fine




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64541660
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares requires that the number of features be no more than 4096. For larger problems, use L-BFGS instead.
    --- End diff --
    
    Yes, we do not expose an L-BFGS API, but we do expose the param ```solver```, which controls the optimization method used to fit the linear model.
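    A minimal sketch of what that looks like from the user side (assuming a DataFrame `training` with `features`/`label` columns; for linear regression the `solver` param accepts `"auto"`, `"normal"`, and `"l-bfgs"`):

    ```scala
    import org.apache.spark.ml.regression.LinearRegression

    // Force the normal-equation (WeightedLeastSquares) path; this requires
    // numFeatures <= 4096, otherwise choose "l-bfgs" (or leave "auto").
    val lr = new LinearRegression()
      .setSolver("normal")
      .setRegParam(0.3)
    val model = lr.fit(training)
    ```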




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by BenFradet <gi...@git.apache.org>.
Github user BenFradet commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64227878
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    +other first-order optimization.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS was used as solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +`spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass to collect necessary statistics to solve this function.
    --- End diff --
    
    One pass over the data to collect the necessary statistics to solve it
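    A minimal sketch of that one pass (illustrative, not the actual `WeightedLeastSquares` code): each partition accumulates only the weighted sufficient statistics, and those summaries are all the driver needs to solve the system.

    ```scala
    import breeze.linalg.{DenseMatrix, DenseVector}

    // Accumulate A^T W A and A^T W b over rows (w_i, a_i, b_i) in one pass.
    // These m x m and m-sized summaries are what get shipped to the driver.
    def collectStats(
        rows: Iterator[(Double, DenseVector[Double], Double)],
        m: Int): (DenseMatrix[Double], DenseVector[Double]) = {
      val ata = DenseMatrix.zeros[Double](m, m)
      val atb = DenseVector.zeros[Double](m)
      rows.foreach { case (w, a, b) =>
        ata += (a * a.t) * w   // weighted outer product a a^T
        atb += a * (w * b)     // weighted a * b
      }
      (ata, atb)
    }
    ```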




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by BenFradet <gi...@git.apache.org>.
Github user BenFradet commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64227729
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    +other first-order optimization.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS was used as solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +`spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass to collect necessary statistics to solve this function.
    +Unlike the original dataset which can only be stored in distributed system,
    --- End diff --
    
    Distributed storage




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64252067
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +The `spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares only supports the number of features is no more than 4096.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +`spark.ml` implements iteratively reweighted least squares (IRLS) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    --- End diff --
    
    "spark.ml" --> "The DataFrames-based API" or just "MLlib"
    Also, how about a Wikipedia link instead of source code?
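    As a companion to the quoted IRLS section, a minimal sketch of the linearize / solve-WLS / repeat loop (illustrative; `reweight` and `solveWLS` are hypothetical stand-ins, the latter for a WeightedLeastSquares-style solver):

    ```scala
    import breeze.linalg.{norm, DenseVector}

    // Generic IRLS skeleton: each iteration linearizes the objective at the
    // current solution to get working weights/responses, then solves one WLS.
    def irls(
        reweight: DenseVector[Double] => (DenseVector[Double], DenseVector[Double]),
        solveWLS: (DenseVector[Double], DenseVector[Double]) => DenseVector[Double],
        init: DenseVector[Double],
        tol: Double = 1e-6,
        maxIter: Int = 25): DenseVector[Double] = {
      var x = init
      var done = false
      var iter = 0
      while (!done && iter < maxIter) {
        val (w, z) = reweight(x)     // working weights and working response
        val xNew = solveWLS(w, z)    // one weighted least squares solve
        done = norm(xNew - x) < tol
        x = xNew
        iter += 1
      }
      x
    }
    ```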




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221524890
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221865324
  
    **[Test build #59381 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59381/consoleFull)** for PR 13262 at commit [`23f1d7b`](https://github.com/apache/spark/commit/23f1d7b50ebbef96aa5b2048ce855802b242f96f).




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64536206
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +The `spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares only supports the number of features is no more than 4096.
    --- End diff --
    
    Sounds good, updated.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221875430
  
    **[Test build #59381 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59381/consoleFull)** for PR 13262 at commit [`23f1d7b`](https://github.com/apache/spark/commit/23f1d7b50ebbef96aa5b2048ce855802b242f96f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64226662
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    +other first-order optimization.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS was used as solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +`spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    --- End diff --
    
    ![image](https://cloud.githubusercontent.com/assets/1962026/15473185/a37f5e68-20b3-11e6-8c21-9bf3b6d204c4.png)





[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64603279
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
    -QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +Quasi-Newton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 and elastic net regularization.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{k=1}^n w_k} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in a distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    --- End diff --
    
    This isn't true in general. We should say that these statistics can be loaded into memory _if the number of features is relatively small_. It might be good to indicate that the size of these statistics depends on `numFeatures^2`.
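
    To make the size argument concrete, here is a minimal local sketch of the statistics
    in question, using breeze (an illustration, not Spark's WeightedLeastSquares: the
    names are invented, and the normalization by the weight sum, regularization, and
    standardization are omitted):

    ```scala
    import breeze.linalg.{DenseMatrix, DenseVector}

    object NormalEquationSketch {
      // Solves min_x sum_i w_i (a_i^T x - b_i)^2 via A^T W A x = A^T W b.
      def solve(a: DenseMatrix[Double], b: DenseVector[Double],
                w: DenseVector[Double]): DenseVector[Double] = {
        // The only state that must fit in driver memory is the m x m Gram
        // matrix and a length-m vector, i.e. O(numFeatures^2) regardless of n.
        val wa  = DenseMatrix.tabulate(a.rows, a.cols)((i, j) => w(i) * a(i, j))
        val ata = a.t * wa                                                // A^T W A
        val atb = a.t * DenseVector.tabulate(w.length)(i => w(i) * b(i)) // A^T W b
        ata \ atb  // solve the m x m system locally; Spark uses Cholesky here
      }
    }
    ```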




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221875589
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221523746
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by BenFradet <gi...@git.apache.org>.
Github user BenFradet commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64227280
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    +other first-order optimization.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS was used as solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +`spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    --- End diff --
    
    The spark.ml ...
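
    Since this line only links to the breeze source, a minimal standalone sketch of
    driving breeze's `LBFGS` directly may help orient readers; the toy objective and
    the parameter values below are invented for illustration:

    ```scala
    import breeze.linalg.DenseVector
    import breeze.optimize.{DiffFunction, LBFGS}

    object LbfgsSketch extends App {
      // Toy objective f(x) = ||x - 3||^2 with its analytic gradient 2(x - 3).
      val f = new DiffFunction[DenseVector[Double]] {
        def calculate(x: DenseVector[Double]): (Double, DenseVector[Double]) = {
          val d = x - 3.0
          (d dot d, d * 2.0)
        }
      }
      // m is the number of stored corrections used to approximate the Hessian.
      val lbfgs = new LBFGS[DenseVector[Double]](maxIter = 100, m = 10)
      println(lbfgs.minimize(f, DenseVector.zeros[Double](5)))  // ~3.0 everywhere
    }
    ```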




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64541319
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    --- End diff --
    
    We have different ```solver``` options for linear models that users can set, and the word in brackets is the name of the solver.
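
    For context, a usage sketch of selecting a solver by the bracketed name on a linear
    model (assuming a DataFrame `training` with `label` and `features` columns;
    "normal" picks the WeightedLeastSquares path discussed below):

    ```scala
    import org.apache.spark.ml.regression.LinearRegression

    val lr = new LinearRegression()
      .setSolver("normal")  // "auto" and "l-bfgs" are the other accepted values
      .setRegParam(0.1)
    val model = lr.fit(training)
    ```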




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by BenFradet <gi...@git.apache.org>.
Github user BenFradet commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64228369
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    +other first-order optimization.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS was used as solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +`spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass to collect necessary statistics to solve this function.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory of a single machine, and then we can solve the objective function through Cholesky factorization on driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares only supports the number of features is no more than 4096.
    --- End diff --
    
    Only supports 4096 or fewer features
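
    A quick back-of-the-envelope for why 4096 features is a comfortable cap, assuming
    the statistics are kept as a packed upper-triangular Gram matrix of doubles (an
    assumption about the storage layout):

    ```scala
    val m = 4096
    val triangularEntries = m.toLong * (m + 1) / 2    // 8,390,656 entries
    val gramMb = triangularEntries * 8.0 / (1 << 20)  // ~64 MB of doubles
    // A full dense m x m workspace is still only ~128 MB, but the O(m^3)
    // cost of the Cholesky factorization grows quickly beyond this point.
    ```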




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221222850
  
    @jkbradley Thanks for your comments. I think most of the audience for the optimization section consists of developers and expert users, so I linked both the Wikipedia docs and the source code for each concept. If you have any more comments, please feel free to let me know.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by BenFradet <gi...@git.apache.org>.
Github user BenFradet commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64231083
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    +other first-order optimization.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS was used as solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +`spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass to collect necessary statistics to solve this function.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory of a single machine, and then we can solve the objective function through Cholesky factorization on driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares only supports the number of features is no more than 4096.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +`spark.ml` implements the method of iteratively reweighted least squares (IRLS) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    +It can be used to find maximum likelihood estimates of a generalized linear model (GLM), find M-estimator in robust regression and other optimization problems.
    +Refer to [Iteratively Reweighted Least Squares for Maximum Likelihood Estimation, and some Robust and Resistant Alternatives](http://www.jstor.org/stable/2345503) for more information.
    +
    +It solves certain optimization problems by an iterative way:
    +
    +* linearize the objective at current solution and update corresponding weight.
    --- End diff --
    
    I'm not too sure about this sentence; should it be: linearize the objective function at the current solution and update the corresponding weights?
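
    To make the two steps concrete, here is a textbook-style sketch of IRLS for one
    familiar case, logistic regression, using breeze (an illustration, not Spark's
    IterativelyReweightedLeastSquares): linearizing the objective at the current
    solution yields per-row working weights and working labels, and each iteration is
    then a single WLS solve.

    ```scala
    import breeze.linalg.{DenseMatrix, DenseVector, norm}
    import breeze.numerics.sigmoid

    object IrlsSketch {
      // IRLS for logistic regression with labels b in {0, 1}.
      def fit(a: DenseMatrix[Double], b: DenseVector[Double],
              maxIter: Int = 25, tol: Double = 1e-8): DenseVector[Double] = {
        var x = DenseVector.zeros[Double](a.cols)
        var converged = false
        var iter = 0
        while (iter < maxIter && !converged) {
          val eta = a * x
          val mu  = sigmoid(eta)
          // Linearize at the current solution: working weights and labels.
          // (A production solver would guard against mu(i) * (1 - mu(i)) ~ 0.)
          val w = DenseVector.tabulate(b.length)(i => mu(i) * (1.0 - mu(i)))
          val z = DenseVector.tabulate(b.length)(i => eta(i) + (b(i) - mu(i)) / w(i))
          // One weighted least squares step on the triple (w, a, z).
          val wa   = DenseMatrix.tabulate(a.rows, a.cols)((i, j) => w(i) * a(i, j))
          val xNew = (a.t * wa) \ (a.t * DenseVector.tabulate(b.length)(i => w(i) * z(i)))
          converged = norm(xNew - x) < tol
          x = xNew
          iter += 1
        }
        x
      }
    }
    ```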




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64252057
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +The `spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    --- End diff --
    
    Link to their docs, rather than code?




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64252059
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +The `spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    --- End diff --
    
    I'd prefer a link to Wikipedia than the source code, or both if you want to keep the source link.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64602664
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
    -QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +Quasi-Newton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 and elastic net regularization.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{k=1}^n w_k} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    --- End diff --
    
    minor: "the label"




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64443858
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +The `spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares only supports the number of features is no more than 4096.
    --- End diff --
    
    True, it's also for the decomposition.  I'd rephrase at least: "In order to take the normal equation approach efficiently" --> "In order to make the normal equation approach efficient"




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64444684
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    --- End diff --
    
    We should replace "MLlib" with `spark.ml` or similar here and elsewhere.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64252055
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    --- End diff --
    
    "when computing" ---> "unlike computing"




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221006039
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by BenFradet <gi...@git.apache.org>.
Github user BenFradet commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64227193
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    +other first-order optimization.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS was used as solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    --- End diff --
    
    Is used as a solver




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64252077
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +The `spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares only supports the number of features is no more than 4096.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +`spark.ml` implements iteratively reweighted least squares (IRLS) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    +It can be used to find the maximum likelihood estimates of a generalized linear model (GLM), find M-estimator in robust regression and other optimization problems.
    +Refer to [Iteratively Reweighted Least Squares for Maximum Likelihood Estimation, and some Robust and Resistant Alternatives](http://www.jstor.org/stable/2345503) for more information.
    +
    +It solves certain optimization problems iteratively:
    +
    +* linearize the objective at current solution and update corresponding weight.
    +* solve a weighted least squares (WLS) problem by WeightedLeastSquares.
    +* repeat above steps until convergence.
    +
    +Due to it involves solving a weighted least squares (WLS) problem by WeightedLeastSquares in each step of the iteration,
    +it also only supports the number of features is no more than 4096.
    +Currently IRLS was used as the default solver of [GeneralizedLinearRegression](api/scala/index.html#org.apache.spark.ml.regression.GeneralizedLinearRegression).
    --- End diff --
    
    "was" --> "is"




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-220994051
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221524891
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59271/
    Test PASSed.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221945362
  
    I think it's OK to leave the duplicate text in the MLlib guide.  It will just sit there unchanging and should not require extra effort to maintain.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64252070
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +The `spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares only supports the number of features is no more than 4096.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +`spark.ml` implements iteratively reweighted least squares (IRLS) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    +It can be used to find the maximum likelihood estimates of a generalized linear model (GLM), find M-estimator in robust regression and other optimization problems.
    +Refer to [Iteratively Reweighted Least Squares for Maximum Likelihood Estimation, and some Robust and Resistant Alternatives](http://www.jstor.org/stable/2345503) for more information.
    +
    +It solves certain optimization problems iteratively:
    +
    +* linearize the objective at current solution and update corresponding weight.
    +* solve a weighted least squares (WLS) problem by WeightedLeastSquares.
    +* repeat above steps until convergence.
    +
    +Due to it involves solving a weighted least squares (WLS) problem by WeightedLeastSquares in each step of the iteration,
    --- End diff --
    
    "Due to it" --> "Since it"
    "each step of the iteration" --> "each iteration"




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64453129
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    --- End diff --
    
    minor: "in distributed fashion" or "in a distributed system".




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64541027
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares requires that the number of features be no more than 4096. For larger problems, use L-BFGS instead.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +MLlib implements [iteratively reweighted least squares (IRLS)](https://en.wikipedia.org/wiki/Iteratively_reweighted_least_squares) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    +It can be used to find the maximum likelihood estimates of a generalized linear model (GLM), find M-estimator in robust regression and other optimization problems.
    --- End diff --
    
     I think most of the audience for this section is developers rather than users, as you can see from the title ```Optimization of linear methods (developer)```, so I explain the design considerations behind the IRLS implementation. Developers who have ideas for implementing algorithms based on IRLS can contribute them to Spark MLlib.
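
As a concrete example of the kind of algorithm a developer could build on top
of IRLS (an illustrative sketch, not something Spark currently ships): an
M-estimator for robust regression only changes the reweighting rule, e.g. the
Huber weight, which keeps full weight for small residuals and down-weights
large ones so outliers pull less:

    // Hypothetical Huber reweighting: r is the residual b_i - a_i^T x at the
    // current solution; k = 1.345 is the usual tuning constant.
    def huberWeight(r: Double, k: Double = 1.345): Double = {
      val absR = math.abs(r)
      if (absR <= k) 1.0 else k / absR
    }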




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221228416
  
    **[Test build #59198 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59198/consoleFull)** for PR 13262 at commit [`7f80d78`](https://github.com/apache/spark/commit/7f80d781ab2a4094f46f6eb322879ef4506f8ac0).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64452968
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares requires that the number of features be no more than 4096. For larger problems, use L-BFGS instead.
    --- End diff --
    
    Does it make sense to direct readers to "use L-BFGS" instead? Spark doesn't expose an L-BFGS api nor does it even expose the normal equation solver to users. 
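
The Breeze implementation the diff links to is public, though, so a developer
who needs raw L-BFGS can call it directly. A minimal standalone sketch,
assuming only Breeze on the classpath:

    import breeze.linalg.DenseVector
    import breeze.optimize.{DiffFunction, LBFGS}

    // Minimize f(x) = ||x - 3||^2, whose gradient is 2(x - 3).
    val f = new DiffFunction[DenseVector[Double]] {
      def calculate(x: DenseVector[Double]): (Double, DenseVector[Double]) = {
        val diff = x - 3.0
        (diff dot diff, diff * 2.0)
      }
    }
    val lbfgs = new LBFGS[DenseVector[Double]](maxIter = 100, m = 10)
    val xOpt = lbfgs.minimize(f, DenseVector.zeros[Double](5))  // -> (3, 3, 3, 3, 3)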




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-220997417
  
    **[Test build #59129 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59129/consoleFull)** for PR 13262 at commit [`a9cbf55`](https://github.com/apache/spark/commit/a9cbf55e0741176fd117746f4663a5d0aae6920a).




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-220994054
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59128/
    Test PASSed.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64450888
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares requires that the number of features be no more than 4096. For larger problems, use L-BFGS instead.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +MLlib implements [iteratively reweighted least squares (IRLS)](https://en.wikipedia.org/wiki/Iteratively_reweighted_least_squares) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    +It can be used to find the maximum likelihood estimates of a generalized linear model (GLM), find M-estimator in robust regression and other optimization problems.
    +Refer to [Iteratively Reweighted Least Squares for Maximum Likelihood Estimation, and some Robust and Resistant Alternatives](http://www.jstor.org/stable/2345503) for more information.
    +
    +It solves certain optimization problems iteratively:
    +
    +* linearize the objective at current solution and update corresponding weight.
    +* solve a weighted least squares (WLS) problem by WeightedLeastSquares.
    +* repeat above steps until convergence.
    +
    +Since it involves solving a weighted least squares (WLS) problem by WeightedLeastSquares in each iteration,
    --- End diff --
    
    Maybe I was looking at an outdated diff before; I see now that you've explained this above. Maybe add a "see above for further information on this constraint"?
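
For readers wondering where the 4096 limit comes from, a rough estimate
(assuming the solver materializes the upper triangle of the $m \times m$
normal-equation matrix on the driver): for $m = 4096$,

    \[
    \frac{m(m+1)}{2} \approx 8.4 \times 10^6 \text{ doubles} \approx 67\,\mathrm{MB},
    \]

and about 134 MB for the full dense matrix. Doubling the feature count
quadruples this, which is why the normal-equation path caps $m$ while L-BFGS,
which never forms the matrix, does not.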




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64445481
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares requires that the number of features be no more than 4096. For larger problems, use L-BFGS instead.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +MLlib implements [iteratively reweighted least squares (IRLS)](https://en.wikipedia.org/wiki/Iteratively_reweighted_least_squares) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    +It can be used to find the maximum likelihood estimates of a generalized linear model (GLM), find M-estimator in robust regression and other optimization problems.
    --- End diff --
    
    What is the point of explaining what IRLS _could_ be used for, even though it is not used for that in Spark? If it were a public interface, I could understand giving examples of other ways it could be used, but as things are currently I'm not sure this is necessary. If anything, it might seem like we're suggesting users can utilize IRLS for these other problems, but we don't provide a public API for it, which is misleading.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-220983670
  
    **[Test build #59128 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59128/consoleFull)** for PR 13262 at commit [`47107b2`](https://github.com/apache/spark/commit/47107b2f25f3c805e3c10d0d99dfe29359e76c5e).




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64226613
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    +other first-order optimization.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS was used as solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +`spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    --- End diff --
    
    ![image](https://cloud.githubusercontent.com/assets/1962026/15473179/96fb8e14-20b3-11e6-890e-b8b156f58ee3.png)





[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by BenFradet <gi...@git.apache.org>.
Github user BenFradet commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64228524
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    +other first-order optimization.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS was used as solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +`spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass to collect necessary statistics to solve this function.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory of a single machine, and then we can solve the objective function through Cholesky factorization on driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares only supports the number of features is no more than 4096.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +`spark.ml` implements the method of iteratively reweighted least squares (IRLS) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    --- End diff --
    
    I'd remove "the method"




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by BenFradet <gi...@git.apache.org>.
Github user BenFradet commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64228076
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    +other first-order optimization.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS was used as solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +`spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass to collect necessary statistics to solve this function.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory of a single machine, and then we can solve the objective function through Cholesky factorization on driver.
    --- End diff --
    
    "On a single machine" or "the memory of a single machine"




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221875592
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59381/
    Test PASSed.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64252064
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +The `spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares only supports the number of features is no more than 4096.
    --- End diff --
    
    In order to limit communication, WeightedLeastSquares requires that the number of features be no more than 4096.  For larger problems, use L-BFGS instead.
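
As a usage-level illustration of that split (a sketch assuming the Spark 2.0-era API, where LinearRegression's `solver` param accepts "auto", "normal" and "l-bfgs"):

```scala
import org.apache.spark.ml.regression.LinearRegression

// Normal-equation path: at most 4096 features, L2 regularization only.
val lrNormal = new LinearRegression()
  .setSolver("normal")
  .setRegParam(0.1)
  .setElasticNetParam(0.0)   // the normal solver supports only L2

// For wider feature vectors, or L1/elastic-net penalties, use L-BFGS.
val lrLarge = new LinearRegression()
  .setSolver("l-bfgs")
```

With "auto" (the default), the estimator is expected to pick the normal-equation path whenever the problem qualifies.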




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221514935
  
    **[Test build #59270 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59270/consoleFull)** for PR 13262 at commit [`2bc030c`](https://github.com/apache/spark/commit/2bc030cccafaf95cde5486bad8503a74cebc8a38).




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64448057
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares requires that the number of features be no more than 4096. For larger problems, use L-BFGS instead.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +MLlib implements [iteratively reweighted least squares (IRLS)](https://en.wikipedia.org/wiki/Iteratively_reweighted_least_squares) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    +It can be used to find the maximum likelihood estimates of a generalized linear model (GLM), find M-estimator in robust regression and other optimization problems.
    +Refer to [Iteratively Reweighted Least Squares for Maximum Likelihood Estimation, and some Robust and Resistant Alternatives](http://www.jstor.org/stable/2345503) for more information.
    +
    +It solves certain optimization problems iteratively:
    --- End diff --
    
    "It solves certain optimization problems iteratively through the following procedure:" ? It makes it more clear what the bullet points below are explaining.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221523749
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59270/
    Test PASSed.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221892125
  
    Should we point the corresponding section in the MLlib guide to the section here now? Otherwise we leave text that is duplicated verbatim across the user guides.
    
    One other minor comment and this LGTM.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64452601
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    --- End diff --
    
    What is the "(normal)" supposed to designate?




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221006071
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59129/
    Test PASSed.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221521088
  
    @sethah @jkbradley Thanks for your comments! I addressed most of them and responded inline.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221516131
  
    **[Test build #59271 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59271/consoleFull)** for PR 13262 at commit [`1364fc7`](https://github.com/apache/spark/commit/1364fc75290e23ed1191b66236734482e1dd3bcc).




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221523625
  
    **[Test build #59270 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59270/consoleFull)** for PR 13262 at commit [`2bc030c`](https://github.com/apache/spark/commit/2bc030cccafaf95cde5486bad8503a74cebc8a38).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221032845
  
    Done with pass




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by BenFradet <gi...@git.apache.org>.
Github user BenFradet commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-220993006
  
    @yanboliang lgtm except for a few phrasing issues.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64538677
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
    -QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +Quasi-Newton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 and elastic net regularization.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{k=1}^n w_k} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    --- End diff --
    
    ![image](https://cloud.githubusercontent.com/assets/1962026/15534277/ec16b760-221c-11e6-8b22-0f4a6ac31a7d.png)
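
Transcribed for the plain-text archive, the rendered formula in the screenshot matches the corrected objective in the diff above (note the $k$ index in the denominator):

`\[
\min_{x} \frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x - b_i)^2}{\sum_{k=1}^n w_k} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
\]`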





[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221005962
  
    **[Test build #59129 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59129/consoleFull)** for PR 13262 at commit [`a9cbf55`](https://github.com/apache/spark/commit/a9cbf55e0741176fd117746f4663a5d0aae6920a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by BenFradet <gi...@git.apache.org>.
Github user BenFradet commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64226913
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    +other first-order optimization.
    --- End diff --
    
    Optimizations
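
Since the quoted paragraph describes L-BFGS only abstractly, here is a tiny self-contained run of the breeze implementation the docs link to, minimizing a toy quadratic (an editorial sketch, not code from the PR):

```scala
import breeze.linalg.DenseVector
import breeze.optimize.{DiffFunction, LBFGS}

// f(x) = ||x - 3||^2 with its analytic gradient 2(x - 3). L-BFGS only
// ever sees (value, gradient) pairs; no Hessian is formed explicitly.
val f = new DiffFunction[DenseVector[Double]] {
  def calculate(x: DenseVector[Double]): (Double, DenseVector[Double]) = {
    val d = x - 3.0
    (d dot d, d * 2.0)
  }
}

val lbfgs = new LBFGS[DenseVector[Double]](maxIter = 100, m = 10) // m = gradient history size
val xOpt  = lbfgs.minimize(f, DenseVector.zeros[Double](5))       // converges to all 3.0s
```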




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64447821
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares requires that the number of features be no more than 4096. For larger problems, use L-BFGS instead.
    +
    +## Iteratively re-weighted least squares (IRLS)
    --- End diff --
    
    I don't really have a preference for "re-weighted" vs "reweighted", but we should be consistent.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221015861
  
    Taking a final look




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-221228530
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13262#issuecomment-220993917
  
    **[Test build #59128 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59128/consoleFull)** for PR 13262 at commit [`47107b2`](https://github.com/apache/spark/commit/47107b2f25f3c805e3c10d0d99dfe29359e76c5e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64453434
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    --- End diff --
    
    This sentence is quite confusing to me. What is the "standardizing features and labels" part supposed to mean? Should it read: "... and provides options to enable or disable regularization and standardization" ?
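
For reference, those options as they surface on a public estimator (a sketch assuming the Spark 2.0-era API; WeightedLeastSquares itself is internal and is reached through params like these):

```scala
import org.apache.spark.ml.regression.LinearRegression

val lr = new LinearRegression()
  .setSolver("normal")
  .setRegParam(0.3)            // set to 0.0 to disable L2 regularization
  .setStandardization(false)   // toggle standardization before fitting
  .setFitIntercept(true)
```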




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64539181
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) unlike computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves faster convergence compared with 
    +other first-order optimizations.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS is used as a solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +MLlib L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +MLlib implements normal equation solver for [weighted least squares](https://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares) by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory on a single machine, and then we can solve the objective function through Cholesky factorization on the driver.
    +
    +WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization, standardizing features and labels.
    +In order to take the normal equation approach efficiently, WeightedLeastSquares requires that the number of features be no more than 4096. For larger problems, use L-BFGS instead.
    +
    +## Iteratively re-weighted least squares (IRLS)
    +
    +MLlib implements [iteratively reweighted least squares (IRLS)](https://en.wikipedia.org/wiki/Iteratively_reweighted_least_squares) by [IterativelyReweightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala).
    +It can be used to find the maximum likelihood estimates of a generalized linear model (GLM), find M-estimator in robust regression and other optimization problems.
    +Refer to [Iteratively Reweighted Least Squares for Maximum Likelihood Estimation, and some Robust and Resistant Alternatives](http://www.jstor.org/stable/2345503) for more information.
    +
    +It solves certain optimization problems iteratively:
    +
    +* linearize the objective at current solution and update corresponding weight.
    +* solve a weighted least squares (WLS) problem by WeightedLeastSquares.
    +* repeat above steps until convergence.
    +
    +Since it involves solving a weighted least squares (WLS) problem by WeightedLeastSquares in each iteration,
    --- End diff --
    
    Since we mentioned that IRLS calls WLS, it's straightforward for users to refer to WLS for the reason behind this constraint.
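
As a usage-level illustration of that chain (a sketch assuming the Spark 2.0-era API, where GeneralizedLinearRegression is fit via IRLS, which in turn calls WeightedLeastSquares, so the 4096-feature constraint carries over):

```scala
import org.apache.spark.ml.regression.GeneralizedLinearRegression

val glr = new GeneralizedLinearRegression()
  .setFamily("poisson")   // GLM family and link to be fit by IRLS
  .setLink("log")
  .setSolver("irls")      // currently the only solver for this estimator
  .setMaxIter(25)
  .setTol(1e-6)
```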




[GitHub] spark pull request: [SPARK-11959] [SPARK-15484] [Doc] [ML] Documen...

Posted by BenFradet <gi...@git.apache.org>.
Github user BenFradet commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13262#discussion_r64228185
  
    --- Diff: docs/ml-advanced.md ---
    @@ -4,10 +4,85 @@ title: Advanced topics - spark.ml
     displayTitle: Advanced topics - spark.ml
     ---
     
    -# Optimization of linear methods
    +* Table of contents
    +{:toc}
    +
    +`\[
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}} 
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +\]`
    +
    +# Optimization of linear methods (developer)
    +
    +## Limited-memory BFGS (L-BFGS)
    +[L-BFGS](http://en.wikipedia.org/wiki/Limited-memory_BFGS) is an optimization 
    +algorithm in the family of quasi-Newton methods to solve the optimization problems of the form 
    +`$\min_{\wv \in\R^d} \; f(\wv)$`. The L-BFGS method approximates the objective function locally as a 
    +quadratic without evaluating the second partial derivatives of the objective function to construct the 
    +Hessian matrix. The Hessian matrix is approximated by previous gradient evaluations, so there is no 
    +vertical scalability issue (the number of training features) when computing the Hessian matrix 
    +explicitly in Newton's method. As a result, L-BFGS often achieves rapider convergence compared with 
    +other first-order optimization.
     
    -The optimization algorithm underlying the implementation is called
     [Orthant-Wise Limited-memory
     QuasiNewton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf)
    -(OWL-QN). It is an extension of L-BFGS that can effectively handle L1
    -regularization and elastic net.
    +(OWL-QN) is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
    +
    +L-BFGS was used as solver for [LinearRegression](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression),
    +[LogisticRegression](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression),
    +[AFTSurvivalRegression](api/scala/index.html#org.apache.spark.ml.regression.AFTSurvivalRegression)
    +and [MultilayerPerceptronClassifier](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier).
    +
    +`spark.ml` L-BFGS solver calls the corresponding implementation in [breeze](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala).
    +
    +## Normal equation solver for weighted least squares (normal)
    +
    +`spark.ml` implements normal equation solver for weighted least squares by [WeightedLeastSquares](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala).
    +
    +Given $n$ weighted observations $(w_i, a_i, b_i)$:
    +
    +* $w_i$ the weight of i-th observation
    +* $a_i$ the features vector of i-th observation
    +* $b_i$ the label of i-th observation
    +
    +The number of features for each observation is $m$. We use the following weighted least squares formulation:
    +`\[   
    +minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{i=1}^n w_i} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2
    +\]`
    +where $\lambda$ is the regularization parameter, $\delta$ is the population standard deviation of label
    +and $\sigma_j$ is the population standard deviation of the j-th feature column.
    +
    +This objective function has an analytic solution and it requires only one pass to collect necessary statistics to solve this function.
    +Unlike the original dataset which can only be stored in distributed system,
    +these statistics can be easily loaded into memory of a single machine, and then we can solve the objective function through Cholesky factorization on driver.
    --- End diff --
    
    On the driver

