You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by tillrohrmann <gi...@git.apache.org> on 2015/06/02 16:59:25 UTC

[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/760

    [FLINK-1993] [ml] Replaces custom SGD in MultipleLinearRegression with optimizer's SGD

    This PR replaces the custom SGD implementation with the optimization framework's SGD implementation.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink replaceSGDInMLR

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/760.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #760
    
----
commit 2df0c700237af0456c3770c198ed595fdf205408
Author: Till Rohrmann <tr...@apache.org>
Date:   2015-05-29T16:02:47Z

    [FLINK-1993] [ml] Replaces custom SGD logic with optimization framework's SGD in MultipleLinearRegression
    
    Fixes PipelineITSuite because of change MLR loss function

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/760#discussion_r31632765
  
    --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/pipeline/Predictor.scala ---
    @@ -35,7 +35,7 @@ import org.apache.flink.ml.common.{FlinkMLTools, ParameterMap, WithParameters}
       *
       * @tparam Self Type of the implementing class
       */
    -trait Predictor[Self] extends Estimator[Self] with WithParameters with Serializable {
    +trait Predictor[Self] extends Estimator[Self] with WithParameters {
    --- End diff --
    
    The `Estimator` types should never be shipped to the TaskManager. Usually they contain some state in the form of other `DataSets` which aren't serializable anyways. However, the operations have to be `Serializable`, because they can contain values which are referenced by the Flink operations.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

Posted by thvasilo <gi...@git.apache.org>.
Github user thvasilo commented on a diff in the pull request:

    https://github.com/apache/flink/pull/760#discussion_r31604529
  
    --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/regression/MultipleLinearRegression.scala ---
    @@ -87,11 +89,11 @@ import org.apache.flink.ml.pipeline.{FitOperation, PredictOperation, Predictor}
       *
       */
     class MultipleLinearRegression extends Predictor[MultipleLinearRegression] {
    -
    +  import org.apache.flink.ml._
    --- End diff --
    
    Line 49 typo: iteratinos -> iterations


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

Posted by thvasilo <gi...@git.apache.org>.
Github user thvasilo commented on the pull request:

    https://github.com/apache/flink/pull/760#issuecomment-108254611
  
    Looks good, some minor comments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

Posted by thvasilo <gi...@git.apache.org>.
Github user thvasilo commented on a diff in the pull request:

    https://github.com/apache/flink/pull/760#discussion_r31604864
  
    --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/regression/MultipleLinearRegression.scala ---
    @@ -309,8 +207,10 @@ object MultipleLinearRegression {
           : DataSet[LabeledVector] = {
    --- End diff --
    
    Docstring for the return type?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/760


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

Posted by thvasilo <gi...@git.apache.org>.
Github user thvasilo commented on a diff in the pull request:

    https://github.com/apache/flink/pull/760#discussion_r31604106
  
    --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/pipeline/Predictor.scala ---
    @@ -35,7 +35,7 @@ import org.apache.flink.ml.common.{FlinkMLTools, ParameterMap, WithParameters}
       *
       * @tparam Self Type of the implementing class
       */
    -trait Predictor[Self] extends Estimator[Self] with WithParameters with Serializable {
    +trait Predictor[Self] extends Estimator[Self] with WithParameters {
    --- End diff --
    
    Why is Serializable no longer needed?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/760#discussion_r31633196
  
    --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/regression/MultipleLinearRegression.scala ---
    @@ -309,8 +207,10 @@ object MultipleLinearRegression {
           : DataSet[LabeledVector] = {
    --- End diff --
    
    Good point. Will add it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/760#issuecomment-108479856
  
    Thanks for reviewing @thvasilo. I addressed your comments and updated the PR. When Travis gives green light, then I'll merge this PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/760#discussion_r31632977
  
    --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/regression/MultipleLinearRegression.scala ---
    @@ -87,11 +89,11 @@ import org.apache.flink.ml.pipeline.{FitOperation, PredictOperation, Predictor}
       *
       */
     class MultipleLinearRegression extends Predictor[MultipleLinearRegression] {
    -
    +  import org.apache.flink.ml._
    --- End diff --
    
    Good catch :-)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---