You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by sharp-pixel <gi...@git.apache.org> on 2017/08/17 22:03:13 UTC

[GitHub] spark pull request #18980: Correct validateAndTransformSchema in GaussianMix...

GitHub user sharp-pixel opened a pull request:

    https://github.com/apache/spark/pull/18980

    Correct validateAndTransformSchema in GaussianMixture

    ## What changes were proposed in this pull request?
    
    The line SchemaUtils.appendColumn(schema, $(predictionCol), IntegerType) did not modify the variable schema, hence only the last line had any effect. A temporary variable is used to correctly append the two columns predictionCol and probabilityCol.
    
    ## How was this patch tested?
    
    Manually.
    
    Please review http://spark.apache.org/contributing.html before opening a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sharp-pixel/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18980.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18980
    
----
commit fa5e3f94801bbdeb5f2d303a79fb0ebd89733436
Author: Cédric Pelvet <ce...@gmail.com>
Date:   2017-08-17T21:48:58Z

    Correct validateAndTransformSchema

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18980: [MINOR]Correct validateAndTransformSchema in GaussianMix...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18980
  
    Merged to master/2.2/2.1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18980: Correct validateAndTransformSchema in GaussianMixture

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18980
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18980: Correct validateAndTransformSchema in GaussianMixture

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on the issue:

    https://github.com/apache/spark/pull/18980
  
    +1 @srowen, this is a bug.
    @sharp-pixel Would you mind to fix both ```GaussianMixture``` and ```AFTSurvivalRegression```? It's better to file a JIRA firstly and add some unit tests. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18980: Correct validateAndTransformSchema in GaussianMixture

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18980
  
    **[Test build #3892 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3892/testReport)** for PR 18980 at commit [`d14c21e`](https://github.com/apache/spark/commit/d14c21e14b37501f75cf39244da7d93fa66b47fc).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18980: Correct validateAndTransformSchema in GaussianMixture

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18980
  
    Yeah I think you're right. There's a similar issue in `AFTSurvivalRegressionParams`
    CC @yanboliang 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18980: Correct validateAndTransformSchema in GaussianMixture

Posted by sharp-pixel <gi...@git.apache.org>.
Github user sharp-pixel commented on the issue:

    https://github.com/apache/spark/pull/18980
  
    I just fixed AFTSurvivalRegression also.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18980: Correct validateAndTransformSchema in GaussianMix...

Posted by sharp-pixel <gi...@git.apache.org>.
Github user sharp-pixel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18980#discussion_r134088869
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
    @@ -64,8 +64,8 @@ private[clustering] trait GaussianMixtureParams extends Params with HasMaxIter w
        */
       protected def validateAndTransformSchema(schema: StructType): StructType = {
         SchemaUtils.checkColumnType(schema, $(featuresCol), new VectorUDT)
    -    SchemaUtils.appendColumn(schema, $(predictionCol), IntegerType)
    -    SchemaUtils.appendColumn(schema, $(probabilityCol), new VectorUDT)
    +    val schema_temp = SchemaUtils.appendColumn(schema, $(predictionCol), IntegerType)
    --- End diff --
    
    My bad! Just committed with more descriptive variables in camelCase. Thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18980: [MINOR]Correct validateAndTransformSchema in GaussianMix...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18980
  
    Actually, on second thought, we should make a JIRA for this and link it. It's a non-trivial bug fix. I can do that if you're busy, but feel free to. I'll merge it soon anyway


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18980: [MINOR]Correct validateAndTransformSchema in GaussianMix...

Posted by sharp-pixel <gi...@git.apache.org>.
Github user sharp-pixel commented on the issue:

    https://github.com/apache/spark/pull/18980
  
    As I am fairly new to JIRA, feel free to do that if you can. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18980: Correct validateAndTransformSchema in GaussianMixture

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18980
  
    I started tests. One last request, start the title with `[MINOR]`. See https://spark.apache.org/contributing.html


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18980: Correct validateAndTransformSchema in GaussianMix...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18980#discussion_r134087237
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
    @@ -64,8 +64,8 @@ private[clustering] trait GaussianMixtureParams extends Params with HasMaxIter w
        */
       protected def validateAndTransformSchema(schema: StructType): StructType = {
         SchemaUtils.checkColumnType(schema, $(featuresCol), new VectorUDT)
    -    SchemaUtils.appendColumn(schema, $(predictionCol), IntegerType)
    -    SchemaUtils.appendColumn(schema, $(probabilityCol), new VectorUDT)
    +    val schema_temp = SchemaUtils.appendColumn(schema, $(predictionCol), IntegerType)
    --- End diff --
    
    Tiny nit: Java/Scala use camelCase var names. Maybe call this `withPredictionCol` or something, as it's more descriptive anyway.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18980: Correct validateAndTransformSchema in GaussianMixture

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18980
  
    **[Test build #3892 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3892/testReport)** for PR 18980 at commit [`d14c21e`](https://github.com/apache/spark/commit/d14c21e14b37501f75cf39244da7d93fa66b47fc).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18980: [MINOR]Correct validateAndTransformSchema in Gaus...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18980


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org