You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by HyukjinKwon <gi...@git.apache.org> on 2016/07/03 06:20:18 UTC

[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/14035

    [SPARK-16356][ML] Add testImplicits for ML unit tests and promote toDF()

    ## What changes were proposed in this pull request?
    
    This was suggested in https://github.com/apache/spark/commit/101663f1ae222a919fc40510aa4f2bad22d1be6f#commitcomment-17114968.
    
    This PR adds `testImplicits` to `MLlibTestSparkContext` so that some implicits such as `toDF()` can be sued across ml tests.
    
    This PR also changes all the usages of `spark.createDataFrame( ... )` to `toDF()` where applicable in ml tests in Scala.
    
    ## How was this patch tested?
    
    Existing tests should work.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark minor-ml-test

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14035.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14035
    
----
commit 79453ac4806bc55dc5ac57da1fb4c706cdd0a762
Author: hyukjinkwon <gu...@gmail.com>
Date:   2016-06-28T04:25:35Z

    Promote toDF() instead of createDataFrame from a Product-type RDD

commit 902b9132a029df3f879d70e0f01c04b640a97ebc
Author: hyukjinkwon <gu...@gmail.com>
Date:   2016-06-28T04:35:49Z

    Fix indentation

commit 0df2e44c1871ce30a29878450b0d2024779a3e73
Author: hyukjinkwon <gu...@gmail.com>
Date:   2016-06-29T03:31:15Z

    Add some more tests to use toDF API

commit 4f1fc1cfdd9d3cd55ce56b852d5a4a6d6b7ea958
Author: hyukjinkwon <gu...@gmail.com>
Date:   2016-07-03T04:46:03Z

    Fetch upstream

commit 5f7f85b40709eee0eb261edd24eaaef9b7fc3783
Author: hyukjinkwon <gu...@gmail.com>
Date:   2016-07-03T05:45:43Z

    Fix some more cases

commit 52e7f1601df73dc35aac7627a6e0466b19cd8248
Author: hyukjinkwon <gu...@gmail.com>
Date:   2016-07-03T05:56:24Z

    Take out the change in SQL and consistent imports

commit 54c27d4d359a7e6ad445856e06f15e29132d582c
Author: hyukjinkwon <gu...@gmail.com>
Date:   2016-07-03T06:12:35Z

    Remove unused imports and cleanup nits

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69432798
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala ---
    @@ -55,7 +56,7 @@ class LogisticRegressionSuite
             generateMultinomialLogisticInput(coefficients, xMean, xVariance,
               addIntercept = true, nPoints, 42)
     
    -      spark.createDataFrame(sc.parallelize(testData, 4))
    +      sc.parallelize(testData, 4).toDF()
    --- End diff --
    
    It'd be nice to know what was the purpose of the explicit partition setting.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Hi @jkbradley, could you take a look for this one please? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #65758 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65758/consoleFull)** for PR 14035 at commit [`13b1a67`](https://github.com/apache/spark/commit/13b1a6751902493e458af162b222aebf879d41da).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69390117
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/ClassifierSuite.scala ---
    @@ -71,8 +71,7 @@ class ClassifierSuite extends SparkFunSuite with MLlibTestSparkContext {
     
       test("getNumClasses") {
         def getTestData(labels: Seq[Double]): DataFrame = {
    --- End diff --
    
    repeated. What about Moving it outside `test` methods?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65883/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69390189
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/RandomForestClassifierSuite.scala ---
    @@ -158,7 +159,7 @@ class RandomForestClassifierSuite
       }
     
       test("Fitting without numClasses in metadata") {
    -    val df: DataFrame = spark.createDataFrame(TreeTests.featureImportanceData(sc))
    +    val df: DataFrame = TreeTests.featureImportanceData(sc).toDF()
    --- End diff --
    
    Why is the type annotation needed here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #65881 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65881/consoleFull)** for PR 14035 at commit [`ad9d7ac`](https://github.com/apache/spark/commit/ad9d7ac49b26805eb3c975dc61bd15d2d47c34a4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #65883 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65883/consoleFull)** for PR 14035 at commit [`b60c952`](https://github.com/apache/spark/commit/b60c95279c809c59cd08ed1e1b239b3688f50b12).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Thank you for reviewing this!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Thank you!!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #63487 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63487/consoleFull)** for PR 14035 at commit [`5157d77`](https://github.com/apache/spark/commit/5157d77127541bd39517a5802618d6c1f7f6beaa).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    ping @mengxr and @yanboliang 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69390273
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/MinMaxScalerSuite.scala ---
    @@ -57,8 +58,7 @@ class MinMaxScalerSuite extends SparkFunSuite with MLlibTestSparkContext with De
     
       test("MinMaxScaler arguments max must be larger than min") {
         withClue("arguments max must be larger than min") {
    -      val dummyDF = spark.createDataFrame(Seq(
    -        (1, Vectors.dense(1.0, 2.0)))).toDF("id", "feature")
    +      val dummyDF = Seq((1, Vectors.dense(1.0, 2.0))).toDF("id", "feature")
    --- End diff --
    
    It's just a column name, but for consistency...`features` (not `feature`) (unless there's a reason for this)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Thanks @yanboliang  and @jaceklaskowski . I addressed comments except for few comments I am not too sure of and I think are not related changes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Gentle ping @mengxr and @yanboliang


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #65758 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65758/consoleFull)** for PR 14035 at commit [`13b1a67`](https://github.com/apache/spark/commit/13b1a6751902493e458af162b222aebf879d41da).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69390144
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala ---
    @@ -55,7 +56,7 @@ class LogisticRegressionSuite
             generateMultinomialLogisticInput(coefficients, xMean, xVariance,
               addIntercept = true, nPoints, 42)
     
    -      spark.createDataFrame(sc.parallelize(testData, 4))
    +      sc.parallelize(testData, 4).toDF()
    --- End diff --
    
    `testData.toDF.repartition(4)`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63487/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69400558
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/RandomForestClassifierSuite.scala ---
    @@ -158,7 +159,7 @@ class RandomForestClassifierSuite
       }
     
       test("Fitting without numClasses in metadata") {
    -    val df: DataFrame = spark.createDataFrame(TreeTests.featureImportanceData(sc))
    +    val df: DataFrame = TreeTests.featureImportanceData(sc).toDF()
    --- End diff --
    
    I also agree with this but actually it seems both are fine assuming from this discussion, https://github.com/apache/spark/pull/12452


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r80376695
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/evaluation/RegressionEvaluatorSuite.scala ---
    @@ -42,9 +43,10 @@ class RegressionEvaluatorSuite
          * data.map(x=> x.label + ", " + x.features(0) + ", " + x.features(1))
          *   .saveAsTextFile("path")
          */
    -    val dataset = spark.createDataFrame(
    -      sc.parallelize(LinearDataGenerator.generateLinearInput(
    -        6.3, Array(4.7, 7.2), Array(0.9, -1.3), Array(0.7, 1.2), 100, 42, 0.1), 2).map(_.asML))
    +    val dataset = sc.parallelize(
    --- End diff --
    
    +1 @jaceklaskowski 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69390204
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/evaluation/RegressionEvaluatorSuite.scala ---
    @@ -42,9 +43,10 @@ class RegressionEvaluatorSuite
          * data.map(x=> x.label + ", " + x.features(0) + ", " + x.features(1))
          *   .saveAsTextFile("path")
          */
    -    val dataset = spark.createDataFrame(
    -      sc.parallelize(LinearDataGenerator.generateLinearInput(
    -        6.3, Array(4.7, 7.2), Array(0.9, -1.3), Array(0.7, 1.2), 100, 42, 0.1), 2).map(_.asML))
    +    val dataset = sc.parallelize(
    --- End diff --
    
    Remove `sc.parallelize`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r80376849
  
    --- Diff: mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala ---
    @@ -282,9 +281,7 @@ class MLUtilsSuite extends SparkFunSuite with MLlibTestSparkContext {
         val z = Vectors.dense(4.0).asML
         val p = (5.0, z)
         val w = Vectors.dense(6.0)
    -    val df = spark.createDataFrame(Seq(
    -      (0, x, y, p, w)
    -    )).toDF("id", "x", "y", "p", "w")
    +    val df = Seq((0, x, y, p, w)).toDF("id", "x", "y", "p", "w")
           .withColumn("x", col("x"), metadata)
    --- End diff --
    
    We are more prefer to use ```col("x")``` for DataFrame operation.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69390147
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala ---
    @@ -869,8 +870,7 @@ class LogisticRegressionSuite
             }
           }
     
    -      (spark.createDataFrame(sc.parallelize(data1, 4)),
    -        spark.createDataFrame(sc.parallelize(data2, 4)))
    +      (sc.parallelize(data1, 4).toDF(), sc.parallelize(data2, 4).toDF())
    --- End diff --
    
    Same as above


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61679/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69391151
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/GeneralizedLinearRegressionSuite.scala ---
    @@ -52,23 +53,20 @@ class GeneralizedLinearRegressionSuite
     
         import GeneralizedLinearRegressionSuite._
     
    -    datasetGaussianIdentity = spark.createDataFrame(
    -      sc.parallelize(generateGeneralizedLinearRegressionInput(
    -        intercept = 2.5, coefficients = Array(2.2, 0.6), xMean = Array(2.9, 10.5),
    -        xVariance = Array(0.7, 1.2), nPoints = 10000, seed, noiseLevel = 0.01,
    -        family = "gaussian", link = "identity"), 2))
    +    datasetGaussianIdentity = sc.parallelize(generateGeneralizedLinearRegressionInput(
    --- End diff --
    
    Why is this `sc.parallelize` needed here? Why are `2` partitions used?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69390185
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
    @@ -55,7 +56,7 @@ class OneVsRestSuite extends SparkFunSuite with MLlibTestSparkContext with Defau
         val xVariance = Array(0.6856, 0.1899, 3.116, 0.581)
         rdd = sc.parallelize(generateMultinomialLogisticInput(
           coefficients, xMean, xVariance, true, nPoints, 42), 2)
    -    dataset = spark.createDataFrame(rdd)
    +    dataset = rdd.toDF()
    --- End diff --
    
    Merge it with line 57.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69390423
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/OneHotEncoderSuite.scala ---
    @@ -29,10 +29,11 @@ import org.apache.spark.sql.types._
     
     class OneHotEncoderSuite
       extends SparkFunSuite with MLlibTestSparkContext with DefaultReadWriteTest {
    +  import testImplicits._
     
       def stringIndexed(): DataFrame = {
         val data = sc.parallelize(Seq((0, "a"), (1, "b"), (2, "c"), (3, "a"), (4, "a"), (5, "c")), 2)
    --- End diff --
    
    Remove `sc.parallelize`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r80449593
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/ChiSqSelectorSuite.scala ---
    @@ -29,8 +29,7 @@ class ChiSqSelectorSuite extends SparkFunSuite with MLlibTestSparkContext
       with DefaultReadWriteTest {
     
       test("Test Chi-Square selector") {
    -    val spark = this.spark
    -    import spark.implicits._
    +    import testImplicits._
    --- End diff --
    
    Nit: Actually it should be moved out of this test function and can be shared between all test cases if necessary.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #61679 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61679/consoleFull)** for PR 14035 at commit [`54c27d4`](https://github.com/apache/spark/commit/54c27d4d359a7e6ad445856e06f15e29132d582c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69400523
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifierSuite.scala ---
    @@ -116,7 +117,7 @@ class MultilayerPerceptronClassifierSuite
         // the input seed is somewhat magic, to make this test pass
         val rdd = sc.parallelize(generateMultinomialLogisticInput(
           coefficients, xMean, xVariance, true, nPoints, 1), 2)
    -    val dataFrame = spark.createDataFrame(rdd).toDF("label", "features")
    +    val dataFrame = rdd.toDF("label", "features")
    --- End diff --
    
    Again, I also agree with this but I am hesitated to change this because it is explicitly set.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #63487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63487/consoleFull)** for PR 14035 at commit [`5157d77`](https://github.com/apache/spark/commit/5157d77127541bd39517a5802618d6c1f7f6beaa).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64632/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69390216
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/CountVectorizerSuite.scala ---
    @@ -44,7 +45,7 @@ class CountVectorizerSuite extends SparkFunSuite with MLlibTestSparkContext
           (3, split(""), Vectors.sparse(4, Seq())), // empty string
    --- End diff --
    
    Replace the comment `// empty string` with `val EMPTY_STRING = ""`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/14035


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Hi @mengxr, @yanboliang and @jkbradley, if these changes are so big, I can just leave `testImplicits` and let others fix them later without sweeping. Could you please take a look?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69390388
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/NormalizerSuite.scala ---
    @@ -61,7 +62,7 @@ class NormalizerSuite extends SparkFunSuite with MLlibTestSparkContext with Defa
           Vectors.sparse(3, Seq())
         )
     
    -    dataFrame = spark.createDataFrame(sc.parallelize(data, 2).map(NormalizerSuite.FeatureData))
    +    dataFrame = sc.parallelize(data, 2).map(NormalizerSuite.FeatureData).toDF()
    --- End diff --
    
    Remove `sc.parallelize`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69391176
  
    --- Diff: mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala ---
    @@ -282,9 +281,7 @@ class MLUtilsSuite extends SparkFunSuite with MLlibTestSparkContext {
         val z = Vectors.dense(4.0).asML
         val p = (5.0, z)
         val w = Vectors.dense(6.0)
    -    val df = spark.createDataFrame(Seq(
    -      (0, x, y, p, w)
    -    )).toDF("id", "x", "y", "p", "w")
    +    val df = Seq((0, x, y, p, w)).toDF("id", "x", "y", "p", "w")
           .withColumn("x", col("x"), metadata)
    --- End diff --
    
    Replace `col("x")` with `$"x"` or (better) `'x`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #65883 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65883/consoleFull)** for PR 14035 at commit [`b60c952`](https://github.com/apache/spark/commit/b60c95279c809c59cd08ed1e1b239b3688f50b12).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    ping @mengxr and @yanboliang 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #65757 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65757/consoleFull)** for PR 14035 at commit [`2cbcabd`](https://github.com/apache/spark/commit/2cbcabdcef32280316db1ede1a22934dacf3cf35).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Sorry I'm seeing this so late, but thank you all for the PR & reviews!
    
    @jaceklaskowski  Regarding the explicit partitioning in unit tests, that's historical: In the past, we had run into some bugs which only showed up with multiple partitions, so we got in the habit of using multiple partitions.  I still think it's a nice idea in general, though perhaps there are fewer such bugs nowadays.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69400465
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala ---
    @@ -55,7 +56,7 @@ class LogisticRegressionSuite
             generateMultinomialLogisticInput(coefficients, xMean, xVariance,
               addIntercept = true, nPoints, 42)
     
    -      spark.createDataFrame(sc.parallelize(testData, 4))
    +      sc.parallelize(testData, 4).toDF()
    --- End diff --
    
    I guess, to be strict, `sc.parallelize(testData, 4).toDF()` and `testData.toDF.repartition(4)` would not be exactly the same. It seems the author of this test code intended to explicitly set the initial number of partitions to `4` and I left as it is although I think as you said because I am not 100% sure and it is not the part of this issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65757/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69391139
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorIndexerSuite.scala ---
    @@ -102,7 +103,7 @@ class VectorIndexerSuite extends SparkFunSuite with MLlibTestSparkContext
       }
     
       test("Cannot fit an empty DataFrame") {
    -    val rdd = spark.createDataFrame(sc.parallelize(Array.empty[Vector], 2).map(FeatureData))
    +    val rdd = sc.parallelize(Array.empty[Vector], 2).map(FeatureData).toDF()
    --- End diff --
    
    Do you need `sc.parallelize`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69390173
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifierSuite.scala ---
    @@ -116,7 +117,7 @@ class MultilayerPerceptronClassifierSuite
         // the input seed is somewhat magic, to make this test pass
         val rdd = sc.parallelize(generateMultinomialLogisticInput(
           coefficients, xMean, xVariance, true, nPoints, 1), 2)
    -    val dataFrame = spark.createDataFrame(rdd).toDF("label", "features")
    +    val dataFrame = rdd.toDF("label", "features")
    --- End diff --
    
    Could we merge this line with 118? I don't think 118 needs `sc.parallelize`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r80376739
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/MinMaxScalerSuite.scala ---
    @@ -57,8 +58,7 @@ class MinMaxScalerSuite extends SparkFunSuite with MLlibTestSparkContext with De
     
       test("MinMaxScaler arguments max must be larger than min") {
         withClue("arguments max must be larger than min") {
    -      val dummyDF = spark.createDataFrame(Seq(
    -        (1, Vectors.dense(1.0, 2.0)))).toDF("id", "feature")
    +      val dummyDF = Seq((1, Vectors.dense(1.0, 2.0))).toDF("id", "feature")
    --- End diff --
    
    +1 @jaceklaskowski 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69390178
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/NaiveBayesSuite.scala ---
    @@ -47,7 +48,7 @@ class NaiveBayesSuite extends SparkFunSuite with MLlibTestSparkContext with Defa
           Array(0.10, 0.10, 0.70, 0.10)  // label 2
         ).map(_.map(math.log))
     
    -    dataset = spark.createDataFrame(generateNaiveBayesInput(pi, theta, 100, 42))
    +    dataset = generateNaiveBayesInput(pi, theta, 100, 42).toDF()
    --- End diff --
    
    Exactly my point above :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    hm.. I can close if it looks inappropriate or it seems making a lot of conflicts across PRs. Could you give some feedback please @mengxr and @yanboliang ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #61679 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61679/consoleFull)** for PR 14035 at commit [`54c27d4`](https://github.com/apache/spark/commit/54c27d4d359a7e6ad445856e06f15e29132d582c).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Hi @mengxr, is this the change you meant?  Could you please take a look?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61681/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #61681 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61681/consoleFull)** for PR 14035 at commit [`6fdf290`](https://github.com/apache/spark/commit/6fdf290d682b190f6b0457bb084cebacd105ee00).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r80379623
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
    @@ -55,7 +56,7 @@ class OneVsRestSuite extends SparkFunSuite with MLlibTestSparkContext with Defau
         val xVariance = Array(0.6856, 0.1899, 3.116, 0.581)
         rdd = sc.parallelize(generateMultinomialLogisticInput(
           coefficients, xMean, xVariance, true, nPoints, 42), 2)
    -    dataset = spark.createDataFrame(rdd)
    +    dataset = rdd.toDF()
    --- End diff --
    
    It seems the `rdd` is being used in the tests.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Sorry for late response. I like this change and will have a look soon. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    LGTM, merged into master. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    cc @mengxr, @yanboliang and @jaceklaskowski 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #61681 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61681/consoleFull)** for PR 14035 at commit [`6fdf290`](https://github.com/apache/spark/commit/6fdf290d682b190f6b0457bb084cebacd105ee00).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #65757 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65757/consoleFull)** for PR 14035 at commit [`2cbcabd`](https://github.com/apache/spark/commit/2cbcabdcef32280316db1ede1a22934dacf3cf35).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65758/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69391120
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala ---
    @@ -39,7 +40,7 @@ class StringIndexerSuite
     
       test("StringIndexer") {
         val data = sc.parallelize(Seq((0, "a"), (1, "b"), (2, "c"), (3, "a"), (4, "a"), (5, "c")), 2)
    --- End diff --
    
    Could you remove `sc.parallelize`, too?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65881/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #64632 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64632/consoleFull)** for PR 14035 at commit [`f2990b1`](https://github.com/apache/spark/commit/f2990b1b8240a43a7a5fcee5b39ada05b432aef1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r80380667
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorIndexerSuite.scala ---
    @@ -85,11 +87,13 @@ class VectorIndexerSuite extends SparkFunSuite with MLlibTestSparkContext
         checkPair(densePoints1Seq, sparsePoints1Seq)
         checkPair(densePoints2Seq, sparsePoints2Seq)
     
    -    densePoints1 = spark.createDataFrame(sc.parallelize(densePoints1Seq, 2).map(FeatureData))
    -    sparsePoints1 = spark.createDataFrame(sc.parallelize(sparsePoints1Seq, 2).map(FeatureData))
    -    densePoints2 = spark.createDataFrame(sc.parallelize(densePoints2Seq, 2).map(FeatureData))
    -    sparsePoints2 = spark.createDataFrame(sc.parallelize(sparsePoints2Seq, 2).map(FeatureData))
    -    badPoints = spark.createDataFrame(sc.parallelize(badPointsSeq, 2).map(FeatureData))
    +    densePoints1 = densePoints1Seq.map(FeatureData).toDF()
    +    sparsePoints1 = sparsePoints1Seq.map(FeatureData).toDF()
    +    // TODO: If we directly use `toDF` without parallelize, the test in
    +    // "Throws error when given RDDs with different size vectors" is failed for an unknown reason.
    +    densePoints2 = sc.parallelize(densePoints2Seq, 2).map(FeatureData).toDF()
    --- End diff --
    
    BTW, It seems a test is failed when I change this to `densePoints2Seq.map(FeatureData).toDF()` for an unknown reason.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r80376680
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifierSuite.scala ---
    @@ -116,7 +117,7 @@ class MultilayerPerceptronClassifierSuite
         // the input seed is somewhat magic, to make this test pass
         val rdd = sc.parallelize(generateMultinomialLogisticInput(
           coefficients, xMean, xVariance, true, nPoints, 1), 2)
    -    val dataFrame = spark.createDataFrame(rdd).toDF("label", "features")
    +    val dataFrame = rdd.toDF("label", "features")
    --- End diff --
    
    +1 @jaceklaskowski 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    @HyukjinKwon I have made a pass and this PR look good overall. Could you double check whether all ML test cases are covered? Since I found we used implicit import of different style at [```ChiSqSelectorSuite```](https://github.com/apache/spark/blob/master/mllib/src/test/scala/org/apache/spark/ml/feature/ChiSqSelectorSuite.scala#L33), it's better we can unify them. Then I'd like to get this in. Thanks for working on this. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #64632 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64632/consoleFull)** for PR 14035 at commit [`f2990b1`](https://github.com/apache/spark/commit/f2990b1b8240a43a7a5fcee5b39ada05b432aef1).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    **[Test build #65881 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65881/consoleFull)** for PR 14035 at commit [`ad9d7ac`](https://github.com/apache/spark/commit/ad9d7ac49b26805eb3c975dc61bd15d2d47c34a4).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14035
  
    @mengxr, @yanboliang, Could you review this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14035: [SPARK-16356][ML] Add testImplicits for ML unit t...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14035#discussion_r69390132
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/GBTClassifierSuite.scala ---
    @@ -134,15 +135,14 @@ class GBTClassifierSuite extends SparkFunSuite with MLlibTestSparkContext
       */
     
       test("Fitting without numClasses in metadata") {
    -    val df: DataFrame = spark.createDataFrame(TreeTests.featureImportanceData(sc))
    +    val df: DataFrame = TreeTests.featureImportanceData(sc).toDF()
         val gbt = new GBTClassifier().setMaxDepth(1).setMaxIter(1)
         gbt.fit(df)
    --- End diff --
    
    Wonder why this line is separate not part of 139? Any reason?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org