Posted to reviews@spark.apache.org by actuaryzhang <gi...@git.apache.org> on 2017/05/26 16:45:19 UTC

[GitHub] spark pull request #18122: [SPARK-20899][PySpark] PySpark supports stringInd...

GitHub user actuaryzhang opened a pull request:

    https://github.com/apache/spark/pull/18122

    [SPARK-20899][PySpark] PySpark supports stringIndexerOrderType in RFormula

    ## What changes were proposed in this pull request?
    
    Add support for `stringIndexerOrderType` to RFormula in PySpark, as in #17967.
    
    ## How was this patch tested?
    docstring test
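
For readers unfamiliar with the parameter, here is a minimal pure-Python sketch of what the order types mean. It is not PySpark code. It assumes the four option names documented for Spark's StringIndexer (`frequencyDesc`, `frequencyAsc`, `alphabetDesc`, `alphabetAsc`). The helper name `order_categories` and the alphabetical tie-break for equal frequencies are assumptions made only for illustration.

```python
from collections import Counter


def order_categories(values, order_type="frequencyDesc"):
    """Return the distinct categories of `values` in the requested order.

    Hypothetical helper for illustration only; it is not part of PySpark.
    """
    counts = Counter(values)
    if order_type == "alphabetAsc":
        return sorted(counts)
    if order_type == "alphabetDesc":
        return sorted(counts, reverse=True)
    if order_type == "frequencyDesc":
        # Assumption: ties are broken alphabetically to keep the sketch deterministic.
        return sorted(counts, key=lambda c: (-counts[c], c))
    if order_type == "frequencyAsc":
        # Same assumed alphabetical tie-break as above.
        return sorted(counts, key=lambda c: (counts[c], c))
    raise ValueError("unsupported order type: %s" % order_type)


s_column = ["a", "b", "a"]
print(order_categories(s_column, "frequencyDesc"))
print(order_categories(s_column, "alphabetDesc"))
```

The position of a category in the returned list plays the role of its index, so changing the order type changes which category ends up last.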

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/actuaryzhang/spark PythonRFormula

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18122.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18122
    
----
commit 4bca4d95613e6e18361de8fe0a36667182c2d446
Author: actuaryzhang <ac...@gmail.com>
Date:   2017-05-26T07:40:22Z

    Pyhton port for Rformula stringIndexerOrderType

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by actuaryzhang <gi...@git.apache.org>.
Github user actuaryzhang commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    @yanboliang I have moved the tests to the test file. Please let me know if there is anything else needed. Thanks. 


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    LGTM


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77537/
    Test PASSed.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    **[Test build #77506 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77506/testReport)** for PR 18122 at commit [`3510e24`](https://github.com/apache/spark/commit/3510e24379a26551edd7abf2bf8f3fb08ec42aba).
     * This patch **fails Python style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    **[Test build #77508 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77508/testReport)** for PR 18122 at commit [`320203e`](https://github.com/apache/spark/commit/320203eeea6d7613bb091f01b170fbfa2805b2a0).


[GitHub] spark pull request #18122: [SPARK-20899][PySpark] PySpark supports stringInd...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18122#discussion_r119146702
  
    --- Diff: python/pyspark/ml/tests.py ---
    @@ -538,6 +538,19 @@ def test_rformula_force_index_label(self):
             transformedDF2 = model2.transform(df)
             self.assertEqual(transformedDF2.head().label, 0.0)
     
    +    def test_rformula_string_indexer_order_type(self):
    +        df = self.spark.createDataFrame([
    +            (1.0, 1.0, "a"),
    +            (0.0, 2.0, "b"),
    +            (1.0, 0.0, "a")], ["y", "x", "s"])
    +        rf = RFormula(formula="y ~ x + s", stringIndexerOrderType="alphabetDesc")
    +        self.assertEqual(rf.getStringIndexerOrderType(), 'alphabetDesc')
    +        transformedDF = rf.fit(df).transform(df)
    +        observed = transformedDF.select("features").collect()
    +        expected = [[1.0, 0.0], [2.0, 1.0], [0.0, 0.0]]
    +        for i in range(0, len(expected)):
    +            self.assertTrue((observed[i]["features"].toArray() == expected[i]).all())
    --- End diff --
    
    Minor: we usually prefer ```self.assertTrue(all(observed[i]["features"].toArray() == expected[i]))```.
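
The expected values in the test above can be traced by hand. The sketch below is plain Python, not PySpark. It assumes, as described in Spark's RFormula documentation, that the string categories are ordered, the last category after ordering is dropped, and each remaining category becomes a 0/1 column. The function name `encode_rows` is hypothetical.

```python
def encode_rows(rows):
    """Build [x, dummy columns for s] using descending alphabetical category order.

    Hypothetical helper for illustration only; it is not part of PySpark.
    """
    # "alphabetDesc": distinct categories sorted in descending alphabetical order.
    categories = sorted({s for _, s in rows}, reverse=True)
    # Assumption from the RFormula docs: the last category after ordering is dropped.
    kept = categories[:-1]
    return [[x] + [1.0 if s == c else 0.0 for c in kept] for x, s in rows]


# The (x, s) columns of the DataFrame built in the test.
rows = [(1.0, "a"), (2.0, "b"), (0.0, "a")]
print(encode_rows(rows))
```

With categories ordered as `b`, `a`, the dropped category is `a`, so the single dummy column is 1.0 only for the row where `s` is `b`.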


[GitHub] spark pull request #18122: [SPARK-20899][PySpark] PySpark supports stringInd...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18122


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    **[Test build #77537 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77537/testReport)** for PR 18122 at commit [`2e854a8`](https://github.com/apache/spark/commit/2e854a88ff83d8533225240c2394db8498fbfe25).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77428/
    Test PASSed.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request #18122: [SPARK-20899][PySpark] PySpark supports stringInd...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18122#discussion_r118780954
  
    --- Diff: python/pyspark/ml/feature.py ---
    @@ -3043,26 +3055,35 @@ class RFormula(JavaEstimator, HasFeaturesCol, HasLabelCol, JavaMLReadable, JavaM
                                 "Force to index label whether it is numeric or string",
                                 typeConverter=TypeConverters.toBoolean)
     
    +    stringIndexerOrderType = Param(Params._dummy(), "stringIndexerOrderType",
    +                                   "How to order categories of a string FEATURE column used by " +
    --- End diff --
    
    Is capitalizing FEATURE common here?


[GitHub] spark pull request #18122: [SPARK-20899][PySpark] PySpark supports stringInd...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18122#discussion_r118844255
  
    --- Diff: python/pyspark/ml/feature.py ---
    @@ -3032,6 +3032,18 @@ class RFormula(JavaEstimator, HasFeaturesCol, HasLabelCol, JavaMLReadable, JavaM
         ...
         >>> str(loadedModel)
         'RFormulaModel(ResolvedRFormula(label=y, terms=[x,s], hasIntercept=true)) (uid=...)'
    +    >>> rf = RFormula(formula="y ~ x + s", stringIndexerOrderType="alphabetDesc")
    +    >>> rf.getStringIndexerOrderType()
    +    'alphabetDesc'
    +    >>> rf.fit(df).transform(df).show()
    +    +---+---+---+---------+-----+
    +    |  y|  x|  s| features|label|
    +    +---+---+---+---------+-----+
    +    |1.0|1.0|  a|[1.0,0.0]|  1.0|
    +    |0.0|2.0|  b|[2.0,1.0]|  0.0|
    +    |0.0|0.0|  a|(2,[],[])|  0.0|
    +    +---+---+---+---------+-----+
    +    ...
    --- End diff --
    
    Could you move the newly added test to ```tests.py```? We keep only the basic doc tests here, since they serve as both tests and examples; other tests should be placed in ```tests.py```. Thanks.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    **[Test build #77428 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77428/testReport)** for PR 18122 at commit [`4bca4d9`](https://github.com/apache/spark/commit/4bca4d95613e6e18361de8fe0a36667182c2d446).


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    **[Test build #77509 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77509/testReport)** for PR 18122 at commit [`4af4b35`](https://github.com/apache/spark/commit/4af4b3500de27acb0128763be755ea8078736d60).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    **[Test build #77509 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77509/testReport)** for PR 18122 at commit [`4af4b35`](https://github.com/apache/spark/commit/4af4b3500de27acb0128763be755ea8078736d60).


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77440/
    Test PASSed.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77509/
    Test PASSed.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    **[Test build #77428 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77428/testReport)** for PR 18122 at commit [`4bca4d9`](https://github.com/apache/spark/commit/4bca4d95613e6e18361de8fe0a36667182c2d446).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77508/
    Test FAILed.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    Merged build finished. Test FAILed.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77506/
    Test FAILed.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    **[Test build #77506 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77506/testReport)** for PR 18122 at commit [`3510e24`](https://github.com/apache/spark/commit/3510e24379a26551edd7abf2bf8f3fb08ec42aba).


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    **[Test build #77508 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77508/testReport)** for PR 18122 at commit [`320203e`](https://github.com/apache/spark/commit/320203eeea6d7613bb091f01b170fbfa2805b2a0).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class SparkMLTests(ReusedPySparkTestCase):`


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    **[Test build #77440 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77440/testReport)** for PR 18122 at commit [`c3f4430`](https://github.com/apache/spark/commit/c3f44303636654232347af38841e5347a63a860f).


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    **[Test build #77537 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77537/testReport)** for PR 18122 at commit [`2e854a8`](https://github.com/apache/spark/commit/2e854a88ff83d8533225240c2394db8498fbfe25).


[GitHub] spark pull request #18122: [SPARK-20899][PySpark] PySpark supports stringInd...

Posted by actuaryzhang <gi...@git.apache.org>.
Github user actuaryzhang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18122#discussion_r118796569
  
    --- Diff: python/pyspark/ml/feature.py ---
    @@ -3043,26 +3055,35 @@ class RFormula(JavaEstimator, HasFeaturesCol, HasLabelCol, JavaMLReadable, JavaM
                                 "Force to index label whether it is numeric or string",
                                 typeConverter=TypeConverters.toBoolean)
     
    +    stringIndexerOrderType = Param(Params._dummy(), "stringIndexerOrderType",
    +                                   "How to order categories of a string FEATURE column used by " +
    --- End diff --
    
    Changed it to lower case now. 


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    Merged into master. Thanks, all.


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by actuaryzhang <gi...@git.apache.org>.
Github user actuaryzhang commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    @felixcheung @yanboliang @viirya


[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18122
  
    **[Test build #77440 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77440/testReport)** for PR 18122 at commit [`c3f4430`](https://github.com/apache/spark/commit/c3f44303636654232347af38841e5347a63a860f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.

