You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by zhengruifeng <gi...@git.apache.org> on 2018/11/26 10:13:30 UTC

[GitHub] spark pull request #23144: [SPARK-26172][ML][WIP] Unify String Params' case-...

GitHub user zhengruifeng opened a pull request:

    https://github.com/apache/spark/pull/23144

    [SPARK-26172][ML][WIP] Unify String Params' case-insensitivity in ML

    ## What changes were proposed in this pull request?
    1, methods `lowerCaseInArray` and `upperCaseInArray` are created in `ParamValidators` to check case-insensitivity;
    2, methods `$$(param: Param[String])` and `%%(param: Param[String])` are created in trait Params to lower/upper the param value conveniently;
    3, in `SharedParamsCodeGen`, `handleInvalid` and `distanceMeasure` are updated to use  `lowerCaseInArray`;
    4, make string params (except colnames) in ml case-insensitive
    
    ## How was this patch tested?
    updated suites

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zhengruifeng/spark case_insensitive_params

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23144.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23144
    
----
commit 85a5fcb5d7d20b3adc97c59b726c941ef071d5a6
Author: zhengruifeng <ru...@...>
Date:   2018-11-05T10:33:34Z

    init

commit e8f0cb40a822c124d7bba03765064bba8725315b
Author: zhengruifeng <ru...@...>
Date:   2018-11-22T10:20:13Z

    fix conflict

commit 8be289ebb5e0118669514e3e2c531623ed698b4e
Author: zhengruifeng <ru...@...>
Date:   2018-11-23T03:19:22Z

    use 2418 as lowercase

commit 810d8556beef0a27bd39723c2e5733da1d546621
Author: zhengruifeng <ru...@...>
Date:   2018-11-23T03:28:32Z

    update CodeGen

commit e55244afa41e959f99c02f1afbe916fc7d1ffec3
Author: zhengruifeng <ru...@...>
Date:   2018-11-26T09:17:18Z

    update suites

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    **[Test build #99307 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99307/testReport)** for PR 23144 at commit [`27429fb`](https://github.com/apache/spark/commit/27429fb5ae95d9ee68dc0f0a769fd3412c54ebfc).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5426/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99307/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by zhengruifeng <gi...@git.apache.org>.
Github user zhengruifeng commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    @srowen  To adopt an optional `normalize` function argument, we may need to create a new class `StringParam` and add the argument into it. But this will be a breaking change, since existing string params are of type `Param[String]`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5357/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    **[Test build #99349 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99349/testReport)** for PR 23144 at commit [`f46b6b7`](https://github.com/apache/spark/commit/f46b6b7ab82e9de39f888cd40e2b1904ae4df73a).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    **[Test build #99349 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99349/testReport)** for PR 23144 at commit [`f46b6b7`](https://github.com/apache/spark/commit/f46b6b7ab82e9de39f888cd40e2b1904ae4df73a).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99274/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by zhengruifeng <gi...@git.apache.org>.
Github user zhengruifeng commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    Using an optional `normalize` function argument maybe OK, I will have a try.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by zhengruifeng <gi...@git.apache.org>.
Github user zhengruifeng commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    I am not sure about `$$` or `%%`, we can replace them with other names.
    I want to resolve the confusion of case-insensitivity, and wonder whether a new flag can do this.
    If we want to keep the return of getter identical to value passed to setter, we will need two version of 'getter', one to return the original value, the other to return the lower/upper value.
    Another issue is that, there is no `StringParam` trait, so we have to modify `Param` trait directly.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    **[Test build #99274 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99274/testReport)** for PR 23144 at commit [`e55244a`](https://github.com/apache/spark/commit/e55244afa41e959f99c02f1afbe916fc7d1ffec3).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    **[Test build #99307 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99307/testReport)** for PR 23144 at commit [`27429fb`](https://github.com/apache/spark/commit/27429fb5ae95d9ee68dc0f0a769fd3412c54ebfc).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99349/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5390/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23144
  
    **[Test build #99274 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99274/testReport)** for PR 23144 at commit [`e55244a`](https://github.com/apache/spark/commit/e55244afa41e959f99c02f1afbe916fc7d1ffec3).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org