Posted to reviews@spark.apache.org by mengxr <gi...@git.apache.org> on 2016/05/02 16:08:35 UTC

[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

GitHub user mengxr opened a pull request:

    https://github.com/apache/spark/pull/12843

    [SPARK-14050] [ML] Add multiple languages support and additional methods for Stop Words Remover

    ## What changes were proposed in this pull request?
    
    This PR continues the work from #11871 with the following changes:
    * load English stopwords as default
    * convert stopwords to list in Python
    * update some tests and doc
    
    ## How was this patch tested?
    
    Unit tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mengxr/spark SPARK-14050

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/12843.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #12843
    
----
commit c126c87818eb06aa5c2ac23b362d504f342c72b0
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-14T22:22:02Z

    add language files

commit 8248579ec27a40de98fe1f3020d947c478981ebc
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-14T22:23:32Z

    add multi-language support for stop words

commit 2c7b73df14d2d292eff88d7f3c358d29f82f6122
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-14T22:24:41Z

    add new tests for StopWordsRemover

commit 43e5cf54d4f9583f8b90291b3c7603ac4e7fab2a
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-21T23:41:47Z

    adjust resource files

commit a43039223a28b308ae1c14d33be5e5a1df382ed6
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-21T23:43:15Z

    adjust resource files

commit 28ee249f676971371d11d16c2912bbf81e045269
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-21T23:46:42Z

    fix stopwords bug

commit 6d215b31a205c4a79e8cc0ef6963d239941e80ff
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-21T23:53:06Z

    update comment lines

commit 6deceecf88c66b3293698aca5d7306c2aa02e2e0
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-22T16:24:38Z

    update stop words list

commit 41cd25815af3baa8fe9ed9336812f436d7ed7bd5
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-22T16:25:36Z

    update stopwordsremover

commit 4d1812aae64b0b15312940b1a6c42e19f9686480
Author: Burak KOSE <bu...@gmail.com>
Date:   2016-03-22T17:35:37Z

    fix test case bug
    
    After updating English stop words list, "d" is a stop word.

commit a30862231c3944c55c96cc94e162f61614aee6d5
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-22T21:45:48Z

    fix encoding

commit 2e7c54e5c17e7c5672a43ffc28acb207e94bf28a
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-23T01:42:36Z

    fix pyspark test

commit 7efda40e39663deef0b0884a7bfca13b5d10d706
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-23T16:51:48Z

    add licence for stop words list

commit a066e8b34ec4824fa26a1e306e197b66400f5ccb
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-24T17:12:20Z

    change licence to license

commit d0f43ace892332dfb3ad25d0ef1d0c0451540e5c
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-25T16:23:37Z

    add readme for stopwords list

commit c017ee235287554e28281d1691d0188e358b7ad8
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-25T16:26:23Z

    merge StopWords into StopWordsRemover

commit 55191ce1f449bed55884a4481071b0fc5ee776a9
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-25T16:27:59Z

    add python stopwords support for language selection

commit 789342f2d26759db180868a9f59b02c8f85cc835
Author: Burak Köse <bu...@gmail.com>
Date:   2016-03-25T16:28:48Z

    add new tests for stopwords

commit 4f97c8d5a088595a23f7ec848c793d05fc052d79
Author: Xiangrui Meng <me...@databricks.com>
Date:   2016-05-02T15:26:29Z

    Merge remote-tracking branch 'apache/master' into SPARK-14050

commit 713d4d5e81b2194efa640ec46fa16c56049c00f5
Author: Xiangrui Meng <me...@databricks.com>
Date:   2016-05-02T15:51:31Z

    minor updates

commit 1bd69af46f43d25518f6c5e01e2ee7fc5c279a03
Author: Xiangrui Meng <me...@databricks.com>
Date:   2016-05-02T16:05:52Z

    fix python tests and add a TODO

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-216941527
  
    **[Test build #57769 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57769/consoleFull)** for PR 12843 at commit [`e2d0aba`](https://github.com/apache/spark/commit/e2d0aba512fb2160656ce716a4f042b9a5dca032).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-216313085
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57536/
    Test PASSed.




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-217062799
  
    LGTM pending tests




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-216305741
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-216912795
  
    **[Test build #57769 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57769/consoleFull)** for PR 12843 at commit [`e2d0aba`](https://github.com/apache/spark/commit/e2d0aba512fb2160656ce716a4f042b9a5dca032).




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-216941751
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57769/
    Test FAILed.




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-216281191
  
    **[Test build #57536 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57536/consoleFull)** for PR 12843 at commit [`9f488fb`](https://github.com/apache/spark/commit/9f488fb606315be627ce6e93a15e7a8eda70467f).




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-217557000
  
    Merged into master and branch-2.0.




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-216312808
  
    **[Test build #57536 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57536/consoleFull)** for PR 12843 at commit [`9f488fb`](https://github.com/apache/spark/commit/9f488fb606315be627ce6e93a15e7a8eda70467f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-216305512
  
    **[Test build #57535 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57535/consoleFull)** for PR 12843 at commit [`42b54ca`](https://github.com/apache/spark/commit/42b54ca19e9c8bb6e34bb55da05a003b804ff4e6).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-217062406
  
    **[Test build #2974 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2974/consoleFull)** for PR 12843 at commit [`e2d0aba`](https://github.com/apache/spark/commit/e2d0aba512fb2160656ce716a4f042b9a5dca032).




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-217073574
  
    **[Test build #2974 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2974/consoleFull)** for PR 12843 at commit [`e2d0aba`](https://github.com/apache/spark/commit/e2d0aba512fb2160656ce716a4f042b9a5dca032).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-217306230
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57923/
    Test PASSed.




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-216650728
  
    Should there be a unit test that iterates through StopWordsRemover.supportedLanguages, loads every list, and checks that each one is non-empty?
    
    Other than those small items, this looks good to me
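
The test suggested above could look roughly like the following sketch. It is not the test that was merged: `_STOP_WORDS`, `supported_languages`, and `load_stop_words` are hypothetical in-memory stand-ins for the resource files and loader in Spark, so that the example runs without a Spark installation.

```python
import unittest

# Hypothetical stand-in for the stop-word resource files bundled with Spark.
_STOP_WORDS = {
    "english": ["a", "an", "the"],
    "french": ["le", "la", "les"],
    "turkish": ["ve", "bir", "bu"],
}


def supported_languages():
    """Return the language names for which a stop-word list exists."""
    return sorted(_STOP_WORDS)


def load_stop_words(language):
    """Return the stop words for `language` as a plain list."""
    if language not in _STOP_WORDS:
        raise ValueError("unsupported language: %s" % language)
    return list(_STOP_WORDS[language])


class StopWordsLoadingTest(unittest.TestCase):
    def test_every_supported_language_loads(self):
        # One sub-test per language, so a single bad list is reported by name.
        for language in supported_languages():
            with self.subTest(language=language):
                words = load_stop_words(language)
                self.assertIsInstance(words, list)
                self.assertTrue(words, "empty stop-word list for " + language)
```

Iterating over the advertised language list, rather than hard-coding names in the test, means a newly added language is covered without touching the test.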




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-217306227
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/12843




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-216305742
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57535/
    Test FAILed.




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-216279967
  
    **[Test build #57535 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57535/consoleFull)** for PR 12843 at commit [`42b54ca`](https://github.com/apache/spark/commit/42b54ca19e9c8bb6e34bb55da05a003b804ff4e6).




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12843#discussion_r61947349
  
    --- Diff: python/pyspark/ml/feature.py ---
    @@ -1763,28 +1763,22 @@ class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadabl
                               "comparison over the stop words", typeConverter=TypeConverters.toBoolean)
     
         @keyword_only
    -    def __init__(self, inputCol=None, outputCol=None, stopWords=None,
    -                 caseSensitive=False):
    +    def __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False):
             """
    -        __init__(self, inputCol=None, outputCol=None, stopWords=None,\
    -                 caseSensitive=false)
    +        __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=false)
             """
             super(StopWordsRemover, self).__init__()
             self._java_obj = self._new_java_obj("org.apache.spark.ml.feature.StopWordsRemover",
                                                 self.uid)
    -        stopWordsObj = _jvm().org.apache.spark.ml.feature.StopWords
    -        defaultStopWords = list(stopWordsObj.English())
    -        self._setDefault(stopWords=defaultStopWords, caseSensitive=False)
    +        self._setDefault(stopWords=StopWordsRemover.loadStopWords("english"), caseSensitive=False)
             kwargs = self.__init__._input_kwargs
             self.setParams(**kwargs)
     
         @keyword_only
         @since("1.6.0")
    -    def setParams(self, inputCol=None, outputCol=None, stopWords=None,
    -                  caseSensitive=False):
    +    def setParams(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False):
             """
    -        setParams(self, inputCol="input", outputCol="output", stopWords=None,\
    -                  caseSensitive=false)
    +        setParams(self, inputCol="input", outputCol="output", stopWords=None, caseSensitive=false)
    --- End diff --
    
    doc: inputCol, outputCol should not have defaults
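
One way to catch this kind of drift between a docstring and the real signature is sketched below. It rests on assumptions: `set_params` is a hypothetical stand-alone copy of the signature shown in the diff, not the PySpark method itself, and `signature_line` is an illustrative helper rather than anything in Spark.

```python
import inspect


def set_params(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False):
    """Hypothetical stand-in that mirrors the signature shown in the diff."""


def signature_line(func):
    """Render a docstring header line directly from the function's real signature."""
    return f"{func.__name__}{inspect.signature(func)}"


# The header line as written in the diff's docstring (with the method name adapted).
documented = ('set_params(self, inputCol="input", outputCol="output", '
              'stopWords=None, caseSensitive=false)')

print(signature_line(set_params))
# The comparison is False because the documented defaults differ from the real ones.
print(signature_line(set_params) == documented)
```

Generating or checking the header line from `inspect.signature` keeps the documented defaults (`None` for `inputCol` and `outputCol`) in step with the code.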




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-216941748
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-217306053
  
    **[Test build #57923 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57923/consoleFull)** for PR 12843 at commit [`df2d98f`](https://github.com/apache/spark/commit/df2d98f6951c360c950c6c8c5625f9f8d7ec95bf).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12843#discussion_r61947340
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala ---
    @@ -110,37 +60,37 @@ class StopWordsRemover(override val uid: String)
       def getStopWords: Array[String] = $(stopWords)
     
       /**
    -   * whether to do a case sensitive comparison over the stop words
    +   * Whether to do a case sensitive comparison over the stop words.
        * Default: false
        * @group param
        */
       val caseSensitive: BooleanParam = new BooleanParam(this, "caseSensitive",
    -    "whether to do case-sensitive comparison during filtering")
    +    "whether to do a case-sensitive comparison over the stop stop words")
    --- End diff --
    
    "stop stop" --> "stop"
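
For readers unfamiliar with the parameter being documented here, the following is a minimal pure-Python sketch of what a case-sensitive versus case-insensitive stop-word comparison means. `remove_stop_words` is a hypothetical helper written for illustration; it is not the Scala implementation in this PR.

```python
def remove_stop_words(tokens, stop_words, case_sensitive=False):
    """Drop every token that matches a stop word.

    With case_sensitive=False (the documented default), both sides are
    lower-cased before comparison, so "The" matches the stop word "the".
    """
    if case_sensitive:
        blocked = set(stop_words)
        return [t for t in tokens if t not in blocked]
    blocked = {w.lower() for w in stop_words}
    return [t for t in tokens if t.lower() not in blocked]


tokens = ["The", "quick", "fox"]
print(remove_stop_words(tokens, ["the"]))                       # ['quick', 'fox']
print(remove_stop_words(tokens, ["the"], case_sensitive=True))  # ['The', 'quick', 'fox']
```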




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-217282806
  
    **[Test build #57923 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57923/consoleFull)** for PR 12843 at commit [`df2d98f`](https://github.com/apache/spark/commit/df2d98f6951c360c950c6c8c5625f9f8d7ec95bf).




[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12843#issuecomment-216313082
  
    Merged build finished. Test PASSed.

