Posted to reviews@spark.apache.org by mengxr <gi...@git.apache.org> on 2016/05/02 16:08:35 UTC
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
GitHub user mengxr opened a pull request:
https://github.com/apache/spark/pull/12843
[SPARK-14050] [ML] Add multiple languages support and additional methods for Stop Words Remover
## What changes were proposed in this pull request?
This PR continues the work from #11871 with the following changes:
* load English stopwords as default
* convert stopwords to a list in Python
* update some tests and doc
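As a rough illustration of the behavior these changes configure, the sketch below filters a token list against a stop word list in plain Python, with a flag mirroring the caseSensitive param. It does not use Spark: `remove_stop_words` and the tiny word list are hypothetical stand-ins, not the PR's implementation.

```python
def remove_stop_words(tokens, stop_words, case_sensitive=False):
    """Return tokens that are not in stop_words (hypothetical helper, not Spark code)."""
    if case_sensitive:
        blocked = set(stop_words)
        return [t for t in tokens if t not in blocked]
    # Case-insensitive: compare lowercased forms, but keep the original tokens.
    blocked = {w.lower() for w in stop_words}
    return [t for t in tokens if t.lower() not in blocked]


# Illustrative stop word list only; the real English list ships with Spark.
sample_stop_words = ["the", "a", "is"]
tokens = ["The", "quick", "fox", "is", "a", "fox"]

filtered_default = remove_stop_words(tokens, sample_stop_words)
filtered_sensitive = remove_stop_words(tokens, sample_stop_words, case_sensitive=True)
```

With the default case-insensitive comparison, "The" is dropped along with "is" and "a". With case_sensitive=True, "The" survives because only the lowercase form is in the list.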
## How was this patch tested?
Unit tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mengxr/spark SPARK-14050
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/12843.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #12843
----
commit c126c87818eb06aa5c2ac23b362d504f342c72b0
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-14T22:22:02Z
add language files
commit 8248579ec27a40de98fe1f3020d947c478981ebc
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-14T22:23:32Z
add multi-language support for stop words
commit 2c7b73df14d2d292eff88d7f3c358d29f82f6122
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-14T22:24:41Z
add new tests for StopWordsRemover
commit 43e5cf54d4f9583f8b90291b3c7603ac4e7fab2a
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-21T23:41:47Z
adjust resource files
commit a43039223a28b308ae1c14d33be5e5a1df382ed6
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-21T23:43:15Z
adjust resource files
commit 28ee249f676971371d11d16c2912bbf81e045269
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-21T23:46:42Z
fix stopwords bug
commit 6d215b31a205c4a79e8cc0ef6963d239941e80ff
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-21T23:53:06Z
update comment lines
commit 6deceecf88c66b3293698aca5d7306c2aa02e2e0
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-22T16:24:38Z
update stop words list
commit 41cd25815af3baa8fe9ed9336812f436d7ed7bd5
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-22T16:25:36Z
update stopwordsremover
commit 4d1812aae64b0b15312940b1a6c42e19f9686480
Author: Burak KOSE <bu...@gmail.com>
Date: 2016-03-22T17:35:37Z
fix test case bug
After updating English stop words list, "d" is a stop word.
commit a30862231c3944c55c96cc94e162f61614aee6d5
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-22T21:45:48Z
fix encoding
commit 2e7c54e5c17e7c5672a43ffc28acb207e94bf28a
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-23T01:42:36Z
fix pyspark test
commit 7efda40e39663deef0b0884a7bfca13b5d10d706
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-23T16:51:48Z
add licence for stop words list
commit a066e8b34ec4824fa26a1e306e197b66400f5ccb
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-24T17:12:20Z
change licence to license
commit d0f43ace892332dfb3ad25d0ef1d0c0451540e5c
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-25T16:23:37Z
add readme for stopwords list
commit c017ee235287554e28281d1691d0188e358b7ad8
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-25T16:26:23Z
merge StopWords into StopWordsRemover
commit 55191ce1f449bed55884a4481071b0fc5ee776a9
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-25T16:27:59Z
add python stopwords support for language selection
commit 789342f2d26759db180868a9f59b02c8f85cc835
Author: Burak Köse <bu...@gmail.com>
Date: 2016-03-25T16:28:48Z
add new tests for stopwords
commit 4f97c8d5a088595a23f7ec848c793d05fc052d79
Author: Xiangrui Meng <me...@databricks.com>
Date: 2016-05-02T15:26:29Z
Merge remote-tracking branch 'apache/master' into SPARK-14050
commit 713d4d5e81b2194efa640ec46fa16c56049c00f5
Author: Xiangrui Meng <me...@databricks.com>
Date: 2016-05-02T15:51:31Z
minor updates
commit 1bd69af46f43d25518f6c5e01e2ee7fc5c279a03
Author: Xiangrui Meng <me...@databricks.com>
Date: 2016-05-02T16:05:52Z
fix python tests and add a TODO
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-216941527
**[Test build #57769 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57769/consoleFull)** for PR 12843 at commit [`e2d0aba`](https://github.com/apache/spark/commit/e2d0aba512fb2160656ce716a4f042b9a5dca032).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-216313085
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57536/
Test PASSed.
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-217062799
LGTM pending tests
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-216305741
Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-216912795
**[Test build #57769 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57769/consoleFull)** for PR 12843 at commit [`e2d0aba`](https://github.com/apache/spark/commit/e2d0aba512fb2160656ce716a4f042b9a5dca032).
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-216941751
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57769/
Test FAILed.
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-216281191
**[Test build #57536 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57536/consoleFull)** for PR 12843 at commit [`9f488fb`](https://github.com/apache/spark/commit/9f488fb606315be627ce6e93a15e7a8eda70467f).
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-217557000
Merged into master and branch-2.0.
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-216312808
**[Test build #57536 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57536/consoleFull)** for PR 12843 at commit [`9f488fb`](https://github.com/apache/spark/commit/9f488fb606315be627ce6e93a15e7a8eda70467f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-216305512
**[Test build #57535 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57535/consoleFull)** for PR 12843 at commit [`42b54ca`](https://github.com/apache/spark/commit/42b54ca19e9c8bb6e34bb55da05a003b804ff4e6).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-217062406
**[Test build #2974 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2974/consoleFull)** for PR 12843 at commit [`e2d0aba`](https://github.com/apache/spark/commit/e2d0aba512fb2160656ce716a4f042b9a5dca032).
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-217073574
**[Test build #2974 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2974/consoleFull)** for PR 12843 at commit [`e2d0aba`](https://github.com/apache/spark/commit/e2d0aba512fb2160656ce716a4f042b9a5dca032).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-217306230
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57923/
Test PASSed.
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-216650728
Should there be a unit test that iterates through StopWordsRemover.supportedLanguages, loads the stop words for each language, and checks that every list is non-empty?
Other than those small items, this looks good to me.
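A minimal sketch of the test shape suggested here, written in plain Python so it runs without a Spark session. `check_all_languages_load`, `fake_lists`, and the loader passed in are hypothetical stand-ins. In the PR, the real inputs would be StopWordsRemover.supportedLanguages and the stop word loading method.

```python
def check_all_languages_load(supported_languages, load_stop_words):
    """Load the list for every language and return the languages whose list is empty."""
    empty = []
    for language in supported_languages:
        words = load_stop_words(language)
        if len(words) == 0:
            empty.append(language)
    return empty


# Stand-in data; a real test would call into StopWordsRemover instead.
fake_lists = {"english": ["a", "the"], "french": ["le", "la"], "turkish": ["ve", "bir"]}

empty_languages = check_all_languages_load(sorted(fake_lists), fake_lists.__getitem__)
assert empty_languages == [], "languages with empty stop word lists: %s" % empty_languages
```

Collecting the empty languages before asserting reports every failing language in one run, rather than stopping at the first.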
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-217306227
Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/12843
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-216305742
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57535/
Test FAILed.
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-216279967
**[Test build #57535 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57535/consoleFull)** for PR 12843 at commit [`42b54ca`](https://github.com/apache/spark/commit/42b54ca19e9c8bb6e34bb55da05a003b804ff4e6).
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/12843#discussion_r61947349
--- Diff: python/pyspark/ml/feature.py ---
@@ -1763,28 +1763,22 @@ class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadabl
                           "comparison over the stop words", typeConverter=TypeConverters.toBoolean)
     @keyword_only
-    def __init__(self, inputCol=None, outputCol=None, stopWords=None,
-                 caseSensitive=False):
+    def __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False):
         """
-        __init__(self, inputCol=None, outputCol=None, stopWords=None,\
-                 caseSensitive=false)
+        __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=false)
         """
         super(StopWordsRemover, self).__init__()
         self._java_obj = self._new_java_obj("org.apache.spark.ml.feature.StopWordsRemover",
                                             self.uid)
-        stopWordsObj = _jvm().org.apache.spark.ml.feature.StopWords
-        defaultStopWords = list(stopWordsObj.English())
-        self._setDefault(stopWords=defaultStopWords, caseSensitive=False)
+        self._setDefault(stopWords=StopWordsRemover.loadStopWords("english"), caseSensitive=False)
         kwargs = self.__init__._input_kwargs
         self.setParams(**kwargs)
     @keyword_only
     @since("1.6.0")
-    def setParams(self, inputCol=None, outputCol=None, stopWords=None,
-                  caseSensitive=False):
+    def setParams(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False):
         """
-        setParams(self, inputCol="input", outputCol="output", stopWords=None,\
-                  caseSensitive=false)
+        setParams(self, inputCol="input", outputCol="output", stopWords=None, caseSensitive=false)
--- End diff --
doc: inputCol and outputCol should not have defaults (the docstring shows "input" and "output", while the signature uses None)
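The comment concerns the setParams docstring line in the diff, which lists inputCol="input" and outputCol="output" while the signature defaults both to None. One way to catch this kind of drift is sketched below, assuming the convention that the first docstring line repeats the signature. `FakeStopWordsRemover` and `documented_signature_matches` are hypothetical names for illustration, not PySpark code.

```python
import inspect


class FakeStopWordsRemover:
    """Hypothetical stand-in used only to show the docstring convention."""

    def setParams(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False):
        """
        setParams(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False)
        """
        return self


def documented_signature_matches(func):
    """True if the first docstring line equals the name plus the real signature."""
    first_line = inspect.getdoc(func).splitlines()[0]
    actual = func.__name__ + str(inspect.signature(func))
    return first_line == actual


matches = documented_signature_matches(FakeStopWordsRemover.setParams)
```

The comparison is an exact string match, so it only works when the docstring line uses the same spelling and spacing that `inspect.signature` produces.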
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-216941748
Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-217306053
**[Test build #57923 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57923/consoleFull)** for PR 12843 at commit [`df2d98f`](https://github.com/apache/spark/commit/df2d98f6951c360c950c6c8c5625f9f8d7ec95bf).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/12843#discussion_r61947340
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala ---
@@ -110,37 +60,37 @@ class StopWordsRemover(override val uid: String)
   def getStopWords: Array[String] = $(stopWords)
   /**
-   * whether to do a case sensitive comparison over the stop words
+   * Whether to do a case sensitive comparison over the stop words.
    * Default: false
    * @group param
    */
   val caseSensitive: BooleanParam = new BooleanParam(this, "caseSensitive",
-    "whether to do case-sensitive comparison during filtering")
+    "whether to do a case-sensitive comparison over the stop stop words")
--- End diff --
"stop stop" --> "stop"
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-217282806
**[Test build #57923 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57923/consoleFull)** for PR 12843 at commit [`df2d98f`](https://github.com/apache/spark/commit/df2d98f6951c360c950c6c8c5625f9f8d7ec95bf).
[GitHub] spark pull request: [SPARK-14050] [ML] Add multiple languages supp...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12843#issuecomment-216313082
Merged build finished. Test PASSed.