Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/07/14 07:44:35 UTC

[GitHub] [spark] HyukjinKwon commented on a change in pull request #25133: [SPARK-28365][Python][TEST] Set default locale for StopWordsRemover tests to prevent invalid locale error during test

HyukjinKwon commented on a change in pull request #25133: [SPARK-28365][Python][TEST] Set default locale for StopWordsRemover tests to prevent invalid locale error during test
URL: https://github.com/apache/spark/pull/25133#discussion_r303231722
 
 

 ##########
 File path: python/pyspark/ml/feature.py
 ##########
 @@ -2612,6 +2612,8 @@ class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadabl
 
     .. note:: null values from input array are preserved unless adding null to stopWords explicitly.
 
+    >>> locale = spark._jvm.java.util.Locale
+    >>> locale.setDefault(locale.forLanguageTag("en-US")) # Set a default locale
 
 Review comment:
   Hmm, @viirya, wouldn't it be better to actually make this work? It seems that if the default locale is not among the locales available in the JVM, it always fails (not only the test but the API itself).
   
   So it looks like users always have to either change the system locale manually or use this workaround of accessing the private `_jvm` property in PySpark.
   
   Wouldn't it be better to just fall back to the US locale by default, with a warning?
   
   For instance:
   
   ```scala
     private val getDefaultOrUS: String = {
       if (Locale.getAvailableLocales.map(_.toString).contains(Locale.getDefault.toString)) {
         Locale.getDefault.toString
       } else {
         logWarning(s"Default locale set was [${Locale.getDefault.toString}]; however, it was " +
           "not found in available locales in JVM, falling back to en-US locale. Set locale " +
           "in order to respect another locale.")
         Locale.US.toString
       }
     }
     setDefault(stopWords -> StopWordsRemover.loadDefaultStopWords("english"),
       caseSensitive -> false, locale -> getDefaultOrUS)
   ```
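
For readers on the Python side, the same fallback rule can be sketched in plain Python. This is only an illustration of the selection logic, not code from Spark or from this PR: the function name `default_or_us` and the sample locale strings are hypothetical, and in practice the default and available locales would come from `java.util.Locale` in the JVM rather than being passed in by hand.

```python
import warnings


def default_or_us(default_locale, available_locales):
    """Return default_locale if it is available, otherwise fall back to en_US.

    Hypothetical helper: mirrors the getDefaultOrUS idea above, but takes the
    locale names as plain strings instead of querying the JVM.
    """
    if default_locale in available_locales:
        return default_locale
    warnings.warn(
        "Default locale set was [%s]; however, it was not found in available "
        "locales in JVM, falling back to en_US locale. Set locale in order to "
        "respect another locale." % default_locale
    )
    return "en_US"


# Made-up sample values, standing in for Locale.getAvailableLocales.
available = {"en_US", "en_GB", "fr_FR"}
print(default_or_us("fr_FR", available))  # available, so it is kept
print(default_or_us("xx_YY", available))  # not available, so en_US is used
```

The point of the design is that an unavailable default locale degrades to a warning plus a known-good value instead of an error, so neither the tests nor user code would need to touch `_jvm`.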
