Posted to issues@spark.apache.org by "Nihar Sheth (JIRA)" <ji...@apache.org> on 2018/08/24 21:52:00 UTC

[jira] [Commented] (SPARK-25230) Upper behavior incorrect for string contains "ß"

    [ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592216#comment-16592216 ] 

Nihar Sheth commented on SPARK-25230:
-------------------------------------

This seems to be JVM behavior: [https://docs.oracle.com/javase/6/docs/api/java/lang/String.html#toUpperCase%28java.util.Locale%29]. In Java/Scala, toUpperCase maps "ß" to "SS" in every locale.
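A minimal script illustrating the JVM behavior described above ("ß", U+00DF, has no single-character uppercase form, so Unicode full case mapping expands it to "SS" regardless of locale):

```scala
import java.util.Locale

// toUpperCase applies Unicode full case mapping; the locale argument
// only matters for locale-sensitive mappings (e.g. Turkish dotted i),
// not for "ß", which expands to "SS" everywhere.
val root = "Haßler".toUpperCase(Locale.ROOT)
val german = "Haßler".toUpperCase(Locale.GERMANY)
println(root)   // HASSLER
println(german) // HASSLER
```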

From what I've quickly checked, MySQL, PostgreSQL, and SQLite all leave the character unchanged, while spark-sql and WebSQL change it to SS. If this is essential to fix, it might come down to replacing "ß" with a placeholder value, performing the uppercasing, then substituting it back in.
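A hypothetical sketch of the placeholder workaround suggested above. The sentinel character and helper name are illustrative, not part of Spark; it assumes the chosen private-use code point never occurs in real data:

```scala
import java.util.Locale

// Swap "ß" for a private-use sentinel before uppercasing, then swap it back.
// U+E000 is in the Unicode Private Use Area and has no case mapping of its
// own, so toUpperCase leaves it untouched. Assumes input data never
// contains U+E000.
val Sentinel = "\uE000"

def upperPreservingEszett(s: String): String =
  s.replace("ß", Sentinel)
    .toUpperCase(Locale.ROOT)
    .replace(Sentinel, "ß")

println(upperPreservingEszett("Haßler")) // HAßLER
```

Note this would also change grouping semantics: "Hassler" and "Haßler" would no longer collapse into the same UPPER(name) group.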

> Upper behavior incorrect for string contains "ß"
> ------------------------------------------------
>
>                 Key: SPARK-25230
>                 URL: https://issues.apache.org/jira/browse/SPARK-25230
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.1
>            Reporter: Yuming Wang
>            Priority: Major
>         Attachments: MySQL.png, Oracle.png, Teradata.jpeg
>
>
> How to reproduce:
> {code:sql}
> spark-sql> SELECT upper('Haßler');
> HASSLER
> {code}
> Mainstream databases return {{HAßLER}}.
>  !MySQL.png!
>  
> This behavior may lead to data inconsistency:
> {code:sql}
> create temporary view SPARK_25230 as select * from values
>   ("Hassler"),
>   ("Haßler")
> as EMPLOYEE(name);
> select UPPER(name) from SPARK_25230 group by 1;
> -- result
> HASSLER
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org