Posted to issues-all@impala.apache.org by "Philip (JIRA)" <ji...@apache.org> on 2019/08/02 12:47:00 UTC
[jira] [Commented] (IMPALA-2019) Proper UTF-8 support in string functions
[ https://issues.apache.org/jira/browse/IMPALA-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898861#comment-16898861 ]
Philip commented on IMPALA-2019:
--------------------------------
String lengths also seem to be an issue.
length() appears to return the *byte length* rather than the *number of characters*.
I would suggest this is +not a minor issue+.
select length('€')
Hive returns 1.
Impala returns 3.
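For anyone wondering where the 3 comes from: the euro sign is one code point (U+20AC) that UTF-8 encodes as three bytes. Below is a minimal sketch of the two ways of counting, written in Python purely for illustration. The helper names char_length and byte_length are made up for this example and are not Impala or Hive functions.

```python
# Illustration only: plain Python, not Impala or Hive code.
# char_length and byte_length are hypothetical helper names.

def char_length(s: str) -> int:
    """Count Unicode code points (the result reported for Hive above)."""
    return len(s)


def byte_length(s: str) -> int:
    """Count bytes of the UTF-8 encoding (the result reported for Impala above)."""
    return len(s.encode("utf-8"))


euro = "\u20ac"  # the euro sign
print(char_length(euro))  # 1
print(byte_length(euro))  # 3
```

The two counts only agree for pure ASCII input, which is probably why this is easy to miss in testing.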
> Proper UTF-8 support in string functions
> ----------------------------------------
>
> Key: IMPALA-2019
> URL: https://issues.apache.org/jira/browse/IMPALA-2019
> Project: IMPALA
> Issue Type: New Feature
> Components: Backend
> Affects Versions: Impala 2.1, Impala 2.2
> Reporter: Andrés Cordero
> Priority: Minor
> Labels: sql-language
>
> As documented here: http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/latest/topics/impala_string.html
> Impala does not properly handle non-ASCII UTF-8 characters, and will return results in string functions such as length that are inconsistent with Hive.