Posted to issues@impala.apache.org by "Jim Apple (Jira)" <ji...@apache.org> on 2019/12/01 01:44:00 UTC
[jira] [Resolved] (IMPALA-9205) UDF function in impala recieved
chinese character change to???
[ https://issues.apache.org/jira/browse/IMPALA-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Apple resolved IMPALA-9205.
-------------------------------
Resolution: Duplicate
Impala does not support non-ASCII string functions: IMPALA-2019
> UDF function in impala recieved chinese character change to???
> --------------------------------------------------------------
>
> Key: IMPALA-9205
> URL: https://issues.apache.org/jira/browse/IMPALA-9205
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 2.12.0
> Environment: CentOS 7.3
> Hive 1.2
> Impala 2.12
> Java JDK 1.8
> Python 2.7.5
> Reporter: Moxuan Shi
> Priority: Major
>
> The UDF works in Hive, but not in Impala.
>
> {code:java}
> select leftcutcontentudf("一二三",2);
> OK
> 一二
> {code}
> [work in hive|https://i.stack.imgur.com/pdCzU.png]
>
> {code:java}
> select leftcutcontentudf("一二三",2);
> +----------------------------------------+
> | default.leftcutcontentudf('一二三', 2) |
> +----------------------------------------+
> | ??                                     |
> +----------------------------------------+
> {code}
> [chinese character changed to ?? in impala|https://i.stack.imgur.com/QU5Gx.png]
>
> I wrote a new UDF to print the bytes of the input String:
> {code:java}
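> import org.apache.hadoop.hive.ql.exec.UDF;
>
> public class GetBytes extends UDF {
>     // Returns the decimal value of each byte of the input, separated by spaces.
>     public String evaluate(String input) {
>         // getBytes() with no argument encodes using the JVM's default charset.
>         byte[] bytes = input.getBytes();
>         StringBuffer stringBuffer = new StringBuffer();
>         for (byte b : bytes) {
>             stringBuffer.append(b).append(" ");
>         }
>         return stringBuffer.toString();
>     }
> }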
> {code}
> It seems that the Chinese characters are changed to ??? before the UDF function is called.
> {code:java}
> select getbytes("一二三");
> {code}
> {code:java}
> +-----------------------------+
> | default.getbytes('一二三') |
> +-----------------------------+
> | 63 63 63 63 63 63 63 63 63  |
> +-----------------------------+
> {code}
> [GetBytes result|https://i.stack.imgur.com/wVHT6.png]
>
> But a normal query returns the correct result in Impala.
> {code:java}
> select khmc_62c57e8ae0ac from collective_2085;
> +-------------------+
> | khmc_62c57e8ae0ac |
> +-------------------+
> | 淘宝 |
> +-------------------+
> {code}
> [correct query|https://i.stack.imgur.com/euq79.png]
> How can I deal with this problem?
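
The GetBytes output quoted above shows nine bytes with the value 63, which is the ASCII code for '?'. As a rough illustration only, the sketch below shows one way plain Java can produce exactly that pattern: the nine UTF-8 bytes of the three characters are decoded as US-ASCII and then encoded again as US-ASCII. Whether Impala's Java UDF path does this, and which default charset the JVM used, is an assumption; the report does not confirm either. The class and method names (AsciiRoundTrip, bytesAsText, roundTripThroughAscii) are hypothetical and exist only for this example.

```java
import java.nio.charset.StandardCharsets;

public class AsciiRoundTrip {
    // Formats bytes the same way the GetBytes UDF in the report does:
    // each signed byte value followed by a single space.
    static String bytesAsText(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(b).append(" ");
        }
        return sb.toString();
    }

    // ASSUMPTION: simulates a string whose UTF-8 bytes are decoded as
    // US-ASCII and then re-encoded as US-ASCII. Every byte >= 0x80 is
    // malformed in US-ASCII and decodes to U+FFFD, which in turn cannot
    // be encoded in US-ASCII and becomes '?' (63).
    static byte[] roundTripThroughAscii(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String decoded = new String(utf8, StandardCharsets.US_ASCII);
        return decoded.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Unicode escapes for the three characters used in the report.
        String input = "\u4e00\u4e8c\u4e09";

        // The intact UTF-8 encoding: nine bytes, three per character.
        System.out.println(bytesAsText(input.getBytes(StandardCharsets.UTF_8)));

        // The lossy round trip: nine bytes, all 63.
        System.out.println(bytesAsText(roundTripThroughAscii(input)));
    }
}
```

If the sketch reflects what happens, the count of nine rather than three would mean the substitution is applied per UTF-8 byte, not per character. Passing an explicit charset such as StandardCharsets.UTF_8 to getBytes removes the dependency on the JVM default inside the UDF, but it cannot recover characters that were already replaced before the UDF was called.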
--
This message was sent by Atlassian Jira
(v8.3.4#803005)