Posted to issues@impala.apache.org by "Jim Apple (Jira)" <ji...@apache.org> on 2019/12/01 01:44:00 UTC

[jira] [Resolved] (IMPALA-9205) UDF function in impala recieved chinese character change to???

     [ https://issues.apache.org/jira/browse/IMPALA-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Apple resolved IMPALA-9205.
-------------------------------
    Resolution: Duplicate

Impala does not support non-ASCII characters in string functions; see IMPALA-2019.

> UDF function in impala recieved chinese character change to???
> --------------------------------------------------------------
>
>                 Key: IMPALA-9205
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9205
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 2.12.0
>         Environment: CentOS 7.3
> Hive 1.2
> Impala 2.12
> Java JDK 1.8
> Python 2.7.5
>            Reporter: Moxuan Shi
>            Priority: Major
>
> The UDF works in Hive, but not in Impala.
>  
> {code:java}
> select leftcutcontentudf("一二三",2);
> OK
> 一二
> {code}
> [works in Hive|https://i.stack.imgur.com/pdCzU.png]
>  
> {code:java}
> select leftcutcontentudf("一二三",2);
> +----------------------------------------+
> | default.leftcutcontentudf('一二三', 2) |
> +----------------------------------------+
> | ??                                     |
> +----------------------------------------+
> {code}
> [Chinese characters changed to ?? in Impala|https://i.stack.imgur.com/QU5Gx.png]
>  
> I wrote a new UDF that prints the bytes of the input String:
> {code:java}
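> import org.apache.hadoop.hive.ql.exec.UDF;
>
> public class GetBytes extends UDF {
>     // Returns the bytes of the input as space-separated decimal values.
>     public String evaluate(String input) {
>         // Encodes with the JVM default charset.
>         byte[] bytes = input.getBytes();
>         StringBuilder builder = new StringBuilder();
>         for (byte b : bytes) {
>             builder.append(b).append(" ");
>         }
>         return builder.toString();
>     }
> }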
> {code}
> It seems that the Chinese characters are changed to ??? before the UDF is called.
> {code:java}
> select getbytes("一二三");
> {code}
> {code:java}
> +-----------------------------+ 
> | default.getbytes('一二三') | 
> +-----------------------------+ 
> | 63 63 63 63 63 63 63 63 63 | 
> +-----------------------------+
> {code}
> [GetBytes result|https://i.stack.imgur.com/wVHT6.png]
>  
> But a normal query returns the correct result in Impala.
> {code:java}
> select khmc_62c57e8ae0ac from collective_2085;
> +-------------------+
> | khmc_62c57e8ae0ac |
> +-------------------+
> | 淘宝              |
> +-------------------+
> {code}
> [correct query|https://i.stack.imgur.com/euq79.png]
> How can I deal with this problem?
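
The GetBytes output in the description (nine values of 63 for a three-character string) is consistent with the UTF-8 bytes of the argument being decoded one byte at a time as ASCII before the UDF sees them. This thread does not confirm that this is what Impala does; the sketch below only reproduces the same byte pattern in plain Java under that assumption. The class name AsciiRoundTrip and the method roundTripAsAscii are hypothetical and are not part of Impala or Hive.

```java
import java.nio.charset.StandardCharsets;

public class AsciiRoundTrip {
    // Hypothetical helper: simulates a string boundary that treats UTF-8
    // bytes as US-ASCII, then lists the resulting bytes like GetBytes does.
    public static String roundTripAsAscii(String input) {
        byte[] utf8 = input.getBytes(StandardCharsets.UTF_8);
        // Every byte >= 0x80 is malformed in US-ASCII and decodes to U+FFFD.
        String decoded = new String(utf8, StandardCharsets.US_ASCII);
        // U+FFFD cannot be encoded in US-ASCII, so it is written as '?' (63).
        byte[] reencoded = decoded.getBytes(StandardCharsets.US_ASCII);
        StringBuilder builder = new StringBuilder();
        for (byte b : reencoded) {
            builder.append(b).append(" ");
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // The three characters from the report: U+4E00, U+4E8C, U+4E09.
        // They occupy 9 bytes in UTF-8, all of them outside the ASCII range.
        String input = new String(Character.toChars(0x4E00))
                + new String(Character.toChars(0x4E8C))
                + new String(Character.toChars(0x4E09));
        // prints "63 63 63 63 63 63 63 63 63 "
        System.out.println(roundTripAsAscii(input));
    }
}
```

Three characters producing nine bytes of 63, rather than three, suggests the replacement happens per byte rather than per character, which is why the sketch decodes as ASCII instead of encoding a correctly decoded string.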



--
This message was sent by Atlassian Jira
(v8.3.4#803005)