Posted to issues@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2021/02/20 09:31:00 UTC

[jira] [Created] (IMPALA-10523) Impala-shell crash in printing error messages that contain UTF-8 characters

Quanlong Huang created IMPALA-10523:
---------------------------------------

             Summary: Impala-shell crash in printing error messages that contain UTF-8 characters
                 Key: IMPALA-10523
                 URL: https://issues.apache.org/jira/browse/IMPALA-10523
             Project: IMPALA
          Issue Type: Bug
            Reporter: Quanlong Huang
            Assignee: Quanlong Huang


impala-shell crashes when running a query whose error message contains non-ASCII (UTF-8) characters:
{code}
[localhost:21050] default> select cast(now() as string format 'yyyy年MM月dd日');
Query: select cast(now() as string format 'yyyy年MM月dd日')
Query submitted at: 2021-02-20 16:40:09 (Coordinator: http://quanlong-OptiPlex-BJ:25000)
Query progress can be monitored at: http://quanlong-OptiPlex-BJ:25000/query_plan?query_id=6c4d64dec01254bc:d54107fd00000000
Traceback (most recent call last):
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 2070, in <module>
    impala_shell_main()
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 2035, in impala_shell_main
    shell.cmdloop(intro)
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/cmd.py", line 142, in cmdloop
    stop = self.onecmd(line)
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 697, in onecmd
    return func(arg)
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 1123, in do_select
    return self._execute_stmt(query_str, print_web_link=True)
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 1320, in _execute_stmt
    print(e, file=sys.stderr)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u5e74' in position 44: ordinal not in range(128)
{code}
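The final traceback line is the same error an explicit ASCII encode of that character raises. A minimal sketch, assuming only the character from the traceback (the shortened string {{text}} is illustrative, not the real error message):

```python
from __future__ import print_function, unicode_literals

# U+5E74 is the first non-ASCII character in the format string above.
text = "yyyy\u5e74MM"

try:
  # Python 2 performs this encode implicitly when str() is handed unicode.
  text.encode("ascii")
  failure = None
except UnicodeEncodeError as exc:
  failure = exc

# Shows the codec name and the offending position.
print(repr(failure))

# An explicit UTF-8 encode succeeds; U+5E74 becomes three bytes.
utf8_bytes = text.encode("utf-8")
print(len(utf8_bytes))
```

The position reported in the real traceback (44) is the offset of the same character inside the full error message.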

The crash occurs at line 1320 of shell/impala_shell.py:
{code:python}
1316     except QueryStateException as e:
1317       # an exception occurred while executing the query
1318       if self.last_query_handle is not None:
1319         self.imp_client.close_query(self.last_query_handle)
1320       print(e, file=sys.stderr)
{code}

Definition of QueryStateException in shell/shell_exceptions.py:
{code:python}
class QueryStateException(Exception):
  def __init__(self, value=""):
    self.value = value

  def __str__(self):
    return self.value
{code}

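Since IMPALA-9489, the {{value}} of QueryStateException is a unicode object under Python 2, because the shell follows the "unicode sandwich" approach: bytes on the outside, unicode on the inside, encode/decode at the edges. As a result, {{__str__}} returns unicode, and when print() converts the exception to str, Python 2 implicitly encodes it with the 'ascii' codec, which fails on any non-ASCII character. We should encode it to str explicitly with 'utf-8' instead of letting Python do this implicitly and fail.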
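One possible shape for the fix, as a sketch rather than the actual patch: {{__str__}} encodes explicitly on Python 2 and passes the value through unchanged on Python 3. The helper name {{to_native_str}} is hypothetical and does not exist in the Impala source.

```python
from __future__ import print_function, unicode_literals
import sys


def to_native_str(value, encoding="utf-8"):
  """Hypothetical helper: return value as the interpreter's native str type."""
  if sys.version_info[0] == 2 and not isinstance(value, str):
    # Python 2: value is unicode, so encode it to bytes (str) explicitly.
    return value.encode(encoding)
  # Python 3, or already a Python 2 byte string: nothing to do.
  return value


class QueryStateException(Exception):
  def __init__(self, value=""):
    self.value = value

  def __str__(self):
    # Encode at the edge instead of relying on the implicit ASCII encode.
    return to_native_str(self.value)


# Illustrative message only; the real text comes from the server.
err = QueryStateException("Bad format: 'yyyy\u5e74MM'")
print(err, file=sys.stderr)
```

Encoding inside {{__str__}} covers every place the exception is printed. Encoding only at the print call on line 1320 would fix this call site but leave others exposed.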


