Posted to issues-all@impala.apache.org by "Adam Tamas (Jira)" <ji...@apache.org> on 2020/09/04 10:46:00 UTC

[jira] [Created] (IMPALA-10145) UnicodeDecodeError in Thrift 0.11.0 generated files

Adam Tamas created IMPALA-10145:
-----------------------------------

             Summary: UnicodeDecodeError in Thrift 0.11.0 generated files
                 Key: IMPALA-10145
                 URL: https://issues.apache.org/jira/browse/IMPALA-10145
             Project: IMPALA
          Issue Type: Bug
            Reporter: Adam Tamas


If the query results contain a string with bytes that cannot be decoded as UTF-8, fetching the results fails with a UnicodeDecodeError when the Thrift 0.11.0 generated Python files are in use.
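
The underlying failure can be reproduced outside impala-shell. The following is only a minimal sketch: it assumes the value of unhex('aa') reaches the client as the single raw byte 0xaa, and the names `raw` and `outcome` are illustrative, not taken from the Impala code.

```python
# Assumption: unhex('aa') arrives at the client as this single raw byte.
raw = b"\xaa"

try:
    # 0xaa is a UTF-8 continuation byte, so it cannot start a character.
    raw.decode("utf-8")
    outcome = "decoded"
except UnicodeDecodeError as exc:
    outcome = "UnicodeDecodeError: %s" % exc.reason

print(outcome)
```

The reported reason matches the "invalid start byte" text in the shell output.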

Where the error occurs depends on which protocol impala-shell is using.
Examples for the hs2-http and hs2 protocols, in that order:
{code:java}
[localhost:28000] default> select unhex('aa');
Query: select unhex('aa')
Query submitted at: 2020-09-04 12:41:14 (Coordinator: http://tadam-OptiPlex-7070:25000)
Query progress can be monitored at: http://tadam-OptiPlex-7070:25000/query_plan?query_id=d041ab999f597fec:46a8b51800000000
Caught exception 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte, type=<type 'exceptions.UnicodeDecodeError'> in FetchResults. 
Unknown Exception : 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
Traceback (most recent call last):
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala_shell.py", line 1183, in _execute_stmt
    for rows in rows_fetched:
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 781, in fetch
    resp = self._do_hs2_rpc(FetchResults)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 942, in _do_hs2_rpc
    return rpc()
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 778, in FetchResults
    return self.imp_service.FetchResults(req)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 717, in FetchResults
    return self.recv_FetchResults()
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 736, in recv_FetchResults
    result.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 3593, in read
    self.success.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 5888, in read
    self.results.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 2670, in read
    _elem115.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 2556, in read
    self.stringVal.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 2352, in read
    _elem95 = iprot.readString().decode('utf-8') if sys.version_info[0] == 2 else iprot.readString()
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
[Not connected] > 
{code}

{code:java}
[localhost:21050] default> select unhex('aa');
Query: select unhex('aa')
Query submitted at: 2020-09-04 12:42:22 (Coordinator: http://tadam-OptiPlex-7070:25000)
Query progress can be monitored at: http://tadam-OptiPlex-7070:25000/query_plan?query_id=3a481e2a0581ea7c:a6e1901800000000
Caught exception 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte, type=<type 'exceptions.UnicodeDecodeError'> in FetchResults. 
Unknown Exception : 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
Traceback (most recent call last):
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala_shell.py", line 1183, in _execute_stmt
    for rows in rows_fetched:
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 781, in fetch
    resp = self._do_hs2_rpc(FetchResults)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 942, in _do_hs2_rpc
    return rpc()
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 778, in FetchResults
    return self.imp_service.FetchResults(req)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 717, in FetchResults
    return self.recv_FetchResults()
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 736, in recv_FetchResults
    result.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 3583, in read
    iprot._fast_decode(self, iprot, [self.__class__, self.thrift_spec])
UnicodeDecodeError: 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
{code}
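
One possible way to keep such a fetch from aborting is to decode leniently. This is a hedged sketch only, not the fix adopted for this issue: `decode_lossy` is a hypothetical helper, and whether replacing invalid bytes is acceptable for impala-shell output is an open assumption.

```python
def decode_lossy(raw):
    """Hypothetical helper: decode bytes as UTF-8, replacing invalid
    sequences with U+FFFD instead of raising UnicodeDecodeError."""
    return raw.decode("utf-8", errors="replace")


valid = decode_lossy(b"abc")     # well-formed input is unchanged
invalid = decode_lossy(b"\xaa")  # the byte from the examples above

print(repr(valid), repr(invalid))
```

The trade-off is that the original bytes are lost in the decoded string, so this would only suit display purposes.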



