Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2019/08/21 22:47:42 UTC

[GitHub] [incubator-superset] browles opened a new issue #8084: Issue with non-utf-8 VARBINARY Presto columns and async queries

URL: https://github.com/apache/incubator-superset/issues/8084
 
 
   Non-UTF-8 VARBINARY columns cannot be queried asynchronously in SQL Lab. An older PR (https://github.com/apache/incubator-superset/pull/5121/files) implemented a fix, but it only covers sync queries.
   
   A trivial fix is to add `encoding=None` (overriding the `"utf-8"` default) to https://github.com/apache/incubator-superset/blob/master/superset/sql_lab.py#L326.
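
The idea behind the fix can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard-library `json` module as a stand-in for `simplejson` (stdlib `json.dumps` has no `encoding` parameter and always hands `bytes` to `default`), and `bytes_safe_default` is a hypothetical handler, not Superset's `json_iso_dttm_ser`. As I understand it, passing `encoding=None` to `simplejson.dumps` has a similar effect: bytes are no longer implicitly decoded as UTF-8 and go to the `default` callable instead.

```python
import json


def bytes_safe_default(obj):
    """Hypothetical fallback serializer (not Superset's json_iso_dttm_ser)."""
    if isinstance(obj, bytes):
        try:
            # Valid UTF-8 payloads are rendered as ordinary text.
            return obj.decode("utf-8")
        except UnicodeDecodeError:
            # Non-UTF-8 payloads fall back to their Python repr, e.g. b'\x12\x8c...'.
            return repr(obj)
    raise TypeError(f"Unserializable object of type {type(obj).__name__}")


# A row resembling a result with a non-UTF-8 VARBINARY value (made-up data).
row = {"id": 1, "blob": b"\x12\x8c\x18\x1a"}

# The serializer never attempts a strict UTF-8 decode on its own,
# so the dump succeeds instead of raising UnicodeDecodeError.
serialized = json.dumps(row, default=bytes_safe_default)
print(serialized)
```

Whether the repr, a hex string, or base64 is the right rendering for binary values is a separate design choice; the point is only that the decoding decision moves into the `default` handler.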
   
   ### Expected results
   
   e.g. `b'\x12\xc0\x18\x1a\x15\x08\x01\x10...'`
   
   ### Actual results
   
   e.g. `UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8c in position 1: invalid start byte`
   
   ```
   2019/08/21 20:53:43 Traceback (most recent call last):
   2019/08/21 20:53:43   File "/mnt/venv/superset_worker/lib/python3.6/site-packages/celery/app/trace.py", line 385, in trace_task
   2019/08/21 20:53:43     R = retval = fun(*args, **kwargs)
   2019/08/21 20:53:43   File "/mnt/venv/superset_worker/lib/python3.6/site-packages/celery/app/trace.py", line 648, in __protected_call__
   2019/08/21 20:53:43     return self.run(*args, **kwargs)
   2019/08/21 20:53:43   File "/mnt/venv/superset_worker/lib/python3.6/site-packages/superset/sql_lab.py", line 89, in get_sql_results
   2019/08/21 20:53:43     session=session, start_time=start_time)
   2019/08/21 20:53:43   File "/mnt/venv/superset_worker/lib/python3.6/site-packages/superset/sql_lab.py", line 244, in execute_sql
   2019/08/21 20:53:43     payload, default=json_iso_dttm_ser, ignore_nan=True)
   2019/08/21 20:53:43   File "/mnt/venv/superset_worker/lib/python3.6/site-packages/simplejson/__init__.py", line 399, in dumps
   2019/08/21 20:53:43     **kw).encode(obj)
   2019/08/21 20:53:43   File "/mnt/venv/superset_worker/lib/python3.6/site-packages/simplejson/encoder.py", line 296, in encode
   2019/08/21 20:53:43     chunks = self.iterencode(o, _one_shot=True)
   2019/08/21 20:53:43   File "/mnt/venv/superset_worker/lib/python3.6/site-packages/simplejson/encoder.py", line 378, in iterencode
   2019/08/21 20:53:43     return _iterencode(o, 0)
   2019/08/21 20:53:43 UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8c in position 1: invalid start byte
   ```
   
   #### How to reproduce the bug
   
   Perform an async `select *` or similar on a table with non-utf-8 encoded VARBINARY columns.
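
The underlying failure can also be reproduced outside Superset with plain Python. The snippet below is a minimal sketch using a made-up byte string, not data from the affected table; it only shows that strictly decoding such a value as UTF-8 raises the same kind of error seen in the traceback.

```python
# Made-up value: 0x8c is a UTF-8 continuation byte, so it cannot start a character.
value = b"\x12\x8c\x18\x1a"

message = None
try:
    value.decode("utf-8")
except UnicodeDecodeError as exc:
    # Capture the error text for inspection.
    message = str(exc)

print(message)
```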
   
   ### Environment
   
   - superset version: `Superset 0.28.1`
   - python version: `Python 3.6.8`
   
   
