Posted to dev@superset.apache.org by gi...@git.apache.org on 2017/08/02 09:59:18 UTC
[GitHub] playcat opened a new issue #3226: Issues running a Hive query - on aggregated data
URL: https://github.com/apache/incubator-superset/issues/3226
I'm seeing very strange results on my virtual machine (could it be a resource issue?) when running a query that shows aggregated visits and revenue from one of our server logs. Everything works fine when I run it without aggregation, so it's not a connection issue or anything similar. The same query works fine when run from tools like SQL Developer.
### Superset version
Superset 0.18.5
### Expected results
2017-07-21 100 11.002
2017-07-21 321 21.21
2017-07-23 21 0.32
2017-07-24 212 14.44
(not real data)
### Actual results
The following appears in the web interface, where the data should be:
{"status": "failed", "query_id": 42, "error_essage": ""}
In the terminal (shell) output:
2017-08-02 11:43:40,916:INFO:pyhive.hive:SELECT
date,
SUM(1) AS visits,
SUM(price) AS revenue
FROM
visits
WHERE
date BETWEEN '2017-07-21' AND '2017-07-24'
AND country_code = 'US'
AND status = 'OK'
GROUP BY
date
ORDER BY
date
2017-08-02 11:43:40,916:DEBUG:pyhive.hive:TExecuteStatementReq(confOverlay=None, sessionHandle=TSessionHandle(sessionId=THandleIdentifier(secret='C\xde\x8c\x00\x9au@\xe6\xb0\xe9\x9d\xd6\xf9\xd7\xd9\xc7', guid=' a\xff\x92J\xfa@\xe5\x9d\x942\x8b\x19\xb3\x05#')), runAsync=True, statement=u"SELECT \n date,\n SUM(1) AS visits,\n SUM(price) AS revenue\nFROM\n visits \nWHERE\n date BETWEEN '2017-07-21' AND '2017-07-24'\n AND country_code = 'US'\n AND status = 'OK'\nGROUP BY\n date\nORDER BY \n date")
2017-08-02 11:43:41,051:INFO:root:[stats_logger] (incr) queries
2017-08-02 11:43:41,090:DEBUG:pyhive.hive:TExecuteStatementResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), operationHandle=TOperationHandle(hasResultSet=True, modifiedRowCount=None, operationType=0, operationId=THandleIdentifier(secret='e\x93\xc5\xc0F\xa1F\xa4\xb98X6\xec\xbc\x00\xcc', guid='\x16\xe8\xa7\xb0--Nu\xa9\x83W\x1c\xf4\x06Py')))
2017-08-02 11:43:41,090:INFO:root:Handling cursor
2017-08-02 11:43:43,067:INFO:root:[stats_logger] (incr) queries
2017-08-02 11:43:45,058:INFO:root:[stats_logger] (incr) queries
2017-08-02 11:43:46,262:DEBUG:pyhive.hive:TGetOperationStatusResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), operationState=1, errorMessage=None, sqlState=None, errorCode=None)
2017-08-02 11:43:46,607:DEBUG:pyhive.hive:TFetchResultsResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), results=TRowSet(rows=[], columns=[TColumn(i32Val=None, byteVal=None, i16Val=None, i64Val=None, stringVal=TStringColumn(nulls='\x00', values=[]), boolVal=None, doubleVal=None, binaryVal=None)], startRowOffset=0), hasMoreRows=False)
2017-08-02 11:43:46,608:ERROR:root:
Traceback (most recent call last):
File "/home/runner/venv/local/lib/python2.7/site-packages/superset/sql_lab.py", line 182, in execute_sql
db_engine_spec.handle_cursor(cursor, query, session)
File "/home/runner/venv/local/lib/python2.7/site-packages/superset/db_engine_specs.py", line 726, in handle_cursor
resp = cursor.fetch_logs()
File "/home/runner/venv/local/lib/python2.7/site-packages/superset/db_engines/hive.py", line 34, in fetch_logs
response.results.rows, 'expected data in columnar format'
AssertionError
### Steps to reproduce
1. Select a Hive connector
2. Type the query shown above
3. Run the query
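
To help narrow down whether the failure comes from Hive or from Superset, the same query could be run through PyHive directly, outside of Superset. The following is only a minimal sketch under assumptions: `run_query` and `open_hive_cursor` are hypothetical helper names that are not part of Superset, and the host, port, and username are placeholders that would need to match the actual HiveServer2 setup.

```python
# Hypothetical standalone check; none of this is Superset code.

# The aggregated query from the log output above.
QUERY = """
SELECT
  date,
  SUM(1) AS visits,
  SUM(price) AS revenue
FROM
  visits
WHERE
  date BETWEEN '2017-07-21' AND '2017-07-24'
  AND country_code = 'US'
  AND status = 'OK'
GROUP BY
  date
ORDER BY
  date
"""


def run_query(cursor, sql=QUERY):
    """Execute sql on any DB-API 2.0 cursor and return all rows as a list."""
    cursor.execute(sql)
    return list(cursor.fetchall())


def open_hive_cursor(host, port=10000, username=None):
    """Open a PyHive cursor; host, port and username are placeholders."""
    # Imported lazily so run_query itself has no PyHive dependency.
    from pyhive import hive
    return hive.connect(host=host, port=port, username=username).cursor()
```

For example, `rows = run_query(open_hive_cursor("hive.example.com"))`, where the hostname is made up. If rows come back this way, that would suggest (but not prove) that the problem sits in the log-polling path shown in the traceback (`handle_cursor` calling `fetch_logs`) rather than in the query itself.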