Posted to dev@superset.apache.org by gi...@git.apache.org on 2017/08/02 09:59:18 UTC

[GitHub] playcat opened a new issue #3226: Issues running a Hive query - on aggregated data

URL: https://github.com/apache/incubator-superset/issues/3226
 
 
   I'm seeing very strange results on my virtual machine (could it be a resource problem?) when running a query that shows aggregated visits and revenue from one of our server logs. Everything works fine when I run it without aggregation, so it is not a connection issue or anything similar. The same query also works fine when run from tools like SQL Developer.
   
   ### Superset version
   Superset 0.18.5
   
   ### Expected results
   2017-07-21 100 11.002
   2017-07-21 321 21.21
   2017-07-23 21 0.32
   2017-07-24 212 14.44
   
   (not real data)
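
To make the expected shape concrete (one row per date with a visit count and a revenue total), here is a minimal sketch that runs the same statement as in the log output against an in-memory SQLite table. SQLite is only a stand-in for Hive here, and the table layout, column types and `SAMPLE_ROWS` are made-up assumptions, not data from the real `visits` table.

```python
import sqlite3

# Same statement as in the report; SQLite stands in for Hive here.
QUERY = """
SELECT
    date,
    SUM(1) AS visits,
    SUM(price) AS revenue
FROM
    visits
WHERE
    date BETWEEN '2017-07-21' AND '2017-07-24'
    AND country_code = 'US'
    AND status = 'OK'
GROUP BY
    date
ORDER BY
    date
"""

# Hypothetical rows: (date, country_code, status, price).
SAMPLE_ROWS = [
    ("2017-07-21", "US", "OK", 1.5),
    ("2017-07-21", "US", "OK", 2.5),
    ("2017-07-22", "US", "FAILED", 9.0),  # filtered out by status
    ("2017-07-23", "DE", "OK", 4.0),      # filtered out by country_code
    ("2017-07-24", "US", "OK", 3.0),
    ("2017-07-25", "US", "OK", 7.0),      # outside the date range
]


def run_sample():
    """Load the sample rows into SQLite and return the aggregated result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE visits "
            "(date TEXT, country_code TEXT, status TEXT, price REAL)"
        )
        conn.executemany("INSERT INTO visits VALUES (?, ?, ?, ?)", SAMPLE_ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_sample():
        print(row)
```

With these sample rows only 2017-07-21 and 2017-07-24 survive the filters, so the result has two rows.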
   
   ### Actual results
   The following appears in the web interface where the data should be:
   {"status": "failed", "query_id": 42, "error_essage": ""}
   In the terminal (shell) output:
   2017-08-02 11:43:40,916:INFO:pyhive.hive:SELECT 
       date,
       SUM(1) AS visits,
       SUM(price) AS revenue
   FROM
       visits 
   WHERE
       date BETWEEN '2017-07-21' AND '2017-07-24'
       AND country_code = 'US'
       AND status = 'OK'
   GROUP BY
       date
   ORDER BY 
       date
   2017-08-02 11:43:40,916:DEBUG:pyhive.hive:TExecuteStatementReq(confOverlay=None, sessionHandle=TSessionHandle(sessionId=THandleIdentifier(secret='C\xde\x8c\x00\x9au@\xe6\xb0\xe9\x9d\xd6\xf9\xd7\xd9\xc7', guid=' a\xff\x92J\xfa@\xe5\x9d\x942\x8b\x19\xb3\x05#')), runAsync=True, statement=u"SELECT \n    date,\n    SUM(1) AS visits,\n    SUM(price) AS revenue\nFROM\n    visits \nWHERE\n    date BETWEEN '2017-07-21' AND '2017-07-24'\n    AND country_code = 'US'\n    AND status = 'OK'\nGROUP BY\n    date\nORDER BY \n    date")
   2017-08-02 11:43:41,051:INFO:root:[stats_logger] (incr) queries
   2017-08-02 11:43:41,090:DEBUG:pyhive.hive:TExecuteStatementResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), operationHandle=TOperationHandle(hasResultSet=True, modifiedRowCount=None, operationType=0, operationId=THandleIdentifier(secret='e\x93\xc5\xc0F\xa1F\xa4\xb98X6\xec\xbc\x00\xcc', guid='\x16\xe8\xa7\xb0--Nu\xa9\x83W\x1c\xf4\x06Py')))
   2017-08-02 11:43:41,090:INFO:root:Handling cursor
   2017-08-02 11:43:43,067:INFO:root:[stats_logger] (incr) queries
   2017-08-02 11:43:45,058:INFO:root:[stats_logger] (incr) queries
   2017-08-02 11:43:46,262:DEBUG:pyhive.hive:TGetOperationStatusResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), operationState=1, errorMessage=None, sqlState=None, errorCode=None)
   2017-08-02 11:43:46,607:DEBUG:pyhive.hive:TFetchResultsResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), results=TRowSet(rows=[], columns=[TColumn(i32Val=None, byteVal=None, i16Val=None, i64Val=None, stringVal=TStringColumn(nulls='\x00', values=[]), boolVal=None, doubleVal=None, binaryVal=None)], startRowOffset=0), hasMoreRows=False)
   2017-08-02 11:43:46,608:ERROR:root:
   Traceback (most recent call last):
     File "/home/runner/venv/local/lib/python2.7/site-packages/superset/sql_lab.py", line 182, in execute_sql
       db_engine_spec.handle_cursor(cursor, query, session)
     File "/home/runner/venv/local/lib/python2.7/site-packages/superset/db_engine_specs.py", line 726, in handle_cursor
       resp = cursor.fetch_logs()
     File "/home/runner/venv/local/lib/python2.7/site-packages/superset/db_engines/hive.py", line 34, in fetch_logs
       response.results.rows, 'expected data in columnar format'
   AssertionError
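
One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: the `AssertionError` carries no message, and the source line shown holds both the condition and the message string. That pattern is what Python produces when an `assert` is written on a parenthesised tuple, because a non-empty tuple is always truthy. The sketch below only demonstrates that language behaviour. `RowSet`, `Response` and both check functions are hypothetical stand-ins, not the code in `superset/db_engines/hive.py`.

```python
from collections import namedtuple

# Hypothetical stand-ins for the Thrift response seen in the debug log.
RowSet = namedtuple("RowSet", ["rows", "columns"])
Response = namedtuple("Response", ["results"])


def check_tuple_form(response):
    # The assert tests `not (rows, message)`. A 2-tuple is always truthy,
    # so this fails for every response and the message is never used.
    assert not (
        response.results.rows, 'expected data in columnar format'
    )


def check_message_form(response):
    # Condition and message are separate: fails only when rows is non-empty.
    assert not response.results.rows, 'expected data in columnar format'


def outcome(check, response):
    """Run a check and describe what happened as a short string."""
    try:
        check(response)
        return "passed"
    except AssertionError as exc:
        return "AssertionError(%r)" % (str(exc),)


# Same shape as the logged response: no rows, one empty column.
empty = Response(results=RowSet(rows=[], columns=[[]]))

if __name__ == "__main__":
    print("tuple form:  ", outcome(check_tuple_form, empty))
    print("message form:", outcome(check_message_form, empty))
```

If the installed `fetch_logs` uses the tuple form, the failure would be independent of the query result. That would not by itself explain why the non-aggregated query succeeds, so checking the installed file around line 34 is the way to confirm or rule this out.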
   
   
   ### Steps to reproduce
   1. Select a Hive connection
   2. Type the query
   3. Run the query
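
To narrow down whether the failure comes from Hive itself or from Superset's cursor handling, the same statement could also be run through PyHive alone, outside Superset. The following is a minimal sketch under assumptions: `run_directly` is a hypothetical helper, and the host, port and username are placeholders to be replaced with the values of the real connection.

```python
def run_directly(sql, connect=None, **connect_kwargs):
    """Execute `sql` on a DB-API connection and return all fetched rows.

    `connect` defaults to PyHive's `hive.connect`; passing another
    callable allows the helper to be exercised without a Hive server.
    """
    if connect is None:
        # Imported lazily so this module loads even where PyHive is absent.
        from pyhive import hive
        connect = hive.connect

    conn = connect(**connect_kwargs)
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        # Always release the session, even if execute or fetch raises.
        conn.close()


if __name__ == "__main__":
    # Placeholder connection details, not taken from the report.
    rows = run_directly(
        "SELECT date, SUM(1) AS visits FROM visits GROUP BY date",
        host="hive.example.internal",
        port=10000,
        username="superset",
    )
    for row in rows:
        print(row)
```

If this returns the aggregated rows, the Hive side is working and the problem is more likely in the log-polling step that the traceback points at.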
   
 