Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2022/01/13 12:35:38 UTC

[GitHub] [superset] simonvanderveldt opened a new issue #18023: Queries are changed before being sent to spark/Spark hints are stripped from query

simonvanderveldt opened a new issue #18023:
URL: https://github.com/apache/superset/issues/18023


   We're running Superset on top of the Spark Thriftserver and noticed that the queries we enter are modified before they are sent to Spark. For example, [Spark hints](https://spark.apache.org/docs/3.2.0/sql-ref-syntax-qry-select-hints.html) that we include in a query in Superset never reach the Spark Thriftserver. Running the same query against the Spark Thriftserver with beeline works as expected: the hints show up in the Spark Thriftserver and are applied to the execution plan.
   
   ### How to reproduce the bug
   - Run Superset on top of the Spark Thriftserver, configured with a URI of the form `hive://hive@{hostname}:{port}/{database}`, as described in the docs: https://superset.apache.org/docs/databases/spark-sql
   - Run a query like
   ```
   SELECT /*+ BROADCAST(table2) */ table1.col
   FROM table1
   JOIN table2
     ON table1.key = table2.key
   ```
   - Check the query in the SQL tab of the Spark Thriftserver and notice that the `/*+ BROADCAST(table2) */` part is missing
   - Check the execution plan and notice that the broadcast isn't there
   - Repeat the steps with beeline instead of Superset and see the hint being applied
   
   ### Expected results
   Queries are sent to the underlying query engine unchanged.
   
   ### Actual results
   The query is changed before being sent to the underlying query engine, and an important part of it (the hint) is lost.
   I briefly searched the code to find out why and where this happens, but couldn't find it.
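
Spark hints use block-comment syntax (`/*+ ... */`), so any layer that strips SQL comments before execution would remove them too. The sketch below only illustrates that mechanism; it is not taken from the Superset code base, and whether Superset does something like this has not been verified here. The function names (`strip_all_comments`, `strip_comments_keep_hints`, `find_hints`) are hypothetical, and the regexes are deliberately naive: they do not account for comment markers inside string literals.

```python
import re

# Matches any /* ... */ block comment, which includes hints like /*+ BROADCAST(t) */.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
# Matches only hint-style comments, which start with "/*+".
HINT = re.compile(r"/\*\+.*?\*/", re.DOTALL)


def strip_all_comments(sql: str) -> str:
    """Hypothetical naive sanitizer: drops every block comment, hints included."""
    without_comments = BLOCK_COMMENT.sub("", sql)
    # Collapse the runs of spaces left behind by removed comments.
    return re.sub(r"[ \t]+", " ", without_comments).strip()


def strip_comments_keep_hints(sql: str) -> str:
    """Drops ordinary block comments but leaves /*+ ... */ hints in place."""
    def keep_if_hint(match: re.Match) -> str:
        text = match.group(0)
        return text if text.startswith("/*+") else ""

    return BLOCK_COMMENT.sub(keep_if_hint, sql)


def find_hints(sql: str) -> list:
    """Returns the hint comments found in the SQL text, in order."""
    return HINT.findall(sql)


if __name__ == "__main__":
    # Sample query adapted from the reproduction steps; table names are placeholders.
    query = (
        "SELECT /*+ BROADCAST(table2) */ table1.col "
        "FROM table1 JOIN table2 ON table1.key = table2.key"
    )
    print(find_hints(query))
    print(find_hints(strip_all_comments(query)))
    print(find_hints(strip_comments_keep_hints("/* note */ " + query)))
```

Comparing `find_hints` on the text entered in Superset with the text shown in the SQL tab of the Spark Thriftserver is a quick way to confirm that the hint was dropped somewhere in between.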
   
   ### Environment
   - Superset 1.3.2
   - Spark 3.2.0


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


