You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2019/02/02 19:23:44 UTC

[GitHub] agrawaldevesh edited a comment on issue #6719: Adding support for Pinot

agrawaldevesh edited a comment on issue #6719: Adding support for Pinot
URL: https://github.com/apache/incubator-superset/pull/6719#issuecomment-459944452
 
 
   @betodealmeida, Thanks for the comments and the suggestions. I tried them but it won't work :(
   
   The problem is that Pinot does not like having grouping expressions in the select clause. A query like this will give some PQL NullPointerException:
   
   SELECT a as A, 
          DATETIMECONVERT(t, "1:SECONDS:EPOCH", "1:SECONDS:EPOCH", "1:DAYS") as __timestamp,
          count(1) as count
   FROM XXX
   WHERE t >= 1546382757
     AND t <= 1549061157
     AND state = 'FAILED'
     AND eventType = 'complete'
   GROUP BY a,
            DATETIMECONVERT(t, "1:SECONDS:EPOCH", "1:SECONDS:EPOCH", "1:DAYS")
   TOP 10. 
   
   It is okay with the "select a as A" part but not by the time function in the select clause. ie, I have to rewrite this query like this:
   
   SELECT a as A,
          count(1) as count
   FROM XXX
   WHERE t >= 1546382757
     AND t <= 1549061157
     AND state = 'FAILED'
     AND eventType = 'complete'
   GROUP BY a,
            DATETIMECONVERT(t, "1:SECONDS:EPOCH", "1:SECONDS:EPOCH", "1:DAYS")
   TOP 10. 
   
   But note that we want the timestamp field to show up as a special column name __timestamp in the visualization, and the group-by cannot take a column alias (ie I cannot do "group by a, blah as foo"). 
   
   Which means that I somehow need to tell the visualization code which column is __timestamp and stamp that in so.
   
   I could hack this in another way by keeping track of the __timestamp column only and stamping that in, but no matter what I do, I need some way to "rename" something in the returned dataframe. 
   
   The real way to fix this would be to fix this in Pinot itself. Were it not for this parse error, we can have the "select X as Y" in the query and then use the moz-sql-parser to get the aliases and stamp them back. But as it stands now, this is not this is not doable :(. (I can work on that on the pinot side that would take a while to get pushed through and I think that some pinot support rather than none is better until then)
   
   I decided against creating a PQL string that I then reparse and change under the covers before giving it to Pinot: I would like the generated query to be something that the user can take it and directly do a REST query to Pinot and it should work. It shouldn't be a Pinot-QL dialect that is private to Superset. 
   
   The labels_expected aspect is local to the visualization logic: it expects the dataframe to have the __timestamp column to identify the time dimension. Alternatively, we can enforce that the time column is always the first column or such but that would be a bigger change everywhere. 
   
   Now, as for the Explore to SQL-Lab exporting is concerned: All that will happen here is that the column names in the SQL-lab will show up as different from what the visualization wants them to be. I think that is okay ? The user can always choose not to expose the Pinot database in the SQL-Lab if he finds that annoying. We can say in the docs that the pinot support is experimental and clearly state the issues with the SQL-lab column naming issue.
   
   Thanks for the review and let me know if you have a suggestion out of this conundrum.
   
   I removed the hack for the num_metrics_expr and instead am fixing it in the pinot-dbapi in this PR: https://github.com/betodealmeida/pinot-dbapi/pull/4

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org