Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2019/05/08 17:18:27 UTC

[GitHub] [incubator-superset] betodealmeida commented on issue #7416: feat: Scheduling queries from SQL Lab

betodealmeida commented on issue #7416: feat: Scheduling queries from SQL Lab
URL: https://github.com/apache/incubator-superset/pull/7416#issuecomment-490573647
 
 
   Hi, @vnourdin!
   
   On the Superset side, I tried to make this as agnostic as possible. The example config is meant for Airflow, but by changing the config you can add any metadata you need to successfully schedule a query.
   
   At Lyft we're prototyping this with Hive, and later we'll add support for Presto. We're also planning to add an option to upload the data to our Druid database. My co-worker @argentfalcon wrote the Airflow pipeline, and I'll see if he can share it with you.
   
   For our proof of concept, we expect the user to provide a query filtered by `ds` and a table name. The table will be created if it does not exist, partitioned by `ds`. Every day we run the query and insert the results into the corresponding partition, incrementally building the results. For now, this is the only supported workflow.
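   
   As a rough illustration of that daily run (not the actual Lyft pipeline), here is a minimal sketch that builds the two Hive statements. The helper name `build_incremental_statements`, the explicit `columns` argument and the `{{ ds }}` placeholder inside the user query are all assumptions made for the example.

```python
from datetime import date


def build_incremental_statements(table, columns, query, ds):
    """Return the Hive statements for one daily run (hypothetical helper).

    `columns` is a list of (name, type) pairs for the result columns, and
    `query` is assumed to reference the partition date as `{{ ds }}`.
    """
    ds_str = ds.isoformat()
    column_defs = ", ".join(f"{name} {type_}" for name, type_ in columns)
    # Create the output table on the first run only, partitioned by `ds`.
    create = (
        f"CREATE TABLE IF NOT EXISTS {table} ({column_defs}) "
        f"PARTITIONED BY (ds STRING)"
    )
    # Write this day's results into its own partition.
    insert = (
        f"INSERT OVERWRITE TABLE {table} PARTITION (ds = '{ds_str}') "
        + query.replace("{{ ds }}", ds_str)
    )
    return [create, insert]


statements = build_incremental_statements(
    "tmp.daily_counts",
    [("n", "BIGINT")],
    "SELECT COUNT(*) AS n FROM logs WHERE ds = '{{ ds }}'",
    date(2019, 5, 8),
)
```

   Using `INSERT OVERWRITE` on a single partition keeps a re-run of the same day idempotent, which is one reasonable choice for this kind of incremental build.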
   
   We also support (or will support) depending on a daily partition of a Hive table or on another Airflow DAG. The way we do that is by adding dependencies (see the modal) like `hive://schema.table` or `airflow://dag_id`.
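   
   To make the dependency format concrete, here is a small sketch of how strings like these could be parsed on the pipeline side. The function `parse_dependency` and the shape of the returned dictionaries are assumptions for illustration; only the two URI forms come from the description above.

```python
from urllib.parse import urlparse


def parse_dependency(uri):
    """Split a dependency URI into its parts (hypothetical helper)."""
    parsed = urlparse(uri)
    target = parsed.netloc
    if parsed.scheme == "hive":
        # Expect `hive://schema.table`.
        schema, sep, table = target.partition(".")
        if not sep or not schema or not table:
            raise ValueError(f"expected hive://schema.table, got {uri!r}")
        return {"type": "hive", "schema": schema, "table": table}
    if parsed.scheme == "airflow":
        # Expect `airflow://dag_id`.
        if not target:
            raise ValueError(f"expected airflow://dag_id, got {uri!r}")
        return {"type": "airflow", "dag_id": target}
    raise ValueError(f"unsupported dependency scheme in {uri!r}")


hive_dep = parse_dependency("hive://core.events")
dag_dep = parse_dependency("airflow://daily_rollup")
```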
   
   It should be easy to support different workflows. You could add a checkbox asking the user if this is a full or incremental query, for example, and the Airflow DAG would work differently depending on the value.
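   
   A sketch of how such a checkbox value might change the write step, assuming the scheduling metadata reaches the pipeline as a dictionary. The keys `output_table` and `incremental`, and the function name `insert_clause`, are hypothetical and not part of the example config.

```python
def insert_clause(schedule_info, ds):
    """Pick the Hive INSERT clause for one run (hypothetical metadata keys)."""
    table = schedule_info["output_table"]
    if schedule_info.get("incremental", True):
        # Incremental: only the partition for this run's date is rewritten.
        return f"INSERT OVERWRITE TABLE {table} PARTITION (ds = '{ds}')"
    # Full: the whole (unpartitioned) table is rewritten on every run.
    return f"INSERT OVERWRITE TABLE {table}"


incremental = insert_clause({"output_table": "tmp.report"}, "2019-05-08")
full = insert_clause({"output_table": "tmp.report", "incremental": False}, "2019-05-08")
```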
   
   Hope this helps!

