Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2021/09/08 07:57:35 UTC

[GitHub] [superset] graceguo-supercat opened a new issue #16631: [time comparison][perf] Duplicated queries from time comparison chart

graceguo-supercat opened a new issue #16631:
URL: https://github.com/apache/superset/issues/16631


   **Describe the solution you'd like**
   If I add a time comparison to a time-series chart, Superset sends 2 queries to the query engine and executes them in sequence. For example:
   
   1. Add a `52-weeks` TIME SHIFT to the `Growth Rate` sample chart:
   <img width="1249" alt="Screen Shot 2021-09-08 at 12 44 48 AM" src="https://user-images.githubusercontent.com/27990562/132467865-f9298964-22aa-4258-b1e7-22d5ffa2cf04.png">
   2. Add some logging at [this line](https://github.com/apache/superset/blob/420dd5b94a6c38192208c39bad899aeaa6bb6dbb/superset/models/core.py#L413), and you can see that Superset sends 2 queries to the engine:
   <img width="1216" alt="Screen Shot 2021-09-08 at 12 49 13 AM" src="https://user-images.githubusercontent.com/27990562/132468879-a8edb933-d54c-4fd2-805a-46a792628aad.png">
   
   The 2 queries are identical except for their time ranges. They are sent in sequence, so the query engine executes them one by one, which means the total query time for a `Time comparison` chart is doubled.
   
   
   Is it possible to generate a single query for the `Time Comparison` feature? Or is it possible to send the 2 queries in parallel, so that the total query time for this chart could be cut in half?
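
To make the parallel option concrete, here is a minimal sketch that uses a thread pool. It is not Superset code: `run_query`, the SQL strings, and the table name `t` are hypothetical stand-ins, and the sketch assumes the database driver can safely be called from multiple threads.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_query(sql: str) -> str:
    """Hypothetical stand-in for a blocking call to the query engine."""
    time.sleep(0.2)  # simulate engine latency
    return f"result of: {sql}"


def run_sequential(queries):
    # One query after another: total time is roughly the sum of the latencies.
    return [run_query(q) for q in queries]


def run_parallel(queries):
    # One worker per query: total time is roughly the slowest single query.
    # pool.map() returns results in the same order as the input.
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        return list(pool.map(run_query, queries))


# Hypothetical queries: identical except for the time range.
QUERIES = [
    "SELECT ds, SUM(m) FROM t WHERE ds >= '2021-01-01' AND ds < '2021-09-01' GROUP BY ds",
    "SELECT ds, SUM(m) FROM t WHERE ds >= '2020-01-03' AND ds < '2020-09-02' GROUP BY ds",
]

if __name__ == "__main__":
    for runner in (run_sequential, run_parallel):
        start = time.perf_counter()
        runner(QUERIES)
        print(runner.__name__, f"{time.perf_counter() - start:.2f}s")
```

Whether this helps in practice depends on the engine: if the database itself queues or serializes the two queries, the wall-clock saving may be smaller than half.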
   
   
   cc @villebro @ktmud 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] villebro commented on issue #16631: [time comparison][perf] Duplicated queries from time comparison chart

Posted by GitBox <gi...@apache.org>.
villebro commented on issue #16631:
URL: https://github.com/apache/superset/issues/16631#issuecomment-915025244


   I consider issuing multiple queries (rather than one query) a feature, not a bug. While it would be preferable from a perf perspective to pull in just one set of data and then split it up in the backend or frontend, that is a slippery slope toward introducing database functionality into Superset. We did quite a bit of iterating on this with @zhaoyongjie, and I believe we both felt fairly strongly that we should keep each time comparison query separate.
   
   Having said that, we can definitely look into issuing those queries simultaneously, and I think it's a great idea. However, I believe we first need to bump SQLAlchemy to 1.4, as the new async functionality is required for this: https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.html. So some fairly extensive refactoring would probably be required. But it's an exciting feature that I'd personally be interested in adding support for.
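
As a rough illustration of what issuing the queries simultaneously could look like with `asyncio`, here is a minimal sketch. It does not use SQLAlchemy or Superset at all: `fetch` is a hypothetical coroutine that stands in for an awaitable query call, and the labels are made up for the example.

```python
import asyncio


async def fetch(label: str, delay: float = 0.1) -> dict:
    """Hypothetical stand-in for an awaitable query call."""
    await asyncio.sleep(delay)  # simulate waiting on the query engine
    return {"label": label, "rows": []}


async def fetch_comparison(labels):
    # gather() runs the coroutines concurrently and returns
    # their results in the same order as the inputs.
    return await asyncio.gather(*(fetch(label) for label in labels))


if __name__ == "__main__":
    results = asyncio.run(fetch_comparison(["current", "52 weeks ago"]))
    print([r["label"] for r in results])
```

Each time comparison query stays separate here; only the waiting overlaps.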
   
   Ping @john-bodley 

