Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2018/11/26 12:41:59 UTC

[GitHub] frankfarrell opened a new issue #6434: Support per table time grain functions to improve performance

URL: https://github.com/apache/incubator-superset/issues/6434
 
 
   Currently, if you set the time grain to P1D, Superset wraps the timestamp column in a DATE_PART function call. For big tables, grouping by an on-demand function like this is a big performance hit. In a typical star schema, tables might already have these time granularities as columns, e.g. a column date_dimension or date_hour_dimension. Grouping on a column like that, especially if it is indexed, will give a massive performance improvement.
   
   In particular, for Redshift, if such a date column is part of the sort key, queries will run much faster than when grouping by DATE_PART.
   
   I'm proposing a way to specify, per table, the column or function used for each time granularity, overriding the default function specified here:
   https://github.com/apache/incubator-superset/blob/74f0817bf0e3469c27df0f49a6b29fa0a71e3c9b/superset/db_engine_specs.py#L403
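
To make the proposal concrete, here is a minimal sketch of the lookup I have in mind. None of these names exist in Superset: `DEFAULT_TIME_GRAINS`, `time_grain_expression` and `table_overrides` are hypothetical, and the DATE_TRUNC templates are only placeholders for whatever the engine spec defines today.

```python
from typing import Dict, Optional

# Hypothetical engine-level defaults: time grain -> SQL template applied to
# the time column. Placeholder expressions, not copied from Superset.
DEFAULT_TIME_GRAINS: Dict[str, str] = {
    "PT1H": "DATE_TRUNC('hour', {col})",
    "P1D": "DATE_TRUNC('day', {col})",
}


def time_grain_expression(
    time_col: str,
    grain: str,
    table_overrides: Optional[Dict[str, str]] = None,
) -> str:
    """Return the SQL expression to group by for the given time grain.

    A per-table override (for example a precomputed, indexed column) takes
    precedence over the engine default when one is configured for the grain.
    """
    if table_overrides and grain in table_overrides:
        return table_overrides[grain]
    return DEFAULT_TIME_GRAINS[grain].format(col=time_col)


# Hypothetical per-table configuration for a star-schema fact table.
overrides = {"P1D": "date_dimension", "PT1H": "date_hour_dimension"}

print(time_grain_expression("ts", "P1D"))             # engine default
print(time_grain_expression("ts", "P1D", overrides))  # precomputed column
```

Grains without an override fall back to the engine default, so existing tables would behave exactly as they do now.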
