You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2019/05/22 23:43:39 UTC

[GitHub] [incubator-superset] john-bodley edited a comment on issue #6360: [SIP-15] Transparent and Consistent Time Intervals

john-bodley edited a comment on issue #6360: [SIP-15] Transparent and Consistent Time Intervals
URL: https://github.com/apache/incubator-superset/issues/6360#issuecomment-495014185
 
 
   ### TL;DR
   
   After researching other BI tools and surveying content creators we propose to address SIP-15 via: 
   
   1. Add support for both dates and timestamps.
   2. Make _all_ temporal intervals be `[start, end)`.
   3. Make _all_ temporal comparison use native date/timestamp as opposed to lexicographical string comparison.
   
   #### 1. Dates and timestamps
   
   Currently Superset treats every temporary field as as timestamp regardless if the underlying data is a time, date, quarter, year etc. This adds some confusion when specify relative times as it's not apparent whether it is in reference to now or today.
   
   Given the most frequent temporal grains are dates and timestamps (date-time) there seems merit in calling these out _explicitly_ which is what Tableau, Looker, Microstrategy do. We suggest the following changes to the time section:
   
   ##### Dates
   
   - Make the smallest time grain "day".
   - Remove the HH:MM:SS from the time picker.
   - Make the relative temporal point be _today_ for all windows.†
   
   ##### Timestamps
   
   - Make the relative temporal point be _now_ for all windows.†
   
   † Note the relative temporal points are up for debate as there seems to be some iteration on this per [this](https://github.com/apache/incubator-superset/issues/6360#issuecomment-484994379) comment. Additionally there may be merit in trying to push down the _now_, _today_ logic to the underlying database so they can take advantage of any query caching mechanism. 
   
   #### 2. Temporal intervals
   
   Whilst researching other BI tools it was somewhat difficult to determine via the lack of documentation how they handled intervals. To the best of my knowledge it seems that Looker uses `[start, end)` whereas Tableau uses `[start, end]`. The SQL [`BETWEEN`](https://www.w3schools.com/sql/sql_between.asp) operator also works as `[start, end]`. 
   
   People generally think about dates as _inclusive_ start/end, i.e., if one were to read a sign "Sale ends Saturday" one would expect that the sale would be through close of business on Saturday as opposed to Friday. Times however are different and are often thought as having an _exclusive_ end, i.e., a meeting scheduled from 9 - 10 am really is `[9:00, 10:00)`.
   
   Given that we foresee time becoming more prevalent I believe it would be a mistake to make date intervals `[start, end]` and time intervals `[start, end)` as this would be quite confusing to the user and thus standardizing on one format is vital. I believe that `[start, end)` is the _best_ format because: 
   
   - It contains no gaps.
   - One does not need to know about the precision of the device, i.e., whether the clock has second, millisecond, or microsecond accuracy.
   
   For more detail see [this](https://stackoverflow.com/a/9810359) post. Returning to the inclusive dates this can actually be problematic for leap years, i.e., `[02-01, 02-28]` may not capture the entire month of February, i.e., there is a gap whereas `[02-01, 03-01)` is guaranteed to fully encapsulate the month. 
   
   Note this decision is not in agreement with the majority of the survey respondents, however it is worth noting that there was support for the `[start, end)` intervals when considering both dates and timestamps.
   
   We suggest the following changes:
   
   - Explicit call out in the UI how the temporal intervals work, i.e., _inclusive_ of the start, and exclusive of the _end_. 
   
   #### 3. Temporal Comparisons
   
   As mentioned in the original comment currently for SQL connectors we use a lexicographical string comparison for temporal fields which means that `[start, end]` actually behaves like `(start, end]` for dates. To remedy this _all_ temporal comparisons will cast both the _left_ and _right_ entities to either a date or timestamp depending on the underlying type. 
   
   ### Implementation 
   
   Changing the SQL connector from `[start, end]` (which again often behaves like `(start, end]`) to `[start, end)` is non-trivial, i.e., a simply migration is not possible given it's impossible to determine whether the producer knew about the discrepancies (and adjusted their time boundaries accordingly). Per the survey results only 21.7% of respondents were aware of the issue. Rather than performing a migration to address the issue we propose the following: 
   
   -  Apply the changes proposed above which are behind a feature flag (which is enabled by default).†
   - Provide a toast/banner in the UI which informs the user that temporal interval has changed. Only the producer(s) of the chart can dismiss the toast/banner once they have either i) confirmed that the chart is correct, or ii) modified the interval accordingly. 
   - Provide a link which will render the chart in its prior state providing producers will a visual diff. 
   - Provide a mechanism for disabling the toast/banner and visual diff after a specific date.
   
   † This results in additional code logic but it seems like the complexity is worthwhile given the impact of the change and the benefit in being able to see a visual diff of the change. Note this change will mostly impact charts where the time isn't explicitly shown (bar, pie, etc.). 
   
   cc: @betodealmeida @graceguo-supercat @michellethomas @mistercrunch @soboko @vylc @xtinec

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org