You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@superset.apache.org by ru...@apache.org on 2023/10/11 17:30:31 UTC

[superset] branch master updated: docs: Add timezone information (#19056)

This is an automated email from the ASF dual-hosted git repository.

rusackas pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/superset.git


The following commit(s) were added to refs/heads/master by this push:
     new 1f20214244 docs: Add timezone information (#19056)
1f20214244 is described below

commit 1f20214244f1916a389adf4c924a5f84ff1d8398
Author: John Bodley <45...@users.noreply.github.com>
AuthorDate: Wed Oct 11 10:30:23 2023 -0700

    docs: Add timezone information (#19056)
    
    Co-authored-by: Evan Rusackas <ev...@preset.io>
---
 docs/docs/miscellaneous/timezones.mdx | 50 +++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/docs/docs/miscellaneous/timezones.mdx b/docs/docs/miscellaneous/timezones.mdx
new file mode 100644
index 0000000000..cd091a33a5
--- /dev/null
+++ b/docs/docs/miscellaneous/timezones.mdx
@@ -0,0 +1,50 @@
+---
+title: Timezones
+hide_title: true
+sidebar_position: 1
+version: 1
+---
+
+## Timezones
+
+There are four distinct timezone components which relate to Apache Superset,
+
+1. The timezone that the underlying data is encoded in.
+2. The timezone of the database engine.
+3. The timezone of the Apache Superset backend.
+4. The timezone of the Apache Superset client.
+
+where if a temporal field (`DATETIME`, `TIME`, `TIMESTAMP`, etc.) does not explicitly define a timezone it defaults to the underlying timezone of the component.
+
+To help make the problem somewhat tractable—given that Apache Superset has no control on either how the data is ingested (1) or the timezone of the client (4)—from a consistency standpoint it is highly recommended that both (2) and (3) are configured to use the same timezone with a strong preference given to [UTC](https://en.wikipedia.org/wiki/Coordinated_Universal_Time) to ensure temporal fields without an explicit timestamp are not incorrectly coerced into the wrong timezone. Actually  [...]
+
+To strive for data consistency (regardless of the timezone of the client) the Apache Superset backend tries to ensure that any timestamp sent to the client has an explicit (or semi-explicit as in the case with [Epoch time](https://en.wikipedia.org/wiki/Unix_time) which is always in reference to UTC) timezone encoded within.
+
+The challenge however lies with the slew of [database engines](/docs/databases/installing-database-drivers#install-database-drivers) which Apache Superset supports and various inconsistencies between their [Python Database API (DB-API)](https://www.python.org/dev/peps/pep-0249/) implementations combined with the fact that we use [Pandas](https://pandas.pydata.org/) to read SQL into a DataFrame prior to serializing to JSON. Regrettably Pandas ignores the DB-API [type_code](https://www.pyt [...]
+
+For example the following is a comparison of MySQL and Presto,
+
+```
+import pandas as pd
+from sqlalchemy import create_engine
+
+pd.read_sql_query(
+    sql="SELECT TIMESTAMP('2022-01-01 00:00:00') AS ts",
+    con=create_engine("mysql://root@localhost:3360"),
+).to_json()
+
+pd.read_sql_query(
+    sql="SELECT TIMESTAMP '2022-01-01 00:00:00' AS ts",
+    con=create_engine("presto://localhost:8080"),
+).to_json()
+```
+
+which outputs `{"ts":{"0":1640995200000}}` (which infers the UTC timezone per the Epoch time definition) and `{"ts":{"0":"2022-01-01 00:00:00.000"}}` (without an explicit timezone) respectively and thus are treated differently in JavaScript:
+
+```
+new Date(1640995200000)
+> Sat Jan 01 2022 13:00:00 GMT+1300 (New Zealand Daylight Time)
+
+new Date("2022-01-01 00:00:00.000")
+> Sat Jan 01 2022 00:00:00 GMT+1300 (New Zealand Daylight Time)
+```