You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2021/09/17 11:33:46 UTC

[GitHub] [superset] aniaan opened a new issue #16726: Elasticsearch time_zone setting does not work for cast datetime expressions

aniaan opened a new issue #16726:
URL: https://github.com/apache/superset/issues/16726


   My time zone is CST, not UTC. Based on some best practices of ES, when we store data, we convert CST time to UTC time for storage, and the corresponding conversion is done by time zone when querying.
   When I was using superset, I found two problems
   
   1. elasticsearch-dbapi does not support passing the `time_zone` parameter, I tried to fix it and submitted a PR https://github.com/preset-io/elasticsearch-dbapi/pull/69,  currently waiting for a review, but I can fix it temporarily by pip install <my_github_repo>. It's equivalent to hitting a patch to circumvent the issue.
   2. Once the problem was solved, I found another problem, the time_zone I set didn't work for filtering `datetime type fields`, but it worked for `HISTOGRAM("@timestamp", INTERVAL 1 DAY)`, I found the reason by checking the ES-SQL documentation,  https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-search-api.html#sql-search-api-request-body
   
   > (Optional, string) ISO-8601 time zone ID for the search. **Several SQL date/time functions use this time zone**. Defaults to Z (UTC).
   
   **Several SQL date/time functions use this time zone** very important!
   
   superset on ES datatime field query, is the use of `CAST ('date' AS DATETIME)`, 
   
   https://github.com/apache/superset/blob/cc1c6c1bb60b6357348028209062772927fd39ca/superset/db_engine_specs/elasticsearch.py#L62-L65
   
    time_zone on `CAST` this way of conversion **does not work**, only date related functions work, to solve this problem, should use `DATETIME_PARSE ('date', '{pattern}')` to deal with, time_zone on this function works.
   
   Here are some examples of my steps to demonstrate this in kibana
   
   ```json
   
   PUT /library/_bulk?refresh
   {"index":{"_id": "Leviathan Wakes"}}
   {"name": "Leviathan Wakes", "author": "James S.A. Corey", "release_date": "2021-09-17T13:24:19Z", "page_count": 561}
   {"index":{"_id": "Hyperion"}}
   {"name": "Hyperion", "author": "Dan Simmons", "release_date": "2021-09-17T19:24:19Z", "page_count": 482}
   {"index":{"_id": "Dune"}}
   {"name": "Dune", "author": "Frank Herbert", "release_date": "2021-09-18T09:24:19Z", "page_count": 604}
   
   # UTC -> CST
   # 2021-09-17T13:24:19Z => 2021-09-17T21:24:19+08:00
   # 2021-09-17T19:24:19Z => 2021-09-18T03:24:19+08:00
   # 2021-09-18T09:24:19Z => 2021-09-18T17:24:19+08:00
   
   
   # should be return 2 ,but 1, **because time_zone does not work for CAST**
   GET /_sql?format=txt
   {
     "query": """
     SELECT * FROM "library" where release_date > CAST('2021-09-17T21:24:19' AS DATETIME)
     """,
     "time_zone": "Asia/Shanghai"
   }
   
       author     |     name      |  page_count   |        release_date         
   ---------------+---------------+---------------+-----------------------------
   Frank Herbert  |Dune           |604            |2021-09-18T17:24:19.000+08:00
   
   
   ## return 2, work!!
   
   GET /_sql?format=txt
   {
     "query": """
     SELECT * FROM "library" where release_date > DATETIME_PARSE('2021-09-17 21:24:19', 'yyyy-MM-dd HH:mm:ss')
     """,
     "time_zone": "Asia/Shanghai"
   }
   
       author     |     name      |  page_count   |        release_date         
   ---------------+---------------+---------------+-----------------------------
   Dan Simmons    |Hyperion       |482            |2021-09-18T03:24:19.000+08:00
   Frank Herbert  |Dune           |604            |2021-09-18T17:24:19.000+08:00
   
   ```
   
   The above example should have illustrated the problem, the essence is that CAST this way for the time zone settings do not take effect, this is important for non-UTC users, I think we need to solve it, at present I can think of is
   1. change to DATETIME_PARSE, but there is no such function before ES 7.8.
   2. superset database add global time_zone parameter, the datetime object for filtering should carry the corresponding time zone information, and the format should also carry the time zone information. This way I found by looking at the code that the scope is too wide and will affect other data sources.
   
   If we have to be compatible with users before ES7.8, a compromise solution is to create a separate db_engine_spec for ES7.8 and above, which can be implemented in `elasticsearch-dbapi` as a separate `sqlalchemy ESDialect`, the name of which may require some careful thought.
   
   new db_engine_spec
   ```python
   @classmethod
       def convert_dttm(cls, target_type: str, dttm: datetime) -> Optional[str]:
           if target_type.upper() == utils.TemporalType.DATETIME:
               return f"""DATETIME_PARSE('2019-10-13 00:08:00', 'yyyy-MM-dd HH:mm:ss')"""
           return None
   ```
   
   This is the solution I have thought of so far, what do you think?
   
   cc @dpgaspar 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] dpgaspar commented on issue #16726: Elasticsearch time_zone setting does not work for cast datetime expressions

Posted by GitBox <gi...@apache.org>.
dpgaspar commented on issue #16726:
URL: https://github.com/apache/superset/issues/16726#issuecomment-927668041


   I can, looking good, thank you @aniaan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] aniaan closed issue #16726: Elasticsearch time_zone setting does not work for cast datetime expressions

Posted by GitBox <gi...@apache.org>.
aniaan closed issue #16726:
URL: https://github.com/apache/superset/issues/16726


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] junlincc commented on issue #16726: Elasticsearch time_zone setting does not work for cast datetime expressions

Posted by GitBox <gi...@apache.org>.
junlincc commented on issue #16726:
URL: https://github.com/apache/superset/issues/16726#issuecomment-921995129


   thank you for the contribution! @aniaan 
   I'm not sure if you opened the PR in the correct repo? @dpgaspar can you confirm? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org