You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2021/02/15 18:48:48 UTC

[GitHub] [superset] villebro opened a new pull request #13138: fix(chart-data-api): support numeric temporal columns

villebro opened a new pull request #13138:
URL: https://github.com/apache/superset/pull/13138


   ### SUMMARY
   Some database connectors (et least the native Pinot connector) return a numeric value (either as second or millisecord epoch) instead of a regular timestamp. This was working correctly on `viz.py`, but support was lacking on `/api/v1/chart/data`. This centralizes the logic in a util, simplifies the code and adds tests that were previously missing.
   
   ### TEST PLAN
   CI + new tests
   
   ### ADDITIONAL INFORMATION
   <!--- Check any relevant boxes with "x" -->
   <!--- HINT: Include "Fixes #nnn" if you are fixing an existing issue -->
   - [ ] Has associated issue:
   - [ ] Changes UI
   - [ ] Requires DB Migration.
   - [ ] Confirm DB Migration upgrade and downgrade tested.
   - [ ] Introduces new feature or API
   - [ ] Removes existing feature or API
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] robdiciuccio commented on a change in pull request #13138: fix(chart-data-api): support numeric temporal columns

Posted by GitBox <gi...@apache.org>.
robdiciuccio commented on a change in pull request #13138:
URL: https://github.com/apache/superset/pull/13138#discussion_r578023801



##########
File path: superset/utils/core.py
##########
@@ -1579,3 +1580,34 @@ def format_list(items: Sequence[str], sep: str = ", ", quote: str = '"') -> str:
 def find_duplicates(items: Iterable[InputType]) -> List[InputType]:
     """Find duplicate items in an iterable."""
     return [item for item, count in collections.Counter(items).items() if count > 1]
+
+
+def normalize_dttm_col(
+    df: pd.DataFrame,
+    timestamp_format: Optional[str],
+    offset: int,
+    time_shift: Optional[timedelta],
+) -> pd.DataFrame:
+    if DTTM_ALIAS not in df.columns:
+        return df
+    df = df.copy()

Review comment:
       Thanks, @villebro. [DataFrame.copy()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.copy.html) is deep by default.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] codecov-io edited a comment on pull request #13138: fix(chart-data-api): support numeric temporal columns

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #13138:
URL: https://github.com/apache/superset/pull/13138#issuecomment-779444074


   # [Codecov](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=h1) Report
   > Merging [#13138](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=desc) (5ba5a73) into [master](https://codecov.io/gh/apache/superset/commit/2ce79823dfad61bce6196fcacd56a844f44818c0?el=desc) (2ce7982) will **increase** coverage by `13.73%`.
   > The diff coverage is `42.36%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/superset/pull/13138/graphs/tree.svg?width=650&height=150&src=pr&token=KsB0fHcx6l)](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #13138       +/-   ##
   ===========================================
   + Coverage   53.06%   66.79%   +13.73%     
   ===========================================
     Files         489      492        +3     
     Lines       17314    29026    +11712     
     Branches     4482        0     -4482     
   ===========================================
   + Hits         9187    19389    +10202     
   - Misses       8127     9637     +1510     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | cypress | `?` | |
   | python | `66.79% <42.36%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [superset/examples/birth\_names.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvZXhhbXBsZXMvYmlydGhfbmFtZXMucHk=) | `73.19% <ø> (ø)` | |
   | [...43f2fdb\_add\_granularity\_to\_charts\_where\_missing.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvbWlncmF0aW9ucy92ZXJzaW9ucy8wNzBjMDQzZjJmZGJfYWRkX2dyYW51bGFyaXR5X3RvX2NoYXJ0c193aGVyZV9taXNzaW5nLnB5) | `0.00% <0.00%> (ø)` | |
   | [...s/260bf0649a77\_migrate\_x\_dateunit\_in\_time\_range.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvbWlncmF0aW9ucy92ZXJzaW9ucy8yNjBiZjA2NDlhNzdfbWlncmF0ZV94X2RhdGV1bml0X2luX3RpbWVfcmFuZ2UucHk=) | `0.00% <0.00%> (ø)` | |
   | [...ons/versions/41ce8799acc3\_rename\_pie\_label\_type.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvbWlncmF0aW9ucy92ZXJzaW9ucy80MWNlODc5OWFjYzNfcmVuYW1lX3BpZV9sYWJlbF90eXBlLnB5) | `0.00% <0.00%> (ø)` | |
   | [superset/connectors/sqla/views.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvY29ubmVjdG9ycy9zcWxhL3ZpZXdzLnB5) | `62.43% <4.34%> (ø)` | |
   | [superset/views/datasource.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvdmlld3MvZGF0YXNvdXJjZS5weQ==) | `88.70% <16.66%> (ø)` | |
   | [superset/viz.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvdml6LnB5) | `58.61% <27.27%> (ø)` | |
   | [superset/charts/commands/exceptions.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvY2hhcnRzL2NvbW1hbmRzL2V4Y2VwdGlvbnMucHk=) | `92.85% <77.77%> (ø)` | |
   | [superset/utils/core.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvdXRpbHMvY29yZS5weQ==) | `88.26% <87.50%> (ø)` | |
   | [superset/db\_engine\_specs/trino.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvZGJfZW5naW5lX3NwZWNzL3RyaW5vLnB5) | `90.47% <90.47%> (ø)` | |
   | ... and [936 more](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=footer). Last update [2dbe92b...5ba5a73](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] villebro merged pull request #13138: fix(chart-data-api): support numeric temporal columns

Posted by GitBox <gi...@apache.org>.
villebro merged pull request #13138:
URL: https://github.com/apache/superset/pull/13138


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] villebro commented on a change in pull request #13138: fix(chart-data-api): support numeric temporal columns

Posted by GitBox <gi...@apache.org>.
villebro commented on a change in pull request #13138:
URL: https://github.com/apache/superset/pull/13138#discussion_r576593139



##########
File path: tests/utils_tests.py
##########
@@ -1131,3 +1133,20 @@ def test_extract_dataframe_dtypes(self):
 
         df = pd.DataFrame(data={col[0]: col[2] for col in cols})
         assert extract_dataframe_dtypes(df) == [col[1] for col in cols]
+
+    def test_normalize_dttm_col(self):
+        ts = pd.Timestamp(2021, 2, 15, 19, 0, 0, 0)
+        df = pd.DataFrame([{"__timestamp": ts, "a": 1}])
+
+        # test non-numeric format
+        assert normalize_dttm_col(df, None, 0, None)[DTTM_ALIAS][0] == ts
+        assert normalize_dttm_col(df, "epoch_ms", 0, None)[DTTM_ALIAS][0] == ts
+        assert normalize_dttm_col(df, "epoch_s", 0, None)[DTTM_ALIAS][0] == ts
+
+        # test numeric epoch_s format
+        df = pd.DataFrame([{"__timestamp": ts.timestamp(), "a": 1}])
+        assert normalize_dttm_col(df, "epoch_s", 0, None)[DTTM_ALIAS][0] == ts
+
+        # test numeric epoch_ms format
+        df = pd.DataFrame([{"__timestamp": ts.timestamp() * 1000, "a": 1}])
+        assert normalize_dttm_col(df, "epoch_ms", 0, None)[DTTM_ALIAS][0] == ts

Review comment:
       Thanks, I'll add these tests (they were outside the scope of this fix but good to have) 👍 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] robdiciuccio commented on a change in pull request #13138: fix(chart-data-api): support numeric temporal columns

Posted by GitBox <gi...@apache.org>.
robdiciuccio commented on a change in pull request #13138:
URL: https://github.com/apache/superset/pull/13138#discussion_r577981468



##########
File path: superset/utils/core.py
##########
@@ -1579,3 +1580,34 @@ def format_list(items: Sequence[str], sep: str = ", ", quote: str = '"') -> str:
 def find_duplicates(items: Iterable[InputType]) -> List[InputType]:
     """Find duplicate items in an iterable."""
     return [item for item, count in collections.Counter(items).items() if count > 1]
+
+
+def normalize_dttm_col(
+    df: pd.DataFrame,
+    timestamp_format: Optional[str],
+    offset: int,
+    time_shift: Optional[timedelta],
+) -> pd.DataFrame:
+    if DTTM_ALIAS not in df.columns:
+        return df
+    df = df.copy()

Review comment:
       Is the `copy` operation necessary here? This could be very expensive on memory for large result sets.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] amitmiran137 commented on a change in pull request #13138: fix(chart-data-api): support numeric temporal columns

Posted by GitBox <gi...@apache.org>.
amitmiran137 commented on a change in pull request #13138:
URL: https://github.com/apache/superset/pull/13138#discussion_r576586650



##########
File path: tests/utils_tests.py
##########
@@ -1131,3 +1133,20 @@ def test_extract_dataframe_dtypes(self):
 
         df = pd.DataFrame(data={col[0]: col[2] for col in cols})
         assert extract_dataframe_dtypes(df) == [col[1] for col in cols]
+
+    def test_normalize_dttm_col(self):
+        ts = pd.Timestamp(2021, 2, 15, 19, 0, 0, 0)
+        df = pd.DataFrame([{"__timestamp": ts, "a": 1}])
+
+        # test non-numeric format
+        assert normalize_dttm_col(df, None, 0, None)[DTTM_ALIAS][0] == ts
+        assert normalize_dttm_col(df, "epoch_ms", 0, None)[DTTM_ALIAS][0] == ts
+        assert normalize_dttm_col(df, "epoch_s", 0, None)[DTTM_ALIAS][0] == ts
+
+        # test numeric epoch_s format
+        df = pd.DataFrame([{"__timestamp": ts.timestamp(), "a": 1}])
+        assert normalize_dttm_col(df, "epoch_s", 0, None)[DTTM_ALIAS][0] == ts
+
+        # test numeric epoch_ms format
+        df = pd.DataFrame([{"__timestamp": ts.timestamp() * 1000, "a": 1}])
+        assert normalize_dttm_col(df, "epoch_ms", 0, None)[DTTM_ALIAS][0] == ts

Review comment:
       I'm missing 2 use-cases here:
   1. offeset
   2. timeshift
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] villebro commented on a change in pull request #13138: fix(chart-data-api): support numeric temporal columns

Posted by GitBox <gi...@apache.org>.
villebro commented on a change in pull request #13138:
URL: https://github.com/apache/superset/pull/13138#discussion_r578007300



##########
File path: superset/utils/core.py
##########
@@ -1579,3 +1580,34 @@ def format_list(items: Sequence[str], sep: str = ", ", quote: str = '"') -> str:
 def find_duplicates(items: Iterable[InputType]) -> List[InputType]:
     """Find duplicate items in an iterable."""
     return [item for item, count in collections.Counter(items).items() if count > 1]
+
+
+def normalize_dttm_col(
+    df: pd.DataFrame,
+    timestamp_format: Optional[str],
+    offset: int,
+    time_shift: Optional[timedelta],
+) -> pd.DataFrame:
+    if DTTM_ALIAS not in df.columns:
+        return df
+    df = df.copy()

Review comment:
       Good point @robdiciuccio . I'm a bit allergic to functions that mutate their inputs (even Pandas is moving away from this pattern), but this could in fact have a noticeable performance hit. I'm going to be touching this code in the coming days, so I'll make sure to address this then (either confirm that the copy is shallow enough to not cause a perf hit or remove the copy operation).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] codecov-io commented on pull request #13138: fix(chart-data-api): support numeric temporal columns

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #13138:
URL: https://github.com/apache/superset/pull/13138#issuecomment-779444074


   # [Codecov](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=h1) Report
   > Merging [#13138](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=desc) (3def36b) into [master](https://codecov.io/gh/apache/superset/commit/2ce79823dfad61bce6196fcacd56a844f44818c0?el=desc) (2ce7982) will **increase** coverage by `13.76%`.
   > The diff coverage is `42.75%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/superset/pull/13138/graphs/tree.svg?width=650&height=150&src=pr&token=KsB0fHcx6l)](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #13138       +/-   ##
   ===========================================
   + Coverage   53.06%   66.82%   +13.76%     
   ===========================================
     Files         489      492        +3     
     Lines       17314    29029    +11715     
     Branches     4482        0     -4482     
   ===========================================
   + Hits         9187    19399    +10212     
   - Misses       8127     9630     +1503     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | cypress | `?` | |
   | python | `66.82% <42.75%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [superset/examples/birth\_names.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvZXhhbXBsZXMvYmlydGhfbmFtZXMucHk=) | `73.19% <ø> (ø)` | |
   | [...43f2fdb\_add\_granularity\_to\_charts\_where\_missing.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvbWlncmF0aW9ucy92ZXJzaW9ucy8wNzBjMDQzZjJmZGJfYWRkX2dyYW51bGFyaXR5X3RvX2NoYXJ0c193aGVyZV9taXNzaW5nLnB5) | `0.00% <0.00%> (ø)` | |
   | [...s/260bf0649a77\_migrate\_x\_dateunit\_in\_time\_range.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvbWlncmF0aW9ucy92ZXJzaW9ucy8yNjBiZjA2NDlhNzdfbWlncmF0ZV94X2RhdGV1bml0X2luX3RpbWVfcmFuZ2UucHk=) | `0.00% <0.00%> (ø)` | |
   | [...ons/versions/41ce8799acc3\_rename\_pie\_label\_type.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvbWlncmF0aW9ucy92ZXJzaW9ucy80MWNlODc5OWFjYzNfcmVuYW1lX3BpZV9sYWJlbF90eXBlLnB5) | `0.00% <0.00%> (ø)` | |
   | [superset/connectors/sqla/views.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvY29ubmVjdG9ycy9zcWxhL3ZpZXdzLnB5) | `62.43% <4.34%> (ø)` | |
   | [superset/views/datasource.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvdmlld3MvZGF0YXNvdXJjZS5weQ==) | `88.70% <16.66%> (ø)` | |
   | [superset/viz.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvdml6LnB5) | `58.63% <33.33%> (ø)` | |
   | [superset/charts/commands/exceptions.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvY2hhcnRzL2NvbW1hbmRzL2V4Y2VwdGlvbnMucHk=) | `92.85% <77.77%> (ø)` | |
   | [superset/utils/core.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvdXRpbHMvY29yZS5weQ==) | `88.26% <87.50%> (ø)` | |
   | [superset/db\_engine\_specs/trino.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvZGJfZW5naW5lX3NwZWNzL3RyaW5vLnB5) | `90.47% <90.47%> (ø)` | |
   | ... and [936 more](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=footer). Last update [2dbe92b...5ba5a73](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org