You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2021/02/15 18:48:48 UTC
[GitHub] [superset] villebro opened a new pull request #13138: fix(chart-data-api): support numeric temporal columns
villebro opened a new pull request #13138:
URL: https://github.com/apache/superset/pull/13138
### SUMMARY
Some database connectors (et least the native Pinot connector) return a numeric value (either as second or millisecord epoch) instead of a regular timestamp. This was working correctly on `viz.py`, but support was lacking on `/api/v1/chart/data`. This centralizes the logic in a util, simplifies the code and adds tests that were previously missing.
### TEST PLAN
CI + new tests
### ADDITIONAL INFORMATION
<!--- Check any relevant boxes with "x" -->
<!--- HINT: Include "Fixes #nnn" if you are fixing an existing issue -->
- [ ] Has associated issue:
- [ ] Changes UI
- [ ] Requires DB Migration.
- [ ] Confirm DB Migration upgrade and downgrade tested.
- [ ] Introduces new feature or API
- [ ] Removes existing feature or API
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org
[GitHub] [superset] robdiciuccio commented on a change in pull request #13138: fix(chart-data-api): support numeric temporal columns
Posted by GitBox <gi...@apache.org>.
robdiciuccio commented on a change in pull request #13138:
URL: https://github.com/apache/superset/pull/13138#discussion_r578023801
##########
File path: superset/utils/core.py
##########
@@ -1579,3 +1580,34 @@ def format_list(items: Sequence[str], sep: str = ", ", quote: str = '"') -> str:
def find_duplicates(items: Iterable[InputType]) -> List[InputType]:
"""Find duplicate items in an iterable."""
return [item for item, count in collections.Counter(items).items() if count > 1]
+
+
+def normalize_dttm_col(
+ df: pd.DataFrame,
+ timestamp_format: Optional[str],
+ offset: int,
+ time_shift: Optional[timedelta],
+) -> pd.DataFrame:
+ if DTTM_ALIAS not in df.columns:
+ return df
+ df = df.copy()
Review comment:
Thanks, @villebro. [DataFrame.copy()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.copy.html) is deep by default.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org
[GitHub] [superset] codecov-io edited a comment on pull request #13138: fix(chart-data-api): support numeric temporal columns
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #13138:
URL: https://github.com/apache/superset/pull/13138#issuecomment-779444074
# [Codecov](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=h1) Report
> Merging [#13138](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=desc) (5ba5a73) into [master](https://codecov.io/gh/apache/superset/commit/2ce79823dfad61bce6196fcacd56a844f44818c0?el=desc) (2ce7982) will **increase** coverage by `13.73%`.
> The diff coverage is `42.36%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/superset/pull/13138/graphs/tree.svg?width=650&height=150&src=pr&token=KsB0fHcx6l)](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #13138 +/- ##
===========================================
+ Coverage 53.06% 66.79% +13.73%
===========================================
Files 489 492 +3
Lines 17314 29026 +11712
Branches 4482 0 -4482
===========================================
+ Hits 9187 19389 +10202
- Misses 8127 9637 +1510
```
| Flag | Coverage Δ | |
|---|---|---|
| cypress | `?` | |
| python | `66.79% <42.36%> (?)` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [superset/examples/birth\_names.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvZXhhbXBsZXMvYmlydGhfbmFtZXMucHk=) | `73.19% <ø> (ø)` | |
| [...43f2fdb\_add\_granularity\_to\_charts\_where\_missing.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvbWlncmF0aW9ucy92ZXJzaW9ucy8wNzBjMDQzZjJmZGJfYWRkX2dyYW51bGFyaXR5X3RvX2NoYXJ0c193aGVyZV9taXNzaW5nLnB5) | `0.00% <0.00%> (ø)` | |
| [...s/260bf0649a77\_migrate\_x\_dateunit\_in\_time\_range.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvbWlncmF0aW9ucy92ZXJzaW9ucy8yNjBiZjA2NDlhNzdfbWlncmF0ZV94X2RhdGV1bml0X2luX3RpbWVfcmFuZ2UucHk=) | `0.00% <0.00%> (ø)` | |
| [...ons/versions/41ce8799acc3\_rename\_pie\_label\_type.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvbWlncmF0aW9ucy92ZXJzaW9ucy80MWNlODc5OWFjYzNfcmVuYW1lX3BpZV9sYWJlbF90eXBlLnB5) | `0.00% <0.00%> (ø)` | |
| [superset/connectors/sqla/views.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvY29ubmVjdG9ycy9zcWxhL3ZpZXdzLnB5) | `62.43% <4.34%> (ø)` | |
| [superset/views/datasource.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvdmlld3MvZGF0YXNvdXJjZS5weQ==) | `88.70% <16.66%> (ø)` | |
| [superset/viz.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvdml6LnB5) | `58.61% <27.27%> (ø)` | |
| [superset/charts/commands/exceptions.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvY2hhcnRzL2NvbW1hbmRzL2V4Y2VwdGlvbnMucHk=) | `92.85% <77.77%> (ø)` | |
| [superset/utils/core.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvdXRpbHMvY29yZS5weQ==) | `88.26% <87.50%> (ø)` | |
| [superset/db\_engine\_specs/trino.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvZGJfZW5naW5lX3NwZWNzL3RyaW5vLnB5) | `90.47% <90.47%> (ø)` | |
| ... and [936 more](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=footer). Last update [2dbe92b...5ba5a73](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org
[GitHub] [superset] villebro merged pull request #13138: fix(chart-data-api): support numeric temporal columns
Posted by GitBox <gi...@apache.org>.
villebro merged pull request #13138:
URL: https://github.com/apache/superset/pull/13138
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org
[GitHub] [superset] villebro commented on a change in pull request #13138: fix(chart-data-api): support numeric temporal columns
Posted by GitBox <gi...@apache.org>.
villebro commented on a change in pull request #13138:
URL: https://github.com/apache/superset/pull/13138#discussion_r576593139
##########
File path: tests/utils_tests.py
##########
@@ -1131,3 +1133,20 @@ def test_extract_dataframe_dtypes(self):
df = pd.DataFrame(data={col[0]: col[2] for col in cols})
assert extract_dataframe_dtypes(df) == [col[1] for col in cols]
+
+ def test_normalize_dttm_col(self):
+ ts = pd.Timestamp(2021, 2, 15, 19, 0, 0, 0)
+ df = pd.DataFrame([{"__timestamp": ts, "a": 1}])
+
+ # test non-numeric format
+ assert normalize_dttm_col(df, None, 0, None)[DTTM_ALIAS][0] == ts
+ assert normalize_dttm_col(df, "epoch_ms", 0, None)[DTTM_ALIAS][0] == ts
+ assert normalize_dttm_col(df, "epoch_s", 0, None)[DTTM_ALIAS][0] == ts
+
+ # test numeric epoch_s format
+ df = pd.DataFrame([{"__timestamp": ts.timestamp(), "a": 1}])
+ assert normalize_dttm_col(df, "epoch_s", 0, None)[DTTM_ALIAS][0] == ts
+
+ # test numeric epoch_ms format
+ df = pd.DataFrame([{"__timestamp": ts.timestamp() * 1000, "a": 1}])
+ assert normalize_dttm_col(df, "epoch_ms", 0, None)[DTTM_ALIAS][0] == ts
Review comment:
Thanks, I'll add these tests (they were outside the scope of this fix but good to have) 👍
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org
[GitHub] [superset] robdiciuccio commented on a change in pull request #13138: fix(chart-data-api): support numeric temporal columns
Posted by GitBox <gi...@apache.org>.
robdiciuccio commented on a change in pull request #13138:
URL: https://github.com/apache/superset/pull/13138#discussion_r577981468
##########
File path: superset/utils/core.py
##########
@@ -1579,3 +1580,34 @@ def format_list(items: Sequence[str], sep: str = ", ", quote: str = '"') -> str:
def find_duplicates(items: Iterable[InputType]) -> List[InputType]:
"""Find duplicate items in an iterable."""
return [item for item, count in collections.Counter(items).items() if count > 1]
+
+
+def normalize_dttm_col(
+ df: pd.DataFrame,
+ timestamp_format: Optional[str],
+ offset: int,
+ time_shift: Optional[timedelta],
+) -> pd.DataFrame:
+ if DTTM_ALIAS not in df.columns:
+ return df
+ df = df.copy()
Review comment:
Is the `copy` operation necessary here? This could be very expensive on memory for large result sets.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org
[GitHub] [superset] amitmiran137 commented on a change in pull request #13138: fix(chart-data-api): support numeric temporal columns
Posted by GitBox <gi...@apache.org>.
amitmiran137 commented on a change in pull request #13138:
URL: https://github.com/apache/superset/pull/13138#discussion_r576586650
##########
File path: tests/utils_tests.py
##########
@@ -1131,3 +1133,20 @@ def test_extract_dataframe_dtypes(self):
df = pd.DataFrame(data={col[0]: col[2] for col in cols})
assert extract_dataframe_dtypes(df) == [col[1] for col in cols]
+
+ def test_normalize_dttm_col(self):
+ ts = pd.Timestamp(2021, 2, 15, 19, 0, 0, 0)
+ df = pd.DataFrame([{"__timestamp": ts, "a": 1}])
+
+ # test non-numeric format
+ assert normalize_dttm_col(df, None, 0, None)[DTTM_ALIAS][0] == ts
+ assert normalize_dttm_col(df, "epoch_ms", 0, None)[DTTM_ALIAS][0] == ts
+ assert normalize_dttm_col(df, "epoch_s", 0, None)[DTTM_ALIAS][0] == ts
+
+ # test numeric epoch_s format
+ df = pd.DataFrame([{"__timestamp": ts.timestamp(), "a": 1}])
+ assert normalize_dttm_col(df, "epoch_s", 0, None)[DTTM_ALIAS][0] == ts
+
+ # test numeric epoch_ms format
+ df = pd.DataFrame([{"__timestamp": ts.timestamp() * 1000, "a": 1}])
+ assert normalize_dttm_col(df, "epoch_ms", 0, None)[DTTM_ALIAS][0] == ts
Review comment:
I'm missing 2 use-cases here:
1. offeset
2. timeshift
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org
[GitHub] [superset] villebro commented on a change in pull request #13138: fix(chart-data-api): support numeric temporal columns
Posted by GitBox <gi...@apache.org>.
villebro commented on a change in pull request #13138:
URL: https://github.com/apache/superset/pull/13138#discussion_r578007300
##########
File path: superset/utils/core.py
##########
@@ -1579,3 +1580,34 @@ def format_list(items: Sequence[str], sep: str = ", ", quote: str = '"') -> str:
def find_duplicates(items: Iterable[InputType]) -> List[InputType]:
"""Find duplicate items in an iterable."""
return [item for item, count in collections.Counter(items).items() if count > 1]
+
+
+def normalize_dttm_col(
+ df: pd.DataFrame,
+ timestamp_format: Optional[str],
+ offset: int,
+ time_shift: Optional[timedelta],
+) -> pd.DataFrame:
+ if DTTM_ALIAS not in df.columns:
+ return df
+ df = df.copy()
Review comment:
Good point @robdiciuccio . I'm a bit allergic to functions that mutate their inputs (even Pandas is moving away from this pattern), but this could in fact have a noticeable performance hit. I'm going to be touching this code in the coming days, so I'll make sure to address this then (either confirm that the copy is shallow enough to not cause a perf hit or remove the copy operation).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org
[GitHub] [superset] codecov-io commented on pull request #13138: fix(chart-data-api): support numeric temporal columns
Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #13138:
URL: https://github.com/apache/superset/pull/13138#issuecomment-779444074
# [Codecov](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=h1) Report
> Merging [#13138](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=desc) (3def36b) into [master](https://codecov.io/gh/apache/superset/commit/2ce79823dfad61bce6196fcacd56a844f44818c0?el=desc) (2ce7982) will **increase** coverage by `13.76%`.
> The diff coverage is `42.75%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/superset/pull/13138/graphs/tree.svg?width=650&height=150&src=pr&token=KsB0fHcx6l)](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #13138 +/- ##
===========================================
+ Coverage 53.06% 66.82% +13.76%
===========================================
Files 489 492 +3
Lines 17314 29029 +11715
Branches 4482 0 -4482
===========================================
+ Hits 9187 19399 +10212
- Misses 8127 9630 +1503
```
| Flag | Coverage Δ | |
|---|---|---|
| cypress | `?` | |
| python | `66.82% <42.75%> (?)` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [superset/examples/birth\_names.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvZXhhbXBsZXMvYmlydGhfbmFtZXMucHk=) | `73.19% <ø> (ø)` | |
| [...43f2fdb\_add\_granularity\_to\_charts\_where\_missing.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvbWlncmF0aW9ucy92ZXJzaW9ucy8wNzBjMDQzZjJmZGJfYWRkX2dyYW51bGFyaXR5X3RvX2NoYXJ0c193aGVyZV9taXNzaW5nLnB5) | `0.00% <0.00%> (ø)` | |
| [...s/260bf0649a77\_migrate\_x\_dateunit\_in\_time\_range.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvbWlncmF0aW9ucy92ZXJzaW9ucy8yNjBiZjA2NDlhNzdfbWlncmF0ZV94X2RhdGV1bml0X2luX3RpbWVfcmFuZ2UucHk=) | `0.00% <0.00%> (ø)` | |
| [...ons/versions/41ce8799acc3\_rename\_pie\_label\_type.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvbWlncmF0aW9ucy92ZXJzaW9ucy80MWNlODc5OWFjYzNfcmVuYW1lX3BpZV9sYWJlbF90eXBlLnB5) | `0.00% <0.00%> (ø)` | |
| [superset/connectors/sqla/views.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvY29ubmVjdG9ycy9zcWxhL3ZpZXdzLnB5) | `62.43% <4.34%> (ø)` | |
| [superset/views/datasource.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvdmlld3MvZGF0YXNvdXJjZS5weQ==) | `88.70% <16.66%> (ø)` | |
| [superset/viz.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvdml6LnB5) | `58.63% <33.33%> (ø)` | |
| [superset/charts/commands/exceptions.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvY2hhcnRzL2NvbW1hbmRzL2V4Y2VwdGlvbnMucHk=) | `92.85% <77.77%> (ø)` | |
| [superset/utils/core.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvdXRpbHMvY29yZS5weQ==) | `88.26% <87.50%> (ø)` | |
| [superset/db\_engine\_specs/trino.py](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree#diff-c3VwZXJzZXQvZGJfZW5naW5lX3NwZWNzL3RyaW5vLnB5) | `90.47% <90.47%> (ø)` | |
| ... and [936 more](https://codecov.io/gh/apache/superset/pull/13138/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=footer). Last update [2dbe92b...5ba5a73](https://codecov.io/gh/apache/superset/pull/13138?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org