Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2022/04/06 09:16:44 UTC

[GitHub] [superset] exemplary-citizen opened a new pull request, #19554: feat(forecast): Forecasting plugin system

exemplary-citizen opened a new pull request, #19554:
URL: https://github.com/apache/superset/pull/19554

   ### SUMMARY
   Allows users to plug in any ML model for forecasting, as long as its package is installed. Also provides a numpy-based forecasting solution for users who do not want, or do not have, prophet installed.
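
A minimal sketch of how a model could be resolved from a dotted `model_name` such as `prophet.Prophet`, returning nothing when the package is not installed. The helper name `load_model_class` is hypothetical and is not taken from the PR; the PR's own lookup goes through `forecasts.get_model`, whose internals are not shown in this thread.

```python
import importlib
from typing import Optional


def load_model_class(model_name: str) -> Optional[type]:
    """Resolve a dotted path like "prophet.Prophet" to a class.

    Hypothetical helper: returns None when the path has no module part,
    the module cannot be imported, or the attribute does not exist.
    """
    module_path, _, class_name = model_name.rpartition(".")
    if not module_path:
        return None
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        # The package is not installed, so the model is unavailable.
        return None
    return getattr(module, class_name, None)
```

A caller could use the `None` result to hide unavailable models from a dropdown or to fall back to a built-in implementation.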
   
   ### BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
   prophet
   <img width="1140" alt="prophet" src="https://user-images.githubusercontent.com/51683050/161939628-dc2e5925-75c8-4681-9ec2-bb2ea0c0dc59.png">
   
   numpy
   <img width="1143" alt="numpy" src="https://user-images.githubusercontent.com/51683050/161939726-f7b15c7c-885c-4927-bd26-7b2bf8eed78e.png">
   
   random example with xgboost
   <img width="1142" alt="xgboost" src="https://user-images.githubusercontent.com/51683050/161939829-1256d1aa-c9dd-4962-96db-ce01827361c5.png">
   
   new model dropdown in predictive analytics tab
   <img width="319" alt="Screen Shot 2022-04-06 at 2 05 26 AM" src="https://user-images.githubusercontent.com/51683050/161940131-6ef3e97f-4317-4ce6-bd9d-4c3c81ed4b63.png">
   
   
   ### TESTING INSTRUCTIONS
   Added several unit and integration tests
   
   ### ADDITIONAL INFORMATION
   - [ ] Has associated issue:
   - [ ] Required feature flags:
   - [x] Changes UI
   - [ ] Includes DB Migration (follow approval process in [SIP-59](https://github.com/apache/superset/issues/13351))
     - [ ] Migration is atomic, supports rollback & is backwards-compatible
     - [ ] Confirm DB migration upgrade and downgrade tested
     - [ ] Runtime estimates and downtime expectations provided
   - [x] Introduces new feature or API
   - [ ] Removes existing feature or API
   
   cc: @villebro @bkyryliuk


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] codecov[bot] commented on pull request #19554: feat(forecast): Forecasting plugin system

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on PR #19554:
URL: https://github.com/apache/superset/pull/19554#issuecomment-1099978202

   # [Codecov](https://codecov.io/gh/apache/superset/pull/19554?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#19554](https://codecov.io/gh/apache/superset/pull/19554?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (29b64b3) into [master](https://codecov.io/gh/apache/superset/commit/94075983f8abfcc7749cede5af9e24d2a9f1abe0?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9407598) will **decrease** coverage by `12.77%`.
   > The diff coverage is `70.61%`.
   
   > :exclamation: Current head 29b64b3 differs from pull request most recent head dfdd6c7. Consider uploading reports for the commit dfdd6c7 to get more accurate results
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #19554       +/-   ##
   ===========================================
   - Coverage   66.51%   53.73%   -12.78%     
   ===========================================
     Files        1686     1689        +3     
     Lines       64591    64820      +229     
     Branches     6636     6636               
   ===========================================
   - Hits        42961    34834     -8127     
   - Misses      19931    28287     +8356     
     Partials     1699     1699               
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hive | `?` | |
   | mysql | `?` | |
   | postgres | `?` | |
   | presto | `52.39% <31.80%> (-0.15%)` | :arrow_down: |
   | python | `56.38% <70.49%> (-26.05%)` | :arrow_down: |
   | sqlite | `?` | |
   | unit | `47.94% <69.73%> (+0.18%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/superset/pull/19554?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...i-chart-controls/src/sections/forecastInterval.tsx](https://codecov.io/gh/apache/superset/pull/19554/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQtZnJvbnRlbmQvcGFja2FnZXMvc3VwZXJzZXQtdWktY2hhcnQtY29udHJvbHMvc3JjL3NlY3Rpb25zL2ZvcmVjYXN0SW50ZXJ2YWwudHN4) | `100.00% <ø> (ø)` | |
   | [.../plugin-chart-echarts/src/Timeseries/buildQuery.ts](https://codecov.io/gh/apache/superset/pull/19554/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQtZnJvbnRlbmQvcGx1Z2lucy9wbHVnaW4tY2hhcnQtZWNoYXJ0cy9zcmMvVGltZXNlcmllcy9idWlsZFF1ZXJ5LnRz) | `66.66% <ø> (ø)` | |
   | [...ugins/plugin-chart-echarts/src/Timeseries/types.ts](https://codecov.io/gh/apache/superset/pull/19554/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQtZnJvbnRlbmQvcGx1Z2lucy9wbHVnaW4tY2hhcnQtZWNoYXJ0cy9zcmMvVGltZXNlcmllcy90eXBlcy50cw==) | `100.00% <ø> (ø)` | |
   | [superset/forecasts/base.py](https://codecov.io/gh/apache/superset/pull/19554/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvZm9yZWNhc3RzL2Jhc2UucHk=) | `52.30% <52.30%> (ø)` | |
   | [superset/forecasts/mixins.py](https://codecov.io/gh/apache/superset/pull/19554/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvZm9yZWNhc3RzL21peGlucy5weQ==) | `84.50% <84.50%> (ø)` | |
   | [superset/forecasts/\_\_init\_\_.py](https://codecov.io/gh/apache/superset/pull/19554/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvZm9yZWNhc3RzL19faW5pdF9fLnB5) | `91.30% <91.30%> (ø)` | |
   | [superset/utils/pandas\_postprocessing/forecast.py](https://codecov.io/gh/apache/superset/pull/19554/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvdXRpbHMvcGFuZGFzX3Bvc3Rwcm9jZXNzaW5nL2ZvcmVjYXN0LnB5) | `84.48% <93.10%> (ø)` | |
   | [...superset-ui-core/src/query/types/PostProcessing.ts](https://codecov.io/gh/apache/superset/pull/19554/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQtZnJvbnRlbmQvcGFja2FnZXMvc3VwZXJzZXQtdWktY29yZS9zcmMvcXVlcnkvdHlwZXMvUG9zdFByb2Nlc3NpbmcudHM=) | `100.00% <100.00%> (ø)` | |
   | [superset/charts/schemas.py](https://codecov.io/gh/apache/superset/pull/19554/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvY2hhcnRzL3NjaGVtYXMucHk=) | `99.35% <100.00%> (+<0.01%)` | :arrow_up: |
   | [superset/config.py](https://codecov.io/gh/apache/superset/pull/19554/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvY29uZmlnLnB5) | `91.11% <100.00%> (-0.31%)` | :arrow_down: |
   | ... and [286 more](https://codecov.io/gh/apache/superset/pull/19554/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/superset/pull/19554?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/superset/pull/19554?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9407598...dfdd6c7](https://codecov.io/gh/apache/superset/pull/19554?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   




[GitHub] [superset] zhaoyongjie commented on a diff in pull request #19554: feat(forecast): Forecasting plugin system

Posted by GitBox <gi...@apache.org>.
zhaoyongjie commented on code in PR #19554:
URL: https://github.com/apache/superset/pull/19554#discussion_r843724894


##########
tests/unit_tests/pandas_postprocessing/test_aggregate.py:
##########
@@ -14,12 +14,15 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-from superset.utils.pandas_postprocessing import aggregate
+from flask.ctx import AppContext
+
 from tests.unit_tests.fixtures.dataframes import categories_df
 from tests.unit_tests.pandas_postprocessing.utils import series_to_list
 
 
-def test_aggregate():
+def test_aggregate(app_context: AppContext) -> None:
+    from superset.utils.pandas_postprocessing import aggregate
+

Review Comment:
   The Flask application context does not seem to be related to `Operators`. We can add an integration test for `Operators` and `QueryObject`.





[GitHub] [superset] villebro commented on a diff in pull request #19554: feat(forecast): Forecasting plugin system

Posted by GitBox <gi...@apache.org>.
villebro commented on code in PR #19554:
URL: https://github.com/apache/superset/pull/19554#discussion_r882403129


##########
superset/utils/pandas_postprocessing/forecast.py:
##########
@@ -133,14 +151,19 @@ def prophet(  # pylint: disable=too-many-arguments
     if len(df.columns) < 2:
         raise InvalidPostProcessingError(_("DataFrame include at least one series"))
 
-    target_df = DataFrame()
-    for column in [column for column in df.columns if column != index]:
-        fit_df = _prophet_fit_and_predict(
-            df=df[[index, column]].rename(columns={index: "ds", column: "y"}),
+    target_df = pd.DataFrame()
+    ds = df[DTTM_ALIAS]
+    for column in [column for column in df.columns if column != DTTM_ALIAS]:

Review Comment:
   We should reference the index rather than the fixed `DTTM_ALIAS` (this is relevant when using `GENERIC_CHART_AXES`)
   ```suggestion
       ds = df[index]
       for column in [column for column in df.columns if column != index]:
   ```
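
To illustrate the point of the comment above, here is a small stand-alone sketch that uses plain dicts of lists in place of a pandas DataFrame. The function `frames_per_series` is hypothetical and not part of Superset, and the value of `DTTM_ALIAS` is assumed to be `"__timestamp"`. The caller-supplied `index` decides which column becomes `ds`, so a chart whose x-axis column has another name still works.

```python
from typing import Dict, List

# Assumed default temporal alias; the review asks not to hard-code it.
DTTM_ALIAS = "__timestamp"


def frames_per_series(
    columns: Dict[str, List], index: str = DTTM_ALIAS
) -> Dict[str, Dict[str, List]]:
    """Build one {"ds": ..., "y": ...} frame per non-index column."""
    if index not in columns:
        raise KeyError(f"index column {index!r} not found")
    ds = columns[index]
    return {
        name: {"ds": ds, "y": values}
        for name, values in columns.items()
        if name != index
    }
```

With a column named `order_date` instead of the default alias, passing `index="order_date"` succeeds, while relying on the fixed alias raises `KeyError`.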



##########
superset/utils/pandas_postprocessing/forecast.py:
##########
@@ -133,14 +151,19 @@ def prophet(  # pylint: disable=too-many-arguments
     if len(df.columns) < 2:
         raise InvalidPostProcessingError(_("DataFrame include at least one series"))
 
-    target_df = DataFrame()
-    for column in [column for column in df.columns if column != index]:
-        fit_df = _prophet_fit_and_predict(
-            df=df[[index, column]].rename(columns={index: "ds", column: "y"}),
+    target_df = pd.DataFrame()
+    ds = df[DTTM_ALIAS]
+    for column in [column for column in df.columns if column != DTTM_ALIAS]:
+        model = forecasts.get_model(
+            model_name=model_name,
             confidence_interval=confidence_interval,
-            yearly_seasonality=_prophet_parse_seasonality(yearly_seasonality),
-            weekly_seasonality=_prophet_parse_seasonality(weekly_seasonality),
-            daily_seasonality=_prophet_parse_seasonality(daily_seasonality),
+            yearly_seasonality=_parse_seasonality(yearly_seasonality, "yearly", ds),
+            monthly_seasonality=_parse_seasonality(monthly_seasonality, "monthly", ds),
+            weekly_seasonality=_parse_seasonality(weekly_seasonality, "weekly", ds),
+            daily_seasonality=_parse_seasonality(daily_seasonality, "daily", ds),
+        )
+        fit_df = model.fit_transform(  # type: ignore
+            df=df[[DTTM_ALIAS, column]].rename(columns={DTTM_ALIAS: "ds", column: "y"}),

Review Comment:
   same here
   ```suggestion
               df=df[[index, column]].rename(columns={index: "ds", column: "y"}),
   ```



##########
superset/utils/pandas_postprocessing/forecast.py:
##########
@@ -157,4 +180,4 @@ def prophet(  # pylint: disable=too-many-arguments
             for new_column in new_columns:
                 target_df = target_df.assign(**{new_column: fit_df[new_column]})
     target_df.reset_index(level=0, inplace=True)
-    return target_df.rename(columns={"ds": index})
+    return target_df.rename(columns={"ds": DTTM_ALIAS})

Review Comment:
   and here:
   ```suggestion
       return target_df.rename(columns={"ds": index})
   ```



##########
superset-frontend/packages/superset-ui-core/src/query/types/PostProcessing.ts:
##########
@@ -121,19 +121,20 @@ interface _PostProcessingPivot {
 }
 export type PostProcessingPivot = _PostProcessingPivot | DefaultPostProcessing;
 
-interface _PostProcessingProphet {
-  operation: 'prophet';
+interface _PostProcessingForecast {
+  operation: 'forecast';
   options: {
     time_grain: TimeGranularity;
     periods: number;
     confidence_interval: number;
     yearly_seasonality?: boolean | number;
     weekly_seasonality?: boolean | number;
     daily_seasonality?: boolean | number;
+    model_name: string;
   };
 }

Review Comment:
   As renaming the post processing op is a breaking change, I think we should migrate the existing `query_context` column data in the `slices` table as follows:
   - update operation from `prophet` to `forecast` where there exists `"operation": "prophet"`
   - while we're at it, maybe also explicitly set `model_name` to `prophet.Prophet`
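
The two migration steps above could be sketched roughly as follows for a single stored `query_context` value. This assumes the JSON has a top-level `queries` list whose items carry a `post_processing` list; the function name `migrate_query_context` is hypothetical, and a real migration would also need to read and write the `slices` rows, which is omitted here.

```python
import json
from typing import Optional


def migrate_query_context(raw: Optional[str]) -> Optional[str]:
    """Rename the "prophet" operation to "forecast" in one JSON string.

    Returns the input unchanged when there is nothing to migrate.
    """
    if not raw:
        return raw
    context = json.loads(raw)
    changed = False
    for query in context.get("queries") or []:
        for op in query.get("post_processing") or []:
            # post_processing entries may be null, so skip non-dicts.
            if isinstance(op, dict) and op.get("operation") == "prophet":
                op["operation"] = "forecast"
                options = op.get("options") or {}
                # Keep an explicit model_name if one is already set.
                options.setdefault("model_name", "prophet.Prophet")
                op["options"] = options
                changed = True
    return json.dumps(context) if changed else raw
```

Returning the original string when nothing changed keeps the update idempotent, so rows can be skipped on a second run.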





[GitHub] [superset] villebro commented on a diff in pull request #19554: feat(forecast): Forecasting plugin system

Posted by GitBox <gi...@apache.org>.
villebro commented on code in PR #19554:
URL: https://github.com/apache/superset/pull/19554#discussion_r882409304


##########
superset-frontend/packages/superset-ui-core/src/query/types/PostProcessing.ts:
##########
@@ -121,19 +121,20 @@ interface _PostProcessingPivot {
 }
 export type PostProcessingPivot = _PostProcessingPivot | DefaultPostProcessing;
 
-interface _PostProcessingProphet {
-  operation: 'prophet';
+interface _PostProcessingForecast {
+  operation: 'forecast';
   options: {
     time_grain: TimeGranularity;
     periods: number;
     confidence_interval: number;
     yearly_seasonality?: boolean | number;
     weekly_seasonality?: boolean | number;
     daily_seasonality?: boolean | number;
+    model_name: string;
   };
 }

Review Comment:
   As renaming the post processing op is a breaking change, I think we should migrate existing `query_context` column data in the `slices` table as follows:
   - update operation from `prophet` to `forecast` where there exists `"operation": "prophet"`
   - while we're at it, maybe also explicitly set `model_name` to `prophet.Prophet`





Re: [PR] feat(forecast): Forecasting plugin system [superset]

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas commented on PR #19554:
URL: https://github.com/apache/superset/pull/19554#issuecomment-1920536185

   @exemplary-citizen this is a great PR, but it needs a big rebase if we're to pick it up and get it in. Let us know if you still have interest in pursuing this.




Re: [PR] feat(forecast): Forecasting plugin system [superset]

Posted by "villebro (via GitHub)" <gi...@apache.org>.
villebro commented on PR #19554:
URL: https://github.com/apache/superset/pull/19554#issuecomment-1920557057

   Ping @bkyryliuk as I believe you know @exemplary-citizen. I think this PR is a great step in the right direction, so I may be able to pick up this work if it would otherwise be left stale.

