You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2020/07/15 16:59:59 UTC

[GitHub] [incubator-superset] villebro opened a new pull request #10324: feat: add optional prophet forecasting functionality to chart data api

villebro opened a new pull request #10324:
URL: https://github.com/apache/incubator-superset/pull/10324


   ### SUMMARY
   This PR adds [Facebook Prophet](https://facebook.github.io/prophet/) as an optional dependency to enable time series forecasting in viz plugins. For each series (=column), it will add three new columns with the following suffix:
   - `__yhat`: forecast
   - `__yhat_lower`: lower confidence level
   - `__yhat_upper`: uper confidence level
   In addition, it extrapolates as many time grain steps into the future as have been specified in the `periods` parameter. For instance, 52 periods with weekly time grains will create a forecast of 52 weekly observations = 1 year into the future. The forecast along with confidence intervals will be made available for the whole time period, while the observations (without a suffix) will be null for the forecasted dates. For more details please refer to the [excellent tutorial](https://facebook.github.io/prophet/docs/quick_start.html#python-api) by the Prophet team.
   
   This treats each series as an individual time series model, hence will be computationally very expensive in the case of multiple columns. However, since the new chart data endpoint caches the end result, only the first calculation will be slow.
   
   I've tried to add as many different tests as possible (schema validation, post processing, e2e API test) to ensure functionality. The end-to-end tests require `fbprophet` to be installed, however I decided against adding it to `requirements-dev.txt`, as it will make CI much more sluggish. However, anyone interested in developing this feature further can install the dependency locally and activate the e2e tests that are otherwise skipped.
   
   ### TEST PLAN
   CI + new tests
   
   ### ADDITIONAL INFORMATION
   <!--- Check any relevant boxes with "x" -->
   <!--- HINT: Include "Fixes #nnn" if you are fixing an existing issue -->
   - [ ] Has associated issue:
   - [ ] Changes UI
   - [ ] Requires DB Migration.
   - [ ] Confirm DB Migration upgrade and downgrade tested.
   - [ ] Introduces new feature or API
   - [ ] Removes existing feature or API


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] villebro commented on a change in pull request #10324: feat: add optional prophet forecasting functionality to chart data api

Posted by GitBox <gi...@apache.org>.
villebro commented on a change in pull request #10324:
URL: https://github.com/apache/incubator-superset/pull/10324#discussion_r455530912



##########
File path: superset/utils/pandas_postprocessing.py
##########
@@ -544,3 +563,105 @@ def contribution(
     if temporal_series is not None:
         contribution_df.insert(0, DTTM_ALIAS, temporal_series)
     return contribution_df
+
+
+def prophet(  # pylint: disable=too-many-arguments,too-many-locals

Review comment:
       These unfortunately need to be one single function with `too-many-arguments`, as they are called based on instructions provided in the chart data request. But you're right, `too-many-locals` doesn't need to be part of the design here, so will try to break out some internal logic.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] villebro commented on pull request #10324: feat: add optional prophet forecasting functionality to chart data api

Posted by GitBox <gi...@apache.org>.
villebro commented on pull request #10324:
URL: https://github.com/apache/incubator-superset/pull/10324#issuecomment-723840442


   Hi @enjineerMan this feature is available on master branch and on the forthcoming 0.38 release of Apache Superset. on the new ECharts based timeseries viz type. To enable, you need to make sure the `fbprophet` package is installed, after which the Predictive Analytics feature can be used in the "Time-series chart" visualization type. Please keep in mind that this functionality is still experimental and has known performance issues that are gradually being worked on.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] villebro commented on a change in pull request #10324: feat: add optional prophet forecasting functionality to chart data api

Posted by GitBox <gi...@apache.org>.
villebro commented on a change in pull request #10324:
URL: https://github.com/apache/incubator-superset/pull/10324#discussion_r455531027



##########
File path: tests/pandas_postprocessing_tests.py
##########
@@ -508,3 +509,119 @@ def test_contribution(self):
         self.assertListEqual(df.columns.tolist(), ["a", "b"])
         self.assertListEqual(series_to_list(column_df["a"]), [0.25, 0.75])
         self.assertListEqual(series_to_list(column_df["b"]), [0.1, 0.9])
+
+    def test_prophet_incorrect_values(self):

Review comment:
       Good idea, will break this up.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] villebro commented on pull request #10324: feat: add optional prophet forecasting functionality to chart data api

Posted by GitBox <gi...@apache.org>.
villebro commented on pull request #10324:
URL: https://github.com/apache/incubator-superset/pull/10324#issuecomment-660921855


   @willbarrett this is ready for re-review. I split out the training part into a private method to make the main method simpler (was able to drop the `disable too-many-locals`), and also made each unit test a separate function. However, the `too-many-arguments` can't really be changed due to the nature of how post processing operations are called from the chart data API.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] villebro merged pull request #10324: feat: add optional prophet forecasting functionality to chart data api

Posted by GitBox <gi...@apache.org>.
villebro merged pull request #10324:
URL: https://github.com/apache/incubator-superset/pull/10324


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] willbarrett commented on a change in pull request #10324: feat: add optional prophet forecasting functionality to chart data api

Posted by GitBox <gi...@apache.org>.
willbarrett commented on a change in pull request #10324:
URL: https://github.com/apache/incubator-superset/pull/10324#discussion_r455386737



##########
File path: tests/pandas_postprocessing_tests.py
##########
@@ -508,3 +509,119 @@ def test_contribution(self):
         self.assertListEqual(df.columns.tolist(), ["a", "b"])
         self.assertListEqual(series_to_list(column_df["a"]), [0.25, 0.75])
         self.assertListEqual(series_to_list(column_df["b"]), [0.1, 0.9])
+
+    def test_prophet_incorrect_values(self):

Review comment:
       Breaking this up into a test for each incorrect value type may make debugging test failures easier down the road.

##########
File path: superset/utils/pandas_postprocessing.py
##########
@@ -544,3 +563,105 @@ def contribution(
     if temporal_series is not None:
         contribution_df.insert(0, DTTM_ALIAS, temporal_series)
     return contribution_df
+
+
+def prophet(  # pylint: disable=too-many-arguments,too-many-locals

Review comment:
       Could this function be broken up? Generally comments in a large function like this are a good place to draw divisions




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] enjineerMan commented on pull request #10324: feat: add optional prophet forecasting functionality to chart data api

Posted by GitBox <gi...@apache.org>.
enjineerMan commented on pull request #10324:
URL: https://github.com/apache/incubator-superset/pull/10324#issuecomment-723336284


   Hi, this might be a dumb question but where can we find this forecasting function in Superset? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org