Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/10/27 16:54:48 UTC

[GitHub] [airflow] mjpieters opened a new issue #11894: SmartSensor poke interval handling

mjpieters opened a new issue #11894:
URL: https://github.com/apache/airflow/issues/11894


   Currently, the smart sensor group DAG creates smart sensors for all shards with a fixed poke interval of 180 seconds.
   
   This should really be configurable.
   
   Moreover, the sensor should find some kind of compromise between running at a fixed rate and honouring the original `poke_interval` values of the sensor operators it replaces. As it stands, _all_ sensors managed by the smart sensor DAG are poked every 180 seconds, regardless of their own settings.
   
   This could be handled like a cron scheduler instead: poke all the sensors that would have been poked since the last time the smart sensor ran, based on their `poke_interval` and `exponential_backoff` values.
   
   E.g. record and cache the next poke execution time per managed sensor, and only poke those whose next execution time has passed.
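
The idea above can be sketched in a few lines of Python. This is a minimal illustration, not Airflow code: `ManagedSensor`, `run_once`, the `max_interval` cap and the simple doubling used for backoff are all assumptions made for the example, and Airflow's own backoff calculation differs.

```python
import time
from dataclasses import dataclass


@dataclass
class ManagedSensor:
    """Hypothetical stand-in for one sensor handled by the smart sensor."""
    name: str
    poke_interval: float            # seconds, taken from the original sensor
    exponential_backoff: bool = False
    max_interval: float = 3600.0    # assumed upper bound for the backoff
    tries: int = 0
    next_poke: float = 0.0          # cached time at which the next poke is due

    def schedule_next(self, now: float) -> None:
        """Record when this sensor should be poked again."""
        self.tries += 1
        delay = self.poke_interval
        if self.exponential_backoff:
            # Assumed policy: double the delay on every poke, up to a cap.
            delay = min(self.poke_interval * 2 ** (self.tries - 1), self.max_interval)
        self.next_poke = now + delay


def run_once(sensors, poke, now=None):
    """Poke only the sensors whose next execution time has passed."""
    now = time.monotonic() if now is None else now
    poked = []
    for sensor in sensors:
        if sensor.next_poke <= now:
            poke(sensor)
            sensor.schedule_next(now)
            poked.append(sensor.name)
    return poked
```

Each pass of the smart sensor loop would call `run_once`, so a sensor with a 60 second interval is poked more often than one with a 300 second interval, even though both are managed by the same task.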


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] YingboWang commented on issue #11894: SmartSensor poke interval handling

Posted by GitBox <gi...@apache.org>.
YingboWang commented on issue #11894:
URL: https://github.com/apache/airflow/issues/11894#issuecomment-718158686


   @mjpieters Thanks for creating this issue. We plan to support a customized `poke_interval` in the smart sensor and will implement it. For now, the smart sensor operator class defaults `poke_interval` to 180 seconds, which can be changed by passing `poke_interval=xxx` in `smart_sensor_group.py` when creating the `SmartSensorOperator` task. With that change, all smart sensor tasks in an Airflow cluster will poke at the given `poke_interval`. We could also make this field configurable through `airflow.cfg` if needed.
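
As a rough illustration of the `airflow.cfg` route, the sketch below reads a poke interval from an INI-style config with the standard library `configparser`. The `poke_interval` option under `[smart_sensor]` is hypothetical (the thread only proposes it), and `smart_sensor_poke_interval` is a made-up helper, not an Airflow function.

```python
import configparser

DEFAULT_POKE_INTERVAL = 180  # seconds; the default mentioned in this thread


def smart_sensor_poke_interval(cfg_text: str) -> int:
    """Return the configured poke interval, falling back to the default.

    Assumes a hypothetical `poke_interval` option in a `[smart_sensor]`
    section; no such option is confirmed by this thread.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.getint("smart_sensor", "poke_interval", fallback=DEFAULT_POKE_INTERVAL)
```

The value returned here would then be passed as `poke_interval=...` when the `SmartSensorOperator` task is created, instead of editing the shipped DAG file.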





[GitHub] [airflow] YingboWang commented on issue #11894: SmartSensor poke interval handling

Posted by GitBox <gi...@apache.org>.
YingboWang commented on issue #11894:
URL: https://github.com/apache/airflow/issues/11894#issuecomment-804149003


   I agree that this should be configurable as the poke time varies based on
   environment and sensor differences. I will create a PR to fix it.
   
   On Mon, Mar 22, 2021 at 8:17 AM Anthony Panat ***@***.***>
   wrote:
   
   > @YingboWang <https://github.com/YingboWang> can we also make the
   > poke_timeout in the SmartSensorOperator configurable by airflow.cfg as
   > well? It is currently set as 10 seconds by default in
   > smart_sensor_group.py but often times our sensors require more time than
   > this to run and as a result we are getting flooded with Airflow Process
   > timed out errors.
   





[GitHub] [airflow] anthonyp97 commented on issue #11894: SmartSensor poke interval handling

Posted by GitBox <gi...@apache.org>.
anthonyp97 commented on issue #11894:
URL: https://github.com/apache/airflow/issues/11894#issuecomment-804143726


   @YingboWang can we also make the `poke_timeout` in the `SmartSensorOperator` configurable through `airflow.cfg`? It is currently set to 10 seconds by default in `smart_sensor_group.py`, but our sensors often need more time than this to run, and as a result we are getting flooded with Airflow Process timed out errors.





[GitHub] [airflow] eladkal commented on issue #11894: SmartSensor poke interval handling

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #11894:
URL: https://github.com/apache/airflow/issues/11894#issuecomment-1026245243


   I'm closing this, as we probably won't add new features to Smart Sensors now that Deferrable Operators essentially supersede them. See docs: https://airflow.apache.org/docs/apache-airflow/stable/concepts/deferring.html#smart-sensors





[GitHub] [airflow] mjpieters commented on issue #11894: SmartSensor poke interval handling

Posted by GitBox <gi...@apache.org>.
mjpieters commented on issue #11894:
URL: https://github.com/apache/airflow/issues/11894#issuecomment-719114086


   I suppose I could use the [`policy()` hook](https://airflow.apache.org/docs/stable/concepts.html#mutate-tasks-after-dag-loaded) to alter the sensor poke interval. I don't think asking people to alter the shipped Airflow code directly is a good idea, at any rate.
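
A minimal sketch of what such a policy function might look like. The `SmartSensorOperator` class below is only a stand-in so the example runs without Airflow installed, and the 60 second value is illustrative. In a real deployment the function would live in `airflow_local_settings.py`; newer Airflow releases name the hook `task_policy`, so check the documentation for the installed version.

```python
class SmartSensorOperator:
    """Stand-in for the real operator, so the sketch runs without Airflow."""

    def __init__(self, poke_interval=180):
        self.poke_interval = poke_interval


def policy(task):
    """Cluster policy sketch: shorten the poke interval of smart sensor tasks."""
    # Matching on the class name avoids importing the operator here.
    if type(task).__name__ == "SmartSensorOperator":
        task.poke_interval = 60  # seconds; illustrative value, not a recommendation
```

Tasks of any other type pass through the hook unchanged.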
   
   As for honouring managed sensors' poke intervals, I was thinking along the lines of the [standard library `sched` module implementation](https://github.com/python/cpython/blob/3.9/Lib/sched.py), so a heap queue (but using `time.monotonic()` rather than `time.time()` to avoid running into issues with wall-clock adjustments).
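
A heap-based variant of this could look roughly like the sketch below, which borrows the shape of `sched` (a heap of due times plus an injectable time function). `PokeQueue` and its methods are hypothetical names for illustration and are not part of Airflow or of `sched`.

```python
import heapq
import itertools
import time


class PokeQueue:
    """Hypothetical heap of (due time, sensor) entries, in the spirit of `sched`."""

    def __init__(self, timefunc=time.monotonic):
        self._timefunc = timefunc          # monotonic clock by default
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def add(self, name, poke_interval, first_delay=0.0):
        """Register a sensor; it becomes due after `first_delay` seconds."""
        if poke_interval <= 0:
            raise ValueError("poke_interval must be positive")
        due = self._timefunc() + first_delay
        heapq.heappush(self._heap, (due, next(self._counter), name, poke_interval))

    def pop_due(self):
        """Return the names of all due sensors and reschedule each of them."""
        now = self._timefunc()
        due = []
        while self._heap and self._heap[0][0] <= now:
            _, _, name, interval = heapq.heappop(self._heap)
            due.append((name, interval))
        for name, interval in due:
            # Reschedule relative to now, one poke_interval ahead.
            heapq.heappush(self._heap, (now + interval, next(self._counter), name, interval))
        return [name for name, _ in due]
```

Because the earliest due time always sits at the top of the heap, each pass only touches the sensors that are actually due, rather than scanning every managed sensor.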





[GitHub] [airflow] eladkal closed issue #11894: SmartSensor poke interval handling

Posted by GitBox <gi...@apache.org>.
eladkal closed issue #11894:
URL: https://github.com/apache/airflow/issues/11894


   

