You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/10/27 11:16:18 UTC

[GitHub] [airflow] pedresnyman opened a new pull request #11883: Feature/emrstepsensor

pedresnyman opened a new pull request #11883:
URL: https://github.com/apache/airflow/pull/11883


   This is a test PR 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] baolsen commented on a change in pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
baolsen commented on a change in pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#discussion_r526618650



##########
File path: airflow/providers/amazon/aws/sensors/emr_step.py
##########
@@ -21,24 +21,34 @@
 from airflow.providers.amazon.aws.sensors.emr_base import EmrBaseSensor
 from airflow.utils.decorators import apply_defaults
 
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_RETURN_KEY
+
 
 class EmrStepSensor(EmrBaseSensor):
     """
     Asks for the state of the step until it reaches any of the target states.
+    If no step Id has been provided the sensor retrieves all steps and current states
+    and will continue to ask until all steps reaches any of the target states.

Review comment:
       Thanks @feluelle . 
   
   We've got feedback from AWS on our case and they basically confirm our observations - their documentation isn't clear but the cluster can reach TERMINATED state even if steps fail, if those steps are set to CONTINUE. Effectively the step state is ignored completely in this case. It's still not clear what happens if only some steps are set to CONTINUE but nevermind ;)
   
   For implementation then we have some options:
   
   1. Modify the job flow sensor to optionally check the step states before completion and fail that task if any steps have failed.
   2. Modify the step sensor to optionally check all steps instead of just 1 (our current code). 
   
   Probably option 2 is more in line with the AWS implementation, but I feel it is unexpected to the user that a cluster can complete even with failed steps.... 
   
   Regardless I think we should add something to the EMR Sensor "how to" page to clarify once we decide.
   
   What do you think is the best route? Still option 2?
   
   Thanks for the help :)
   
   [Edit: I guess with option 1 then the job flow sensor needs to have configuration regarding the steps states which represent success / fail. In option 2 these states are already defined as part of the step sensor)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] baolsen commented on a change in pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
baolsen commented on a change in pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#discussion_r517896190



##########
File path: airflow/providers/amazon/aws/sensors/emr_step.py
##########
@@ -21,24 +21,34 @@
 from airflow.providers.amazon.aws.sensors.emr_base import EmrBaseSensor
 from airflow.utils.decorators import apply_defaults
 
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_RETURN_KEY
+
 
 class EmrStepSensor(EmrBaseSensor):
     """
     Asks for the state of the step until it reaches any of the target states.
+    If no step Id has been provided the sensor retrieves all steps and current states
+    and will continue to ask until all steps reaches any of the target states.

Review comment:
       Hi @feluelle 
   Thanks for feedback. 
   
   We've tested with EmrJobFlowSensor but it isn't behaving as we'd like. The cluster moves in to TERMINATED state even if some steps fail, I guess because we are setting the steps with CONTINUE status. In our use case the steps are independent and we therefore want the others to run even if some do fail.
   
   I would have expected the cluster to move into TERMINATED_WITH_ERRORS but this isn't clearly documented by AWS at all. We've logged a support case to get clarity on it. Maybe it is a bug, but unlikely.... They will probably just say they will update the documentation.
   
   Once we have feedback we can decide how best to handle checking for the steps status.
   
   My thoughts:
   
   - I think a separate operator for this would be overkill. JobFowSensor should work for what we want to do, if we can get it to be sensitive to the step status as well.
   - We could add to the EmrJobFlowSensor to optionally also check the status of the steps before succeeding, and enable that check by default. I think it would be unexpected behavior that the operator is successful even if some steps weren't successful. 
   - We can XCom push the step status into their own XCom key before completing the execute, which also enables our use case (with our use case we need to do something with/about the failed steps so we need to know which ones they are).
   
   What do you think?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#issuecomment-717768909


   [The Workflow run](https://github.com/apache/airflow/actions/runs/333110145) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#issuecomment-717943840


   [The Workflow run](https://github.com/apache/airflow/actions/runs/333715165) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] pedresnyman commented on pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
pedresnyman commented on pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#issuecomment-718419120


   Good day @mik-laj and @kaxil,
   
   Can one of you please review.
   
   Kind regards


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] pedresnyman commented on a change in pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
pedresnyman commented on a change in pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#discussion_r517137237



##########
File path: airflow/providers/amazon/aws/sensors/emr_step.py
##########
@@ -21,24 +21,34 @@
 from airflow.providers.amazon.aws.sensors.emr_base import EmrBaseSensor
 from airflow.utils.decorators import apply_defaults
 
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_RETURN_KEY
+
 
 class EmrStepSensor(EmrBaseSensor):
     """
     Asks for the state of the step until it reaches any of the target states.
+    If no step Id has been provided the sensor retrieves all steps and current states
+    and will continue to ask until all steps reaches any of the target states.

Review comment:
       Hi @feluelle, 
    
   The EmrJobFlowSensor can work yes. Can we add a xcom_push to the EmrJobFlowSensor so that we have a list of all steps? So then we have all the steps and the step states.
   
   With that functionality added you can always rerun a failed step instead or re-running all of the steps.
   
   Regards




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] pedresnyman closed pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
pedresnyman closed pull request #11883:
URL: https://github.com/apache/airflow/pull/11883


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#issuecomment-717769083


   [The Workflow run](https://github.com/apache/airflow/actions/runs/333139185) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#issuecomment-720984167






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#issuecomment-717812394


   [The Workflow run](https://github.com/apache/airflow/actions/runs/333288852) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] feluelle commented on a change in pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
feluelle commented on a change in pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#discussion_r517300737



##########
File path: airflow/providers/amazon/aws/sensors/emr_step.py
##########
@@ -21,24 +21,34 @@
 from airflow.providers.amazon.aws.sensors.emr_base import EmrBaseSensor
 from airflow.utils.decorators import apply_defaults
 
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_RETURN_KEY
+
 
 class EmrStepSensor(EmrBaseSensor):
     """
     Asks for the state of the step until it reaches any of the target states.
+    If no step Id has been provided the sensor retrieves all steps and current states
+    and will continue to ask until all steps reaches any of the target states.
+
     If it fails the sensor errors, failing the task.
 
     With the default target states, sensor waits step to be completed.
 
     :param job_flow_id: job_flow_id which contains the step check the state of
     :type job_flow_id: str
     :param step_id: step to check the state of
-    :type step_id: str
+    :type step_id: str or list

Review comment:
       ```suggestion
       :type step_id: Optional[str]
   ```

##########
File path: airflow/providers/amazon/aws/sensors/emr_step.py
##########
@@ -50,16 +60,72 @@ def __init__(
         *,
         job_flow_id: str,
         step_id: str,

Review comment:
       ```suggestion
           step_id: Optional[str],
   ```

##########
File path: airflow/providers/amazon/aws/sensors/emr_step.py
##########
@@ -21,24 +21,34 @@
 from airflow.providers.amazon.aws.sensors.emr_base import EmrBaseSensor
 from airflow.utils.decorators import apply_defaults
 
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_RETURN_KEY
+
 
 class EmrStepSensor(EmrBaseSensor):
     """
     Asks for the state of the step until it reaches any of the target states.
+    If no step Id has been provided the sensor retrieves all steps and current states
+    and will continue to ask until all steps reaches any of the target states.

Review comment:
       > We've tested with EmrJobFlowSensor but it isn't behaving as we'd like. The cluster moves in to TERMINATED state even if some steps fail, I guess because we are setting the steps with CONTINUE status. In our use case the steps are independent and we therefore want the others to run even if some do fail.
   > 
   > I would have expected the cluster to move into TERMINATED_WITH_ERRORS but this isn't clearly documented by AWS at all. We've logged a support case to get clarity on it. Maybe it is a bug, but unlikely.... They will probably just say they will update the documentation.
   
   Fine :) That's what I thought - that you can differ between terminated state with succeeded tasks and failed tasks. If that's not the case we can do it like proposed in the PR.
   
   Regarding the seperation of the `EmrStepSensor` I've changed my mind. It makes sense to keep it in `EmrStepSensor` like we have the `ExternalTaskSensor` also for the DAG - not only for a specific task.
   
   > We could add to the EmrJobFlowSensor to optionally also check the status of the steps before succeeding, and enable that check by default. I think it would be unexpected behavior that the operator is successful even if some steps weren't successful.
   
   SGTM, too.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] baolsen commented on a change in pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
baolsen commented on a change in pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#discussion_r526618650



##########
File path: airflow/providers/amazon/aws/sensors/emr_step.py
##########
@@ -21,24 +21,34 @@
 from airflow.providers.amazon.aws.sensors.emr_base import EmrBaseSensor
 from airflow.utils.decorators import apply_defaults
 
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_RETURN_KEY
+
 
 class EmrStepSensor(EmrBaseSensor):
     """
     Asks for the state of the step until it reaches any of the target states.
+    If no step Id has been provided the sensor retrieves all steps and current states
+    and will continue to ask until all steps reaches any of the target states.

Review comment:
       Thanks @feluelle . 
   
   We've got feedback from AWS on our case and they basically confirm our observations - their documentation isn't clear but the cluster can reach TERMINATED state even if steps fail, if those steps are set to CONTINUE. Effectively the step state is ignored completely in this case. It's still not clear what happens if only some steps are set to CONTINUE but nevermind ;)
   
   For implementation then we have some options:
   
   1. Modify the job flow sensor to optionally check the step states before completion and fail that task if any steps have failed.
   2. Modify the step sensor to optionally check all steps instead of just 1 (our current code). 
   
   Probably option 2 is more in line with the AWS implementation, but I feel it is unexpected to the user that a cluster can complete even with failed steps.... 
   
   Regardless I think we should add something to the EMR Sensor "how to" page to clarify once we decide.
   
   What do you think is the best route? Still option 2?
   
   Thanks for the help :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#issuecomment-717196194


   [The Workflow run](https://github.com/apache/airflow/actions/runs/331248501) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#issuecomment-717943358


   [The Workflow run](https://github.com/apache/airflow/actions/runs/333714690) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] feluelle commented on a change in pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
feluelle commented on a change in pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#discussion_r516527637



##########
File path: airflow/providers/amazon/aws/sensors/emr_step.py
##########
@@ -21,24 +21,34 @@
 from airflow.providers.amazon.aws.sensors.emr_base import EmrBaseSensor
 from airflow.utils.decorators import apply_defaults
 
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_RETURN_KEY
+
 
 class EmrStepSensor(EmrBaseSensor):
     """
     Asks for the state of the step until it reaches any of the target states.
+    If no step Id has been provided the sensor retrieves all steps and current states
+    and will continue to ask until all steps reaches any of the target states.

Review comment:
       Doesn't it make sense to create a `EmrJobFlowSensor` or `EmrAllStepsSensor` instead?

##########
File path: airflow/providers/amazon/aws/sensors/emr_step.py
##########
@@ -21,24 +21,34 @@
 from airflow.providers.amazon.aws.sensors.emr_base import EmrBaseSensor
 from airflow.utils.decorators import apply_defaults
 
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_RETURN_KEY
+
 
 class EmrStepSensor(EmrBaseSensor):
     """
     Asks for the state of the step until it reaches any of the target states.
+    If no step Id has been provided the sensor retrieves all steps and current states
+    and will continue to ask until all steps reaches any of the target states.

Review comment:
       Actually a `EmrJobFlowSensor` already exists. Isn't this sufficient (to have the job flow state)?

##########
File path: airflow/providers/amazon/aws/sensors/emr_step.py
##########
@@ -21,24 +21,34 @@
 from airflow.providers.amazon.aws.sensors.emr_base import EmrBaseSensor
 from airflow.utils.decorators import apply_defaults
 
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_RETURN_KEY
+
 
 class EmrStepSensor(EmrBaseSensor):
     """
     Asks for the state of the step until it reaches any of the target states.
+    If no step Id has been provided the sensor retrieves all steps and current states
+    and will continue to ask until all steps reaches any of the target states.

Review comment:
       If not, `EmrAllStepsSensor` makes more sense to me, instead of adding this kind of functionality here (to the emr_step).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#issuecomment-717196531


   [The Workflow run](https://github.com/apache/airflow/actions/runs/331255461) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] feluelle commented on pull request #11883: Feature/emrstepsensor

Posted by GitBox <gi...@apache.org>.
feluelle commented on pull request #11883:
URL: https://github.com/apache/airflow/pull/11883#issuecomment-726140042


   @pedresnyman Oh is there a reason you closed this or was this by accident?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org