Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/07/21 07:52:38 UTC

[GitHub] [airflow] kushagra391 opened a new issue #17124: AwsBatchOperator on v2.1.x exits with return code 1

kushagra391 opened a new issue #17124:
URL: https://github.com/apache/airflow/issues/17124


   
   
   **Apache Airflow version**:
   2.1.1
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   1.19
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**: AWS
   - **OS** (e.g. from /etc/os-release): ubuntu
   - **Kernel** (e.g. `uname -a`): Linux x86_64 GNU/Linux
   - **Install tools**:
   - **Others**:
   
   **What happened**:
   
   Successful AwsBatchOperator tasks exit with return code 1 instead of 0.
   
   ```
   [2021-07-21 07:37:20,074] {operators.py:203} INFO - AWS Batch job (f0e1c462-239e-4e83-b88e-e85acfcdc3d8) succeeded
   [2021-07-21 07:37:20,134] {taskinstance.py:1204} INFO - Marking task as SUCCESS. dag_id=poc_aws_batch, task_id=poc_aws_batch_pass, execution_date=20210720T000000, start_date=20210721T073707, end_date=20210721T073720
   [2021-07-21 07:37:20,191] {local_task_job.py:151} INFO - Task exited with return code 1
   ```
   
   As a result, callbacks such as `on_success_callback()` are not triggered.
   
   
   **What you expected to happen**:
   
   Successful tasks should exit with return code 0.
   
   
   <!-- What do you think went wrong? -->
   
   Not sure
   
   **How to reproduce it**:
   
   Can be reproduced with a simple DAG:
   
   ```python
   with poc_emr_dag as dag:
       start = DummyOperator(task_id="start", retries=3)
       end = DummyOperator(task_id="end", retries=3)
   
       poc_aws_batch_pass = AwsBatchOperator(
           **build_job_params(task_id="poc_aws_batch_pass", fail_job=False)
       )
   
       start >> poc_aws_batch_pass >> end
   ```
   
   Logs are included in the details section below.
   
   
   **Anything else we need to know**:
   
   This started with v2.x. The same code worked fine on v1.10.12.
   
   
   How often does this problem occur? Once? Every time etc?
   
   Every time.
   
   Any relevant logs to include? Put them here inside a details tag:
   <details><summary>Click here for code snippet + logs</summary> 
   Tried with a simple DAG
   
   ```python
   default_args = dict(
       depends_on_past=False,
       start_date=DAG_START_DATE,
       end_date=DAG_END_DATE,
       email_on_failure=False,
       email_on_retry=False,
       retries=0,
       retry_delay=timedelta(minutes=5),
       schedule_interval="15 6 * * *",
   )
   
   poc_emr_dag = DAG(
       dag_id=DAG_NAME,
       default_args=default_args,
       description="Test DAG for trying out AwsBatchOperator",
       schedule_interval=timedelta(days=1),
       tags=["example"],
   )
   
   with poc_emr_dag as dag:
       start = DummyOperator(task_id="start")
       end = DummyOperator(task_id="end")
   
       poc_aws_batch_pass = AwsBatchOperator(
           **build_job_params(task_id="poc_aws_batch_pass", fail_job=True)
       )
   
       start >> poc_aws_batch_pass >> end
   ```
   
   Logs show 
   
   ```
   [2021-07-21 07:37:16,346] {batch_client.py:349} INFO - AWS Batch job (f0e1c462-239e-4e83-b88e-e85acfcdc3d8) check status (SUCCEEDED) in ['RUNNING', 'SUCCEEDED', 'FAILED']
   [2021-07-21 07:37:19,970] {batch_client.py:349} INFO - AWS Batch job (f0e1c462-239e-4e83-b88e-e85acfcdc3d8) check status (SUCCEEDED) in ['SUCCEEDED', 'FAILED']
   [2021-07-21 07:37:19,973] {batch_client.py:283} INFO - AWS Batch job (f0e1c462-239e-4e83-b88e-e85acfcdc3d8) has completed
   [2021-07-21 07:37:20,073] {batch_client.py:257} INFO - AWS batch job (f0e1c462-239e-4e83-b88e-e85acfcdc3d8) succeeded:  ....
   [2021-07-21 07:37:20,074] {operators.py:203} INFO - AWS Batch job (f0e1c462-239e-4e83-b88e-e85acfcdc3d8) succeeded
   [2021-07-21 07:37:20,134] {taskinstance.py:1204} INFO - Marking task as SUCCESS. dag_id=poc_aws_batch, task_id=poc_aws_batch_pass, execution_date=20210720T000000, start_date=20210721T073707, end_date=20210721T073720
   [2021-07-21 07:37:20,191] {local_task_job.py:151} INFO - Task exited with return code 1
   ```
   
   </details>
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr edited a comment on issue #17124: AwsBatchOperator on v2.1.x exits with return code 1

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on issue #17124:
URL: https://github.com/apache/airflow/issues/17124#issuecomment-884005562


   The task still seems to work though? It’s marked as success, and that’s what matters to Airflow. The `on_success_callback()` does not check the return code, only the task state, so if it does not run, something other than the return code is causing it.





[GitHub] [airflow] kushagra391 commented on issue #17124: AwsBatchOperator on v2.1.x exits with return code 1

Posted by GitBox <gi...@apache.org>.
kushagra391 commented on issue #17124:
URL: https://github.com/apache/airflow/issues/17124#issuecomment-884052102


   Thanks @uranusjr!
   Yes, the job does get marked as success.
   I also just tested callbacks on a test DAG, and they seem to work fine.
   
   Closing the issue, I'll debug my existing code further.





[GitHub] [airflow] uranusjr commented on issue #17124: AwsBatchOperator on v2.1.x exits with return code 1

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #17124:
URL: https://github.com/apache/airflow/issues/17124#issuecomment-884005562


   The task still seems to work though? It’s marked as success and that’s what’s important to Airflow.





[GitHub] [airflow] uranusjr commented on issue #17124: AwsBatchOperator on v2.1.x exits with return code 1

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #17124:
URL: https://github.com/apache/airflow/issues/17124#issuecomment-884164870


   Oh, that makes sense. This is related to how multiple inheritance works in Python (method resolution order, aka MRO). Airflow 2 uses a different kind of inheritance magic (a metaclass), so the MRO logic changes a bit. In general, you should put mixin classes ahead of the “proper” inheritance chain. Multi-inheritance is hard.





[GitHub] [airflow] kushagra391 commented on issue #17124: AwsBatchOperator on v2.1.x exits with return code 1

Posted by GitBox <gi...@apache.org>.
kushagra391 commented on issue #17124:
URL: https://github.com/apache/airflow/issues/17124#issuecomment-884070878


   Found the issue. For some reason, this does not call callbacks on v2.x (but worked fine on v1.x):
   
   ```python
   class RWBatchOperator(AwsBatchOperator, AATBaseOperator):
       pass
   ```
   
   It works if I do:
   ```python
   class RWBatchOperatorV2(AATBaseOperator, AwsBatchOperator):
       pass
   ```
   
   (AATBaseOperator has callbacks defined for on_success / on_failure)
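
A minimal sketch of how base-class order can change which `__init__` runs, written in plain Python rather than against Airflow. Every name here (`BaseOperator`, `BatchOperator`, `CallbackMixin`, `notify`, `mro_names`) is a hypothetical stand-in, and the direct `BaseOperator.__init__` call is an assumption chosen to make the effect visible. It is not a claim about how `AwsBatchOperator`, `AATBaseOperator`, or Airflow's metaclass actually behave.

```python
class BaseOperator:
    """Hypothetical stand-in for a framework base class."""

    def __init__(self, task_id, on_success_callback=None):
        self.task_id = task_id
        self.on_success_callback = on_success_callback


class BatchOperator(BaseOperator):
    """Hypothetical concrete operator that is unaware of any mixin."""

    def __init__(self, task_id, **kwargs):
        # Assumption for illustration: a direct (non-cooperative) call.
        # It jumps straight to BaseOperator and skips every class that
        # sits between BatchOperator and BaseOperator in the MRO.
        BaseOperator.__init__(self, task_id, **kwargs)


def notify(context):
    """Hypothetical default callback."""
    return f"notified: {context}"


class CallbackMixin(BaseOperator):
    """Hypothetical class that injects a default success callback."""

    def __init__(self, *args, **kwargs):
        kwargs.setdefault("on_success_callback", notify)
        # Cooperative call: continues with the next class in the MRO.
        super().__init__(*args, **kwargs)


class OperatorFirst(BatchOperator, CallbackMixin):
    pass


class MixinFirst(CallbackMixin, BatchOperator):
    pass


def mro_names(cls):
    """Return the class names in method resolution order."""
    return [c.__name__ for c in cls.__mro__]


# BatchOperator.__init__ runs first and bypasses CallbackMixin,
# so no default callback is injected.
operator_first = OperatorFirst("demo")

# CallbackMixin.__init__ runs first, injects the default, then hands
# over to BatchOperator.__init__ through super().
mixin_first = MixinFirst("demo")
```

In this sketch `operator_first.on_success_callback` stays `None`, while `mixin_first.on_success_callback` is `notify`. Calling `mro_names()` on a real operator subclass, or inspecting its `__mro__` directly, is a quick way to check which class's `__init__` is reached first.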





[GitHub] [airflow] kushagra391 closed issue #17124: AwsBatchOperator on v2.1.x exits with return code 1

Posted by GitBox <gi...@apache.org>.
kushagra391 closed issue #17124:
URL: https://github.com/apache/airflow/issues/17124


   





[GitHub] [airflow] potiuk commented on issue #17124: AwsBatchOperator on v2.1.x exits with return code 1

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #17124:
URL: https://github.com/apache/airflow/issues/17124#issuecomment-884301056


   > Multi-inheritance is hard.
   
   Cannot agree more!





